Chenguang-Zhu (Chenguang Zhu) / Starred · GitHub
🎯
Focusing

Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also …

C++ 315 61 Updated Jun 6, 2025

Build resilient language agents as graphs.

Python 13,925 2,336 Updated Jun 10, 2025

An AI Hedge Fund Team

Python 35,379 6,147 Updated Jun 9, 2025

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

C++ 389 54 Updated May 28, 2025

A powerful obfuscator for JavaScript and Node.js

TypeScript 14,854 1,619 Updated Jul 1, 2024

An MLIR-based JavaScript intermediate representation

C++ 31 3 Updated Apr 25, 2025

Manipulating Python Programs

Python 665 29 Updated May 22, 2025

PyTorch tutorials.

Jupyter Notebook 8,603 4,171 Updated Jun 5, 2025

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,542 177 Updated Jun 25, 2024

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,691 2,197 Updated May 21, 2025

HIPIFY: Convert CUDA to Portable C++ Code

C++ 585 87 Updated Jun 9, 2025

Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥

Python 40,299 3,194 Updated Jun 9, 2025

This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger.

C 58 15 Updated Jun 4, 2025

C++ 138 66 Updated Jun 7, 2025

An analysis tool for Python that blurs the line between testing and type systems.

Python 1,162 61 Updated Jun 9, 2025

ROC profiler library. Profiling with perf-counters and derived metrics.

C 147 48 Updated Jun 9, 2025

A reactive Python kernel for Jupyter notebooks.

Python 1,224 23 Updated Jun 8, 2025

PyTorch native quantization and sparsity for training and inference

Python 2,093 281 Updated Jun 10, 2025

PyTorch native post-training library

Python 5,247 620 Updated Jun 9, 2025

Time series charting library based on d3.js

CSS 144 40 Updated Oct 14, 2018

Collective communications library with various primitives for multi-machine training.

C++ 1,309 328 Updated Jun 10, 2025

A dataset of reproducible breaking dependency updates, SANER 2024 (https://doi.org/10.1109/SANER60148.2024.00024)

Java 20 8 Updated Jun 9, 2025

An implementation of a deep learning recommendation model (DLRM)

Python 3,895 859 Updated May 30, 2025

Universal Tensor Operations in Einstein-Inspired Notation for Python.

Python 382 12 Updated Apr 8, 2025

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Python 688 25 Updated Apr 20, 2025

Cinder is Meta's internal performance-oriented production version of CPython.

Python 3,634 128 Updated Jun 9, 2025

Python AST read/write

Python 845 106 Updated Feb 6, 2025

A Python Parser

Python 641 110 Updated Mar 10, 2025

Tensor library for machine learning

C++ 12,663 1,251 Updated Jun 6, 2025