- Meta AI
- Menlo Park, CA, USA
- https://chenguang-zhu.github.io
Stars
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from system components such as the Linux kernel, CPU, disks, Intel PT, and GPUs. Dynolog also …
Build resilient language agents as graphs.
The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.
A powerful obfuscator for JavaScript and Node.js
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
This is ROCgdb, the ROCm source-level debugger for Linux, based on GDB, the GNU source-level debugger.
An analysis tool for Python that blurs the line between testing and type systems.
ROC profiler library. Profiling with perf-counters and derived metrics.
A reactive Python kernel for Jupyter notebooks.
PyTorch native quantization and sparsity for training and inference
Collective communications library with various primitives for multi-machine training.
A dataset of reproducible breaking dependency updates, SANER 2024 (https://doi.org/10.1109/SANER60148.2024.00024)
An implementation of a deep learning recommendation model (DLRM)
Universal Tensor Operations in Einstein-Inspired Notation for Python.
depyf is a tool to help you understand and adapt to the PyTorch compiler torch.compile.
Cinder is Meta's internal performance-oriented production version of CPython.