@tsinghua-ideal, Tsinghua University
- Beijing, China
- me.tric.space
- https://chaofanlin.com/
- @siriusneox
Stars
Distributed Triton for Parallel Systems
A Datacenter Scale Distributed Inference Serving Framework
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
[ICLR 2025] DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference
[ICML2025] Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
SpargeAttention: A training-free sparse attention that can accelerate any model inference.
DeepEP: an efficient expert-parallel communication library
FlashMLA: Efficient MLA decoding kernels
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
depyf is a tool to help you understand and adapt to the PyTorch compiler, torch.compile.
MoBA: Mixture of Block Attention for Long-Context LLMs
Open Overleaf/ShareLaTeX projects in VS Code, with full collaboration support.
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
A bibliography and survey of the papers surrounding o1
[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training
Canvas: End-to-End Kernel Architecture Search in Neural Networks
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
A framework for few-shot evaluation of language models.
Minimal reproduction of DeepSeek R1-Zero