Stars
Have a natural, spoken conversation with AI!
Technical report of Kimina-Prover Preview.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper.
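As a rough illustration of what a sparse attention pattern looks like (not the repository's kernels or the paper's exact compressed/selected/sliding-window branches), the toy mask below combines a causal sliding window with coarse block-level coverage, so each query attends to far fewer keys than dense causal attention:

```python
import torch

def toy_sparse_mask(seq_len: int, window: int = 64, block: int = 32) -> torch.Tensor:
    """Boolean [seq_len, seq_len] mask: True means query i may attend to key j.

    Combines a causal sliding window (dense over recent tokens) with coarse
    block-level access (every query also sees one representative key per block).
    """
    q = torch.arange(seq_len).unsqueeze(1)   # query positions
    k = torch.arange(seq_len).unsqueeze(0)   # key positions
    causal = k <= q
    local = (q - k) < window                 # recent tokens, attended densely
    block_rep = (k % block == 0)             # one representative key per block
    return causal & (local | block_rep)

mask = toy_sparse_mask(256)
print(mask.float().mean())  # fraction of attended pairs, well below a dense causal mask
```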
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
Search-R1: an efficient, scalable RL training framework for LLMs that interleave reasoning with search engine calls, built on veRL.
An easy-to-use, scalable, and high-performance RLHF framework built on Ray (PPO, GRPO, REINFORCE++, vLLM, dynamic sampling, async agent RL).
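For context on the GRPO variant listed above, the core idea is a group-relative advantage: sample several completions per prompt and normalize each reward against the others for the same prompt, so no learned value model is needed. A minimal sketch of that computation (not this framework's code) is:

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages; rewards has shape [num_prompts, samples_per_prompt].

    Each sample's advantage is its reward standardized against the other
    samples drawn for the same prompt.
    """
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],   # prompt 1: two of four samples correct
                        [0.0, 0.0, 0.0, 1.0]])  # prompt 2: one of four samples correct
print(grpo_advantages(rewards))
```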
FlashInfer: Kernel Library for LLM Serving
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
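To show roughly what 4-bit group-wise weight storage looks like, here is a generic quantize/dequantize round trip. This is not AutoAWQ's implementation and omits AWQ's activation-aware scaling of salient channels; it only illustrates the per-group scale and zero-point format:

```python
import torch

def quantize_4bit_groupwise(w: torch.Tensor, group_size: int = 128):
    """Asymmetric 4-bit quantization with one scale/zero-point per weight group.

    w: [out_features, in_features]; in_features must be divisible by group_size.
    Returns integer codes in [0, 15], per-group scales/zero points, and the
    dequantized reconstruction.
    """
    out_f, in_f = w.shape
    g = w.reshape(out_f, in_f // group_size, group_size)
    w_min = g.amin(dim=-1, keepdim=True)
    w_max = g.amax(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-8) / 15.0
    zero = (-w_min / scale).round()
    codes = ((g / scale) + zero).round().clamp(0, 15)
    dequant = (codes - zero) * scale
    return codes.to(torch.uint8), scale, zero, dequant.reshape(out_f, in_f)

w = torch.randn(16, 256)
codes, scale, zero, w_hat = quantize_4bit_groupwise(w)
print((w - w_hat).abs().max())  # reconstruction error introduced by 4-bit storage
```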
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Sky-T1: Train your own O1 preview model within $450
verl: Volcano Engine Reinforcement Learning for LLMs
Train transformer language models with reinforcement learning.
Minimal reproduction of DeepSeek R1-Zero
Recipes to scale inference-time compute of open models
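The simplest inference-time scaling recipe is best-of-N sampling: draw several candidate answers and keep the one a verifier or reward model scores highest. The sketch below uses hypothetical `generate` and `score` callables as stand-ins for a sampler and a reward model; it is a conceptual illustration, not code from the repository:

```python
from typing import Callable

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> str:
    """Sample n candidates and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    scores = [score(prompt, c) for c in candidates]
    return candidates[max(range(n), key=scores.__getitem__)]

# Toy stand-ins for the sampler and reward model (hypothetical):
answers = iter(["42", "41", "43", "42 because 6*7=42"])
pick = best_of_n("What is 6*7?",
                 generate=lambda p: next(answers),
                 score=lambda p, a: float("42" in a) + 0.1 * ("because" in a),
                 n=4)
print(pick)
```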
Reverse Engineering the Abstraction and Reasoning Corpus
Code for 1st place solution to Kaggle's Abstraction and Reasoning Challenge
Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient"
Machine Learning Engineering Open Book
Training LLMs with QLoRA + FSDP
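For the QLoRA entry above, a minimal pure-PyTorch sketch of the LoRA part is shown below: a frozen base linear layer plus a trainable low-rank update. Real QLoRA stores the frozen weights in 4-bit NF4 (e.g., via bitsandbytes) and shards training with FSDP, both of which this sketch omits:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank (LoRA) update.

    In QLoRA the frozen weight would be kept in 4-bit NF4 and dequantized
    on the fly; here it stays in full precision to keep the sketch short.
    """
    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False           # only the adapter matrices are trained
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)    # adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

layer = LoRALinear(nn.Linear(1024, 1024), r=16)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # ~32k trainable parameters vs ~1M frozen ones
```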