Stars
[CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving
[ArXiv 2025] Pseudo-Simulation for Autonomous Driving; [NeurIPS 2024] NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking
PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
Official repo for consistency models.
Official inference repo for FLUX.1 models
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
Fully open reproduction of DeepSeek-R1
Witness the aha moment of VLM with less than $3.
A Python 3 library for evaluating captions with BLEU, METEOR, CIDEr, SPICE, ROUGE_L, and WMD scores. Forked from https://github.com/ruotianluo/coco-caption
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
Comparison between the Fréchet Video Distance implementation from StyleGAN-V and the one from the original paper
Google Research
[CVPR 2024] MAPLM: A Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding
An implementation of 1D, 2D, and 3D positional encoding in PyTorch and TensorFlow
Ongoing research training transformer models at scale
🐍 Geometric Computer Vision Library for Spatial AI
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
StyleGAN2-ADA - Official PyTorch implementation
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Learning from synthetic data - code and models
PyTorch code and models for the DINOv2 self-supervised learning method.
[ECCV 2024] Official GitHub repository for the paper "LingoQA: Visual Question Answering for Autonomous Driving"
Pretrain and finetune ANY AI model of ANY size on multiple GPUs or TPUs with zero code changes.
A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.