Starred repositories
PyTorch implementation (unofficial) of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.
An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL).
[ICML 2025] Gaussian Mixture Flow Matching Models (GMFlow)
A SOTA open-source image editing model that aims to provide performance comparable to closed-source models like GPT-4o and Gemini 2 Flash.
MAGI-1: Autoregressive Video Generation at Scale
SkyReels-V2: Infinite-length Film Generative model
Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful text-to-speech, image generation, and video generation APIs.
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
"FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching" — FlowAR employs the simplest scale design and is compatible with any VAE.
The official code of "Weak-to-Strong Diffusion with Reflection".
[ICLR 2025] The code of Z-Sampling, proposed in our paper "Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection".
Wan: Open and Advanced Large-Scale Video Generative Models
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
This is the first paper to explore how to effectively use RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reasoning ca…
The official implementation of HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
[ICLR 2025] Official PyTorch implementation of "Forgetting Transformer: Softmax Attention with a Forget Gate"
CogView4, CogView3-Plus and CogView3(ECCV 2024)
A playbook for systematically maximizing the performance of deep learning models.
A framework for data augmentation for 2D and 3D image classification and segmentation
Muon optimizer: >30% sample efficiency gain with <3% wall-clock overhead
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation