A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 2,846 162 Updated May 28, 2025

FoundationVision / Infinity

[CVPR 2025 Oral] Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 1,337 73 Updated Apr 24, 2025

Tencent-Hunyuan / HunyuanVideo

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 10,406 940 Updated Jun 3, 2025

zju3dv / street_gaussians

[ECCV 2024] Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

Python 1,114 72 Updated Dec 31, 2024

hustvl / DiffusionDrive

[CVPR 2025 Highlight] Truncated Diffusion Model for Real-Time End-to-End Autonomous Driving

Python 784 52 Updated Jun 17, 2025

NVlabs / DoRA

[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation

Python 799 57 Updated Oct 1, 2024

hustvl / Senna

Bridging Large Vision-Language Models and End-to-End Autonomous Driving

Python 398 26 Updated Dec 26, 2024

THUDM / CogVideo

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 11,578 1,121 Updated Jun 17, 2025

AILab-CVC / SEED-X

Multimodal Models in Real World

Jupyter Notebook 513 21 Updated Feb 24, 2025

FoundationVision / OmniTokenizer

[NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization.

Python 297 7 Updated Jul 9, 2024

FoundationVision / LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,780 81 Updated Aug 15, 2024

zympsyche / BevWorld

111 6 Updated Jul 9, 2024

OpenDriveLab / Vista

[NeurIPS 2024] A Generalizable World Model for Autonomous Driving

Python 753 50 Updated Dec 12, 2024

hustvl / ViG

[AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention

Python 111 1 Updated Jun 17, 2024

hustvl / DiG

[CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Python 167 8 Updated Mar 1, 2025

FoundationVision / VAR

[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…

Jupyter Notebook 8,272 511 Updated May 18, 2025

hustvl / Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Python 3,460 243 Updated Feb 13, 2025

opendilab / LMDrive

[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models

Jupyter Notebook 771 64 Updated Apr 14, 2025

fengdelin / FloorplanNet

A method that can match the 3D point cloud sub-map generated by the robot during the SLAM process with the 2D map.

Python 19 2 Updated Oct 4, 2022

STAR-Center / osmAG

HTML 7 1 Updated Aug 26, 2023

wenyuqing / panacea

[CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"

Python 231 12 Updated Aug 15, 2024

Yifu Zhang ifzhang

Stars

wenyuqing / rosa

SandAI-org / MAGI-1

OpenGVLab / InternVideo

hustvl / LightningDiT

LemonTwoL / ReNeg

FoundationVision / Liquid

zju3dv / street_crafter

sihyun-yu / REPA

CompVis / discrete-interpolants

facebookresearch / flow_matching