Stars
Matrix-Game: Interactive World Foundation Model
[ARXIV'25] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
[SIGGRAPH Asia 2024, Best Paper Honorable Mention] This is the official implementation of our SIGGRAPH Asia journal article: TEXGen: a Generative Diffusion Model for Mesh Textures
A lightweight and highly efficient training framework for accelerating diffusion tasks.
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
Text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
In 2024, the strongest open-source implementation of asymmetric magvit_v2 supports inference code but excludes VQVAE. It supports the joint encoding of images and videos, accommodating arbitrary vi…
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Official codes of VEnhancer: Generative Space-Time Enhancement for Video Generation
Lumina-T2X is a unified framework for Text to Any Modality Generation
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
An Open-source Toolkit for LLM Development
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Open-Sora: Democratizing Efficient Video Production for All
[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters
[CVPR 2024] CoSeR: Bridging Image and Language for Cognitive Super-Resolution
[ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models
[CVPR 2024] Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
[ICLR 2024 Spotlight] Official implementation of ScaleCrafter for higher-resolution visual generation at inference time.
Official repo for VGen: a holistic video generation ecosystem built on diffusion models
Implementation of MagViT2 Tokenizer in Pytorch
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch
[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
Official repo for VideoComposer: Compositional Video Synthesis with Motion Controllability
ICLR 2024 (Spotlight) - SEAL: A Framework for Systematic Evaluation of Real-World Super-Resolution