Stars
Accelerating Diffusion Transformers with Token-wise Feature Caching
An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
GPT-ImgEval: Evaluating GPT-4o’s state-of-the-art image generation capabilities
[CVPR 2024 Highlight] Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer
Official repository for "Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration", which has been accepted by CVPR 2025.
Neuroscience Inspired Agent Reasoning Framework
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Awesome Unified Multimodal Models
An AI-driven daily arXiv paper crawler, analyzer, and organizer tool, focusing on AIGC
Recommend new arXiv papers of interest to you daily, according to your Zotero library.
DreamO: A Unified Framework for Image Customization
MAGI-1: Autoregressive Video Generation at Scale
[CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
SkyReels-V2: Infinite-length Film Generative model
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
🚀🚀🚀A curated list of papers on controllable video generation.
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Official PyTorch implementation of One-Minute Video Generation with Test-Time Training