Stars
Official repository for "Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment"
Official implementation of "AR-RAG: Autoregressive Retrieval Augmentation for Image Generation".
[ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision
🔥 Official impl. of "DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction"
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
Wan: Open and Advanced Large-Scale Video Generative Models
SkyReels-V2: Infinite-length Film Generative model
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[ICCV 2025] Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
Scalable and memory-optimized training of diffusion models
Enjoy the magic of Diffusion models!
[TPAMI 2025] Learning Efficient Deep Discriminative Spatial and Temporal Networks for Video Deblurring
[CVPR 2025] Efficient Video Super-Resolution for Real-time Rendering with Decoupled G-buffer Guidance
Towards Unified Deep Image Deraining: A Survey and A New Benchmark (TPAMI 2025)
Cascaded Temporal Updating Network for Efficient Video Super-Resolution
[CVPR 2025] FaithDiff for Classic Film Rejuvenation, Old Photo Revival, Social Media Restoration, Image Enhancement and AIGC Enhancement.
Collection of papers and repos for multimodal chain-of-thought
[ICCV 2025] Light-A-Video: Training-free Video Relighting via Progressive Light Fusion
Improving Video Generation with Human Feedback
Janus-Series: Unified Multimodal Understanding and Generation Models
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
PyTorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
A Vue-based project page template for academic papers (in development). https://junyaohu.github.io/academic-project-page-template-vue
[ICCV 2025] FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration