Stars
Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"
[CVPR 2025] MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model
A universal foundation model for grounded biomedical image interpretation
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation
Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions" (CVPR 2024)
[CVPR 2025 Best Paper Award Candidate] VGGT: Visual Geometry Grounded Transformer
[CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space, a fast and high-quality motion diffusion model
[CVPR 2025] Official Implementation of "MixerMDM: Learnable Composition of Human Motion Diffusion Models".
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
[ARXIV'25] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers
Official implementation of TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
[CVPR 2025 Highlight] GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control
FlashMLA: Efficient MLA decoding kernels
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
[CVPR 2025 Highlight] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild. ECCV 2022.
A generative world for general-purpose robotics & embodied AI learning.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[CVPR'25 Highlight] You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Scalable and memory-optimized training of diffusion models
GeoCalib: Learning Single-image Calibration with Geometric Optimization (ECCV 2024)
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
[SIGGRAPH'24] 2D Gaussian Splatting for Geometrically Accurate Radiance Fields