- Huazhong University of Science & Technology
- Wuhan, China
- https://leoshen917.github.io/
Stars
DECA: Detailed Expression Capture and Animation (SIGGRAPH 2021)
FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation
A high-fidelity 3D face reconstruction library from monocular RGB image(s)
This repository offers a faster, easy-to-use 3DMM tracking pipeline with FaceVerse V4 (CVPR 2022), a full head model that includes separate eyeballs, teeth, and tongue.
Enjoy the magic of Diffusion models!
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography
A SOTA open-source image editing model that aims to provide performance comparable to closed-source models such as GPT-4o and Gemini 2 Flash.
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
Wan: Open and Advanced Large-Scale Video Generative Models
MAGI-1: Autoregressive Video Generation at Scale
(CVPR 2025) DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
[ICLR 2024] Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting
Scalable and memory-optimized training of diffusion models
DUSt3R + Gaussian Splatting
📹 A more flexible framework that can generate videos at any resolution and create videos from images.
Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Official implementation of TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Official implementation of Continuous 3D Perception Model with Persistent State
[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
This is the official implementation of our Señorita-2M [Weights and Dataset]: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists
[CVPR 2025 Best Paper Award Candidate] VGGT: Visual Geometry Grounded Transformer
[ARXIV'25] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
[ARXIV'25] GameFactory: Creating New Games with Generative Interactive Videos
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion.
(CVPR25) Exploring Contextual Attribute Density in Referring Expression Counting