Stars
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Janus-Series: Unified Multimodal Understanding and Generation Models
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
FMBoost: Boosting Latent Diffusion with Flow Matching (ECCV 2024 Oral)
[ECCV 2024, Oral] Official code release for Flying With Photons: Rendering Novel Views of Propagating Light.
ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild (ECCV 2024)
Gaussian Haircut: Human Hair Reconstruction with Strand-Aligned 3D Gaussians
Official PyTorch implementation of "Expressive Whole-Body 3D Gaussian Avatar", ECCV 2024.
High-resolution models for human tasks.
Supporting PyTorch models with the Google AI Edge TFLite runtime.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Official repo for "Images that Sound": diffusion-generated spectrograms that can be viewed as images and played as sound
[CVPR 2024] The official repo for FlashAvatar
[AAAI 2025] DepthFM: Fast Monocular Depth Estimation with Flow Matching
HAHA: Highly Articulated Gaussian Human Avatars with Textured Mesh Prior
Inference code for "StylePeople: A Generative Model of Fullbody Human Avatars" paper. This code is for the part of the paper describing video-based avatars.
[CVPR 2024 Highlight] The official repo for "GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians"
[CVPR 2024] Official PyTorch implementation of SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering
[CVPR 2024] Code for "Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling"
[SIGGRAPH Asia '23] FLARE: Fast Learning of Animatable and Relightable Mesh Avatars
[SIGGRAPH '23] NeRSemble: Neural Radiance Field Reconstruction of Human Heads