Stars
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[ECCV 2024] GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation
[NeurIPS 2024] VFIMamba: Video Frame Interpolation with State Space Models
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
[CVPR 2024] Sparse Global Matching for Video Frame Interpolation with Large Motion
A curated list of recent diffusion models for video generation, editing, and various other applications.
[ICCV 2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
[ICCV 2023] SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos
[ACM MM 2023] Lightweight Super-Resolution Head for Human Pose Estimation
[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation