BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
Resources for Multiple Object Tracking (MOT)
A curated list of awesome Deep Stereo Matching resources
A curated list of awesome computer vision resources
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Official repository for CF-Font: Content Fusion for Few-shot Font Generation.
Strong, open-source foundation models for image recognition.
[ICLR'25] Official PyTorch implementation of "Framer: Interactive Frame Interpolation".
[SIGGRAPH 2024] Motion I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
[CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
MotionDirector training for AnimateDiff. Train a MotionLoRA and run it on any compatible AnimateDiff UI.
Official Code for MotionCtrl [SIGGRAPH 2024]
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
APISR: Anime Production Inspired Real-World Anime Super-Resolution (CVPR 2024)
[ECCV 2022] Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework
Generative Models by Stability AI
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
This is the official implementation of our CVPR 2024 paper "BlockGCN: Redefine Topology Awareness for Skeleton-Based Action Recognition"
Learning Generative Structure Prior for Blind Text Image Super-resolution [CVPR 2023]
ModelScope: bring the notion of Model-as-a-Service to life.