Starred repositories
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.
The simplest, fastest repository for training/finetuning small-sized VLMs.
Janus-Series: Unified Multimodal Understanding and Generation Models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Enjoy the magic of Diffusion models!
A minimal and universal controller for FLUX.1.
FastVideo is a unified framework for accelerated video generation.
Scripts and doc for https://www.dolthub.com/repositories/chenditc/investment_data
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.
A PyTorch native platform for training generative AI models
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos.
[CVPR2025] We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference ima…
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
🎬 人人影视 bot and website, containing all 人人影视 resources plus cloud-drive shares from many users
HunyuanVideo: A Systematic Framework For Large Video Generation Model
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
InstantIR: Blind Image Restoration with Instant Generative Reference 🔥
Example models using DeepSpeed
Unofficial PyTorch Implementation for paper FlashFace
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
A general fine-tuning kit geared toward diffusion models.