Stars
Official implementation for "Stable Flow: Vital Layers for Training-Free Image Editing" [CVPR 2025]
[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-e…
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Official inference repo for FLUX.1 models
SwinIR: Image Restoration Using Swin Transformer (official repository)
Collect super-resolution related papers, data, repositories
Build your own Face App with Stable Diffusion 2.1
[CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Official repo for Images that sound: a special spectrogram that can be seen as images and played as sound generated by diffusions
A collection of resources and papers on Diffusion Models
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
WhisperPlus: Faster, Smarter, and More Capable 🚀
Rembg is a tool to remove image backgrounds
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Unifying Variational Autoencoder (VAE) implementations in PyTorch (NeurIPS 2022)
High-quality QR Code generator library in Java, TypeScript/JavaScript, Python, Rust, C++, C.
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
Official Code for Stable Cascade
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
Image-to-Image Translation in PyTorch
Image-to-image translation with conditional adversarial nets