Stars
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
[TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation.
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
[IJCV 2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
ChatGLM-6B: An Open Bilingual Dialogue Language Model
StableLM: Stability AI Language Models
Code and documentation to train Stanford's Alpaca models, and generate the data.
Transfer the ControlNet with any base model in diffusers 🔥
Official Implementation for "Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models" (SIGGRAPH 2023)
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
Tools to train a generative model on arbitrary audio samples
A latent text-to-image diffusion model
REALY: Rethinking the Evaluation of 3D Face Reconstruction (ECCV 2022)
Barbershop: GAN-based Image Compositing using Segmentation Masks (SIGGRAPH Asia 2021)
Just playing with getting VQGAN+CLIP running locally, rather than having to use Colab.
Summary of related papers on visual attention. Related code based on Jittor will be released gradually.
Blendshape and kinematics calculator for MediaPipe/TensorFlow.js Face, Eyes, Pose, and Finger tracking models.
Code accompanying the "DeepBach: a Steerable Model for Bach Chorales Generation" paper
AI-powered Text-to-Art Generator - Text2Art.com
PyTorch implementation of AnimeGANv2
A powerful cross-platform raw photo processing program
This is the curriculum for "Learn Computer Vision" by Siraj Raval on YouTube
[TOG 2022] SofGAN: A Portrait Image Generator with Dynamic Styling
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.