Stars
[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"
A port of muerrilla's sd-webui-Detail-Daemon as a node for ComfyUI, to adjust sigmas that control detail.
Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Training released! Surpasses GPT-4o in ID persistence! Official ComfyUI workflow release! Only 4GB VRAM is enou…
DreamO: A Unified Framework for Image Customization
Implementation for Describe Anything: Detailed Localized Image and Video Captioning
Programmer's guide to cooking at home (Simplified Chinese only).
Wan: Open and Advanced Large-Scale Video Generative Models
[CVPR 2024] Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention
[CVPR 2024] Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
⚡️ Open Source No Code Web Data Extraction Platform • Turn Websites To APIs & Spreadsheets In Minutes ⚡️
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface"
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
PDF craft can convert PDF files into various other formats. This project will focus on processing PDF files of scanned books.
Minimal code and examples for running inference with Sapiens foundation human models in PyTorch
3D Human Part Segmentation with Point Transformer
Code Implementation of "PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data"
[arXiv 2024] GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting
DiffuEraser is a diffusion model for video inpainting that achieves strong content completeness and temporal consistency while maintaining acceptable efficiency.
[CVPR 2025] Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters
[CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation"
[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation