Starred repositories
Document Scanning With TensorFlow And OpenCV
DreamO: A Unified Framework for Image Customization
Image editing is worth a single LoRA! 0.1% training data and 1% training parameters for fantastic image editing! Surpasses GPT-4o in ID persistence! Official ComfyUI workflow release! Only 4GB VRAM…
A SOTA open-source image editing model that aims to provide performance comparable to closed-source models like GPT-4o and Gemini 2 Flash.
[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks
The official code for “Deep Unrestricted Document Image Rectification”, TMM, 2023.
Blind Geometric Distortion Correction on Images Through Deep Learning
📝 A curated list of image rectification papers.
OpenOCR: A general OCR system with accuracy and efficiency. Supports 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, with the latest methods to be added over time.
(CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting.
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
[ICLR 2025 spotlight] 3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation
Batch computation of the linear assignment problem on GPU.
InstantIR: Blind Image Restoration with Instant Generative Reference 🔥
hykilpikonna / HiDream-I1-nf4
Forked from HiDream-ai/HiDream-I1
4Bit Quantized Model for HiDream I1
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Advanced Vision Model Loader for ComfyUI
GenEval: An object-focused framework for evaluating text-to-image alignment
PosterMaker [CVPR 2025] https://poster-maker.github.io/
Official implementation code of the paper <AnyText2: Visual Text Generation and Editing With Customizable Attributes>