Starred repositories
🌈 A cross-platform software for selected-text translation and OCR.
Unofficial PyTorch implementation of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.
A collection of diffusion model papers categorized by their subareas
[CVPR 2025] h-Edit: Effective and Flexible Diffusion-Based Editing via Doob’s h-Transform
Official repository of T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
[ICLR 2025] "Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances" (Official Implementation)
A final sanity checklist to help your CS paper get accepted, not desk rejected.
Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024)
Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Training released! Surpasses GPT-4o in ID persistence! Official ComfyUI workflow release! Only 4GB VRAM is enou…
Linux kernel driver for Xbox One and Xbox Series X|S accessories
The official repo of AIGC Image Quality Assessment via Image-Prompt Correspondence [CVPRW2024, NTIRE2024].
PixelHacker: Image Inpainting with Structural and Semantic Consistency
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Concept Lancet: Image Editing with Compositional Representation Transplant (CVPR 2025)
A SOTA open-source image editing model that aims to provide performance comparable to closed-source models such as GPT-4o and Gemini 2 Flash.
pkunlp-icler / UltraEdit
Forked from HaozheZhao/UltraEdit. [NeurIPS 2024] Code for paper: UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
Scripts to teach Flux the task of image editing from language with the Flux Control framework.
The official repository of "SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model"
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
An open-source implementation for fine-tuning the Qwen2-VL and Qwen2.5-VL series by Alibaba Cloud.
[CVPR2025] Any-Resolution AI-Generated Image Detection by Spectral Learning
Official repository of the paper “IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer”
This is the code for the paper "PIMoG: An Effective Screen-shooting Noise-Layer Simulation for Deep-Learning-Based Watermarking Network." Fang, Han, et al. Proceedings of the 30th ACM International …
[CVPR 2025] Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution
ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations