Stars
[CVPR2025] RORem: Training a Robust Object Remover with Human-in-the-Loop
Official inference repo for FLUX.1 models
[ICLR2025] Halton Scheduler for Masked Generative Image Transformer
[ICLR 2025] 3D StreetUnveiler with Semantic-aware 2DGS - a simple baseline
PyTorch implementation of 'Learning Latent Embedding Alignment Model for fMRI Decoding and Encoding', in TMLR, 2024
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think!
A Unified Tokenizer for Visual Generation and Understanding
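A small tool for looking up journal rankings, covering the upgraded Chinese Academy of Sciences (CAS) journal partition table (2025, 2023, 2022), the international journal early-warning list (2025, 2024, 2023, 2021, 2020), JCR (2023, 2022, 2021, 2020), the CCF recommended list of international conferences and journals (2022), and the tiered catalog of high-quality science and technology journals in the computing field (2022).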
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
CAR: Controllable AutoRegressive Modeling for Visual Generation
The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
Official Code for "Kinetic Typography Diffusion Model (ECCV 2024)"
Author's Implementation for E-LatentLPIPS
SEED-Voken: A Series of Powerful Visual Tokenizers
Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number]
Open Images is a dataset of ~9 million images that have been annotated with image-level labels and bounding boxes spanning thousands of classes.
Official Implementation and Dataset of "PPR10K: A Large-Scale Portrait Photo Retouching Dataset with Human-Region Mask and Group-Level Consistency", CVPR 2021
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
🔥🔥🔥A curated list of papers on recent diffusion-based high-resolution image and video synthesis works.