- SenseTime Research, HKUST
- Beijing
- https://xingtongge.github.io/
- in/xingtong-ge
Starred repositories
[ICML 2025] Noise Conditional Variational Score Distillation
[ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation
The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv 2025)
A collection of diffusion model papers categorized by subarea
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
[CVPR2024] Diffusion-based Blind Text Image Super-Resolution (Official)
🚀 SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation
An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
Two time-scale update rule for training GANs
A linear estimator on top of CLIP to predict the aesthetic quality of pictures
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
[NeurIPS 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image Generation Evaluation
CLIP+MLP Aesthetic Score Predictor
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)
Official Implementation of Rectified Flow (ICLR2023 Spotlight)
Python tool for converting files and office documents to Markdown.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Official Implementation of Video-T1: Test-Time Scaling for Video Generation
[CVPR 2025 Oral] Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Official repository for "AM-RADIO: Reduce All Domains Into One"
[IEEE TCSVT 2024] Preprocessing Enhanced Image Compression for Machine Vision
Wan: Open and Advanced Large-Scale Video Generative Models
Official implementation of UnifiedReward & UnifiedReward-Think