Effective Diffusion Transformer Architecture for Image Super-Resolution


1 Xidian University   2 Huawei Noah's Ark Lab  
3 CBG, Huawei   4 Chongqing University of Posts and Telecommunications

🔎 Introduction

We propose DiT-SR, an effective diffusion transformer for real-world image super-resolution:

  • Effective yet efficient architecture design;
  • Adaptive Frequency Modulation (AdaFM) for time-step conditioning (see the sketch below).
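
For intuition only, here is a minimal, self-contained sketch of timestep-conditioned frequency modulation in PyTorch. It is not the repository's actual AdaFM implementation; the module name, MLP shape, and the simple per-channel frequency scaling are assumptions chosen purely for illustration.

```python
import torch
import torch.nn as nn
import torch.fft


class AdaFMSketch(nn.Module):
    """Illustrative sketch of timestep-conditioned frequency modulation.

    NOT the repository's implementation; shapes and names are placeholders.
    """

    def __init__(self, channels: int, embed_dim: int):
        super().__init__()
        # Map the timestep embedding to a per-channel modulation scale.
        self.to_scale = nn.Sequential(
            nn.SiLU(),
            nn.Linear(embed_dim, channels),
        )

    def forward(self, x: torch.Tensor, t_emb: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map; t_emb: (B, embed_dim) timestep embedding.
        scale = self.to_scale(t_emb).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        # Modulate in the frequency domain: FFT -> scale -> inverse FFT.
        freq = torch.fft.rfft2(x, norm="ortho")
        freq = freq * (1.0 + scale)
        return torch.fft.irfft2(freq, s=x.shape[-2:], norm="ortho")


if __name__ == "__main__":
    block = AdaFMSketch(channels=64, embed_dim=256)
    feats = torch.randn(2, 64, 32, 32)
    t_emb = torch.randn(2, 256)
    print(block(feats, t_emb).shape)  # torch.Size([2, 64, 32, 32])
```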

⚙️ Dependencies and Installation

git clone https://github.com/kunncheng/DiT-SR.git
cd DiT-SR

conda create -n DiT_SR python=3.10 -y
conda activate DiT_SR
pip install -r requirements.txt

🌈 Training

Datasets

The training data comprises LSDIR, DIV2K, DIV8K, OutdoorSceneTraining, Flickr2K, and the first 10K face images from FFHQ. All image paths are saved to txt files. For simplicity, you can also use just the LSDIR dataset.
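
For reference, a minimal sketch of how such a path list could be produced. The dataset root and output filename below are placeholders, not the repository's actual paths:

```python
from pathlib import Path

# Collect image paths from a dataset root into a plain-text list,
# one absolute path per line (root and output name are placeholders).
root = Path("datasets/LSDIR")
exts = {".png", ".jpg", ".jpeg"}
paths = sorted(p for p in root.rglob("*") if p.suffix.lower() in exts)

with open("lsdir_paths.txt", "w") as f:
    f.writelines(f"{p.resolve()}\n" for p in paths)

print(f"Wrote {len(paths)} image paths")
```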

Pre-trained Models

Several checkpoints should be downloaded to the weights folder, including the autoencoder and other pre-trained models used for loss computation.
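
As a convenience, a small sketch that checks the weights folder is populated before training. The filenames below are placeholders; the actual checkpoint names come from the YAML configs:

```python
from pathlib import Path

# Sanity-check that the expected checkpoint files exist under weights/.
# Filenames are placeholders; take the real names from the configs.
weights_dir = Path("weights")
expected = ["autoencoder.pth", "lpips_vgg.pth"]  # placeholder names

missing = [name for name in expected if not (weights_dir / name).exists()]
if missing:
    raise FileNotFoundError(f"Missing checkpoints in {weights_dir}: {missing}")
print("All expected checkpoints found.")
```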

Training Scripts

Real-world Image Super-resolution

torchrun --standalone --nproc_per_node=8 --nnodes=1 main.py --cfg_path configs/realsr_DiT.yaml --save_dir ${save_dir}

Blind Face Restoration

torchrun --standalone --nproc_per_node=8 --nnodes=1 main.py --cfg_path configs/faceir_DiT.yaml --save_dir ${save_dir}

🚀 Inference and Evaluation

Real-world Image Super-resolution

Real-world datasets: RealSR, RealSet65. Synthetic dataset: LSDIR-Test. The pretrained checkpoints are also required.

bash test_realsr.sh

Blind Face Restoration

Real-world datasets: LFW, WebPhoto, Wider. Synthetic dataset: CelebA-HQ. The pretrained checkpoints are also required.

bash test_faceir.sh

We are unable to release the synthetic datasets (LSDIR-Test and CelebA-HQ) due to corporate review restrictions; however, you can generate them yourself using these scripts.
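
As a generic illustration of building a paired test set, the snippet below creates 4x bicubic low-resolution inputs from high-resolution images. The repository's own scripts define the exact degradation pipeline, which is more elaborate than plain bicubic downsampling; the directory names and scale here are placeholders:

```python
from pathlib import Path
from PIL import Image

# Create 4x bicubic low-resolution images from high-resolution ones.
# Paths and scale are placeholders; the repo's scripts define the real pipeline.
hr_dir, lr_dir, scale = Path("testsets/HR"), Path("testsets/LR_x4"), 4
lr_dir.mkdir(parents=True, exist_ok=True)

for hr_path in sorted(hr_dir.glob("*.png")):
    img = Image.open(hr_path).convert("RGB")
    lr = img.resize((img.width // scale, img.height // scale), Image.BICUBIC)
    lr.save(lr_dir / hr_path.name)
```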

🎓 Citation

If you find our work useful in your research, please consider citing:

@inproceedings{cheng2025effective,
  title={Effective diffusion transformer architecture for image super-resolution},
  author={Cheng, Kun and Yu, Lei and Tu, Zhijun and He, Xiao and Chen, Liyu and Guo, Yong and Zhu, Mingrui and Wang, Nannan and Gao, Xinbo and Hu, Jie},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={3},
  pages={2455--2463},
  year={2025}
}

❤️ Acknowledgement

We sincerely appreciate the code release of the following projects: ResShift, DiT, FFTFormer, SwinIR, SinSR, and BasicSR.
