Jaewon Min1*,
Jin Hyeon Kim2*,
Paul Hyunbin Cho1,
Jaeeun Lee3,
Jihye Park4,
Minkyu Park4,
Sangpil Kim2†,
Hyunhee Park4†,
Seungryong Kim1†
1 KAIST AI · 2 Korea University · 3 Yonsei University · 4 Samsung Electronics
* Equal contribution. † Co-corresponding authors.
- 🌈 2025.06.24 - TAIR Demo code released!
- ❤️ 2025.06.23 - Training code released!
- 🤗 2025.06.19 - SA-Text and Real-Text datasets are released along with the dataset pipeline!
- 📄 2025.06.12 - arXiv paper is released!
- 🚀 2025.06.01 - Official launch of the repository and project page!
SA-Text is a newly proposed dataset for the Text-Aware Image Restoration (TAIR) task. It is built from the SA-1B dataset using our dataset pipeline and consists of 100K image-text instance pairs with detailed scene-level annotations. Real-Text is an evaluation dataset for real-world scenarios, constructed from RealSR and DRealSR using the same pipeline.
| Split | Hugging Face 🤗 | Google Drive 📁 |
|---|---|---|
| SA-Text | | |
| Real-Text | | |
- Each image is paired with one or more text instances with polygon-level annotations.
- The dataset follows a consistent annotation format, detailed in the dataset pipeline.
- We recommend using the dataset from Google Drive for testing our code.
sa_text/
├── images/                     # 100K high-quality scene images with text instances
└── restoration_dataset.json # Annotations
real_text/
├── HQ/ # High-quality images
├── LQ/ # Low-quality degraded inputs
└── real_benchmark_dataset.json # Annotations
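As a rough illustration of how the polygon-level annotations can be consumed, here is a minimal Python sketch. The field names (`image`, `annotations`, `text`, `polygon`) are assumptions for illustration only; consult `restoration_dataset.json` and the dataset pipeline for the actual schema.

```python
import json

# Hypothetical single entry mirroring the assumed schema of
# restoration_dataset.json; real field names may differ.
raw = json.dumps({
    "image": "images/sa_000123.jpg",
    "annotations": [
        {"text": "OPEN", "polygon": [[10, 20], [90, 20], [90, 55], [10, 55]]},
        {"text": "24 HOURS", "polygon": [[12, 60], [150, 60], [150, 95], [12, 95]]},
    ],
})

entry = json.loads(raw)

def iter_text_instances(entry):
    """Yield (text, polygon) pairs for one image entry."""
    for ann in entry["annotations"]:
        yield ann["text"], ann["polygon"]

texts = [t for t, _ in iter_text_instances(entry)]
print(texts)  # ['OPEN', '24 HOURS']
```

In practice you would load the full JSON file from disk and iterate over all entries the same way.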
conda create -n tair python=3.10 -y
conda activate tair
pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
cd detectron2
pip install -e .
cd testr
pip install -e .
- Run the bash script `download_weights.sh` to download the pretrained weights for the image restoration module. Additionally, download the pretrained text spotting module from this link and place it in the `./weights` directory.
- Download the SA-Text dataset using the Google Drive link provided above. Once downloaded, unzip the contents and place the folder in your working directory.
Our text-aware restoration model, TeReDiff, comprises two main modules: an image restoration module and a text spotting module. Training is conducted in three stages:
- Stage 1: Train only the image restoration module.
- Stage 2: Train only the text spotting module.
- Stage 3: Jointly train both modules.
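The three-stage schedule can be sketched as a simple freeze/unfreeze policy. This is an illustrative sketch only: the class and attribute names below are hypothetical and do not reflect the actual TeReDiff code, which is driven by the per-stage configuration files.

```python
# Minimal sketch of the three-stage training schedule. "trainable" stands in
# for toggling requires_grad on a module's parameters in a real PyTorch setup.
class Module:
    def __init__(self, name):
        self.name = name
        self.trainable = False

restoration = Module("image_restoration")   # hypothetical names
spotting = Module("text_spotting")

def set_stage(stage):
    # Stage 1: restoration only; Stage 2: spotting only; Stage 3: both.
    restoration.trainable = stage in (1, 3)
    spotting.trainable = stage in (2, 3)

set_stage(1)
print(restoration.trainable, spotting.trainable)  # True False
set_stage(3)
print(restoration.trainable, spotting.trainable)  # True True
```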
- Run the following bash script for Stage 1 training. Its configuration file can be found here. Refer to the comments within the configuration file for a detailed explanation of each setting.
bash run_script/train_script/run_train_stage1_terediff.sh
- Run the following bash script for Stage 2 training. Its configuration file can be found here.
bash run_script/train_script/run_train_stage2_terediff.sh
- Run the following bash script for Stage 3 training. Its configuration file can be found here.
bash run_script/train_script/run_train_stage3_terediff.sh
Download the released checkpoint of our model (TeReDiff) from here, and set the appropriate parameters in the demo configuration file here. Then, run the script below to perform a demo on low-quality images and generate high-quality, text-aware restored outputs. The results will be saved in val_demo_result/ by default.
bash run_script/val_script/run_val_terediff.sh
Running the demo script above will generate the following restoration results. The visualized images are shown in the order: Low-Quality (LQ) image / Restored image / High-Quality (HQ) Ground Truth image. Note that when the text in the LQ images is severely degraded, the model may fail to accurately restore the textual content due to insufficient visual information.
If you find our work useful for your research, please consider citing it :)
@article{min2025text,
title={Text-Aware Image Restoration with Diffusion Models},
author={Min, Jaewon and Kim, Jin Hyeon and Cho, Paul Hyunbin and Lee, Jaeeun and Park, Jihye and Park, Minkyu and Kim, Sangpil and Park, Hyunhee and Kim, Seungryong},
journal={arXiv preprint arXiv:2506.09993},
year={2025}
}