Paper link https://www.sciencedirect.com/science/article/pii/S0957417424029014
arxiv link https://arxiv.org/abs/2311.17629
RQFormer is an end-to-end transformer-based oriented object detector.
RRoI Attention is shown below.
Selective Distinct Query is shown below.
Please refer to Installation for more detailed instructions.
Note: Our code is based on the newest version, mmrotate-1.x, not mmrotate-0.x.
Note: All of our code can be found under './projects/RQFormer/'.
You can also copy this code into your own mmrotate-1.x codebase.
DOTA and DIOR-R: Please refer to Preparation for detailed data preparation steps.
ICDAR2015: (1) Download the ICDAR2015 dataset from the official link. (2) Arrange the data in the following structure:
root
├── icdar2015
│ ├── ic15_textdet_train_img
│ ├── ic15_textdet_train_gt
│ ├── ic15_textdet_test_img
│ ├── ic15_textdet_test_gt
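Before training, it can help to confirm that the ICDAR2015 folders match the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_icdar2015_dirs` is hypothetical, and it only checks that the four sub-folders exist under `<root>/icdar2015`.

```python
from pathlib import Path

# Sub-folders listed in the data structure above.
EXPECTED_SUBDIRS = [
    "ic15_textdet_train_img",
    "ic15_textdet_train_gt",
    "ic15_textdet_test_img",
    "ic15_textdet_test_gt",
]


def missing_icdar2015_dirs(root):
    """Return the expected sub-folders that are absent under <root>/icdar2015.

    Hypothetical helper for illustration; an empty list means the layout
    matches the structure described above.
    """
    base = Path(root) / "icdar2015"
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]
```

For example, `missing_icdar2015_dirs("root")` returns an empty list when all four folders are in place.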
- We train DIOR-R on a single 2080ti with batch 2.
python tools/train.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.py
- We train DOTA-v1.0 on a single 2080ti with batch 2.
python tools/train.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.0.py
- We train DOTA-v1.5 on two 2080ti with batch 4 (2 images per gpu).
bash tools/dist_train.sh projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.py 2
- We train DOTA-v2.0 on two 2080ti with batch 4 (2 images per gpu).
bash tools/dist_train.sh projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav2.0.py 2
- We train ICDAR2015 on two 2080ti with batch 4 (2 images per gpu).
bash tools/dist_train.sh projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_160e_icdar2015.py 2
- We also implement Oriented DDQ, adapted from DDQ. It is trained on DIOR-R on a single 2080ti with batch 2.
python tools/train.py projects/RQFormer/configs/oriented_ddq_le90_r50_q300_layer2_1x_dior.py
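The training commands above follow one pattern: `tools/train.py` for a single GPU and `tools/dist_train.sh` with a trailing GPU count for several. The sketch below only assembles those command lines and the resulting total batch size. The function names `build_train_command` and `total_batch` are hypothetical and not part of mmrotate.

```python
def build_train_command(config, num_gpus=1):
    """Assemble the training command in the form used above.

    Hypothetical helper: one GPU uses tools/train.py, several GPUs use
    tools/dist_train.sh followed by the GPU count.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        return ["python", "tools/train.py", config]
    return ["bash", "tools/dist_train.sh", config, str(num_gpus)]


def total_batch(num_gpus, images_per_gpu):
    """Total batch size, e.g. 2 GPUs x 2 images per GPU = batch 4."""
    return num_gpus * images_per_gpu
```

The returned list can be passed to `subprocess.run` from the repository root, or joined with spaces to reproduce the commands shown above.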
- Test on DIOR-R
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.pth
- Test on DOTA-v1.0
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.0.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.0.pth
Upload results to DOTA official website.
- Test on DOTA-v1.5
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.pth
Upload results to DOTA official website.
- Test on DOTA-v2.0
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav2.0.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav2.0.pth
Upload results to DOTA official website.
- Test on ICDAR2015
(1) Generate the result file submit.zip:
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_160e_icdar2015.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_160e_icdar2015.pth
(2) Calculate precision, recall and F-measure. The script.py is adapted from the official website.
pip install Polygon3
python projects/icdar2015_evaluation/script.py -g=gt.zip -s=submit.zip
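For reference, the F-measure is the harmonic mean of precision and recall. The snippet below is a small sketch of that formula only, assuming precision and recall have already been obtained from the evaluation script. The function name `f_measure` is hypothetical and is not taken from script.py.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall: F = 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall reported for ICDAR2015 in the results table.
    print(f_measure(0.850406504065, 0.7554164660568))
```

With the ICDAR2015 precision and recall from the results table, this gives a value close to the reported F-measure of about 0.8001.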
RQFormer :
Dataset | mAP | Backbone | lr schd | batch | Angle | Query | Configs | Aug | Download |
---|---|---|---|---|---|---|---|---|---|
DIOR-R | 67.31 | R50 | 3x | 2 | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.py | - | model / log |
DOTA-v1.0 | 75.04 | R50 | 2x | 2 | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.0.py | single scale | model / log / results |
DOTA-v1.5 | 67.43 | R50 | 2x | 2gpu*2img | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.py | single scale | model / log / results |
DOTA-v2.0 | 53.28 | R50 | 2x | 2gpu*2img | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav2.0.py | single scale | model / log / results |
Dataset | P | R | F-measure | Backbone | lr schd | batch | Angle | Query | Configs | Download |
---|---|---|---|---|---|---|---|---|---|---|
ICDAR2015 | 0.850406504065 | 0.7554164660568 | 0.800101988781 | R50 | 160e | 2gpu*2img | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_160e_icdar2015.py | model / log / submit |
Oriented DDQ :
Dataset | mAP | Backbone | lr schd | batch | Angle | Query | Configs | Download |
---|---|---|---|---|---|---|---|---|
DIOR-R | 61.66 | R50 | 1x | 2 | le90 | 300 | oriented_ddq_le90_r50_q300_layer2_1x_dior.py | model / log |
Oriented DDQ + RRoI Attention :
Dataset | mAP | Backbone | lr schd | batch | Angle | Query | Configs | Download |
---|---|---|---|---|---|---|---|---|
DIOR-R | 61.67 | R50 | 1x | 2 | le90 | 300 | oriented_ddq_le90_r50_q300_layer2_rroiattn_1x_dior.py | model / log |
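The config filenames in the tables above appear to encode several of the table columns (angle version, backbone, query count, schedule, dataset). The sketch below reads those fields back out of a filename. It is an illustration based only on the names listed here: `parse_config_name` is a hypothetical helper, and tokens whose meaning is not stated in this README (such as `sq1`, `dq1`, `t0.85`) are deliberately left unparsed.

```python
import re


def parse_config_name(filename):
    """Extract table fields from a config filename (hypothetical helper).

    Assumes the naming pattern seen in the tables above: underscore-separated
    tokens with the dataset as the final token.
    """
    stem = filename[:-3] if filename.endswith(".py") else filename
    tokens = stem.split("_")
    info = {"dataset": tokens[-1]}
    for tok in tokens[:-1]:
        if re.fullmatch(r"le\d+", tok):          # angle version, e.g. le90
            info["angle"] = tok
        elif re.fullmatch(r"r\d+", tok):         # backbone, e.g. r50 -> R50
            info["backbone"] = tok.upper()
        elif re.fullmatch(r"q\d+", tok):         # number of queries, e.g. q500
            info["queries"] = int(tok[1:])
        elif re.fullmatch(r"\d+(x|e)", tok):     # schedule, e.g. 3x or 160e
            info["schedule"] = tok
    return info
```

For `rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.py` this yields angle `le90`, backbone `R50`, 500 queries, schedule `3x` and dataset `dior`, which agrees with the DIOR-R row of the RQFormer table.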
If you find RQFormer useful in your research, please consider citing:
@article{zhao2025rqformer,
title={RQFormer: Rotated Query Transformer for end-to-end oriented object detection},
author={Zhao, Jiaqi and Ding, Zeyu and Zhou, Yong and Zhu, Hancheng and Du, Wen-Liang and Yao, Rui and El Saddik, Abdulmotaleb},
journal={Expert Systems with Applications},
volume={266},
pages={126034},
year={2025},
publisher={Elsevier}
}
Our code is built on:
@inproceedings{zhou2022mmrotate,
title = {MMRotate: A Rotated Object Detection Benchmark using PyTorch},
author = {Zhou, Yue and Yang, Xue and Zhang, Gefan and Wang, Jiabao and Liu, Yanyi and
Hou, Liping and Jiang, Xue and Liu, Xingzhao and Yan, Junchi and Lyu, Chengqi and
Zhang, Wenwei and Chen, Kai},
booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
pages = {7331--7334},
numpages = {4},
year={2022}
}
@inproceedings{zhang2023dense,
title={Dense Distinct Query for End-to-End Object Detection},
author={Zhang, Shilong and Wang, Xinjiang and Wang, Jiaqi and Pang, Jiangmiao and Lyu, Chengqi and Zhang, Wenwei and Luo, Ping and Chen, Kai},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={7329--7338},
year={2023}
}