(ESWA2025) RQFormer : Rotated Query Transformer for end-to-end oriented object detection

Paper link https://www.sciencedirect.com/science/article/pii/S0957417424029014

arxiv link https://arxiv.org/abs/2311.17629

Introduction

RQFormer is an end-to-end transformer-based oriented object detector.

RRoI Attention is shown below.

Selective Distinct Query is shown below.

Installation

Please refer to Installation for more detailed instructions.

Note: Our code is based on the newest version, mmrotate-1.x, not mmrotate-0.x.

Note: All of our code can be found under './projects/RQFormer/'.

You can also copy this code into your own mmrotate-1.x codebase.

Data Preparation for Oriented Detection

DOTA and DIOR-R : Please refer to Preparation for more detailed data preparation.

ICDAR2015 : (1) Download ICDAR2015 dataset from official link. (2) The data structure is as follows:

root
├── icdar2015
│   ├── ic15_textdet_train_img
│   ├── ic15_textdet_train_gt
│   ├── ic15_textdet_test_img
│   ├── ic15_textdet_test_gt
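
Before training, the layout above can be checked with a few lines of Python. This is only a sketch: the helper name `missing_icdar2015_dirs` and its `root` argument are illustrative and are not part of this repository.

```python
from pathlib import Path

# Sub-directories listed in the ICDAR2015 layout above.
EXPECTED_DIRS = (
    "ic15_textdet_train_img",
    "ic15_textdet_train_gt",
    "ic15_textdet_test_img",
    "ic15_textdet_test_gt",
)


def missing_icdar2015_dirs(root):
    """Return the expected sub-directories that are absent under <root>/icdar2015."""
    base = Path(root) / "icdar2015"
    return [name for name in EXPECTED_DIRS if not (base / name).is_dir()]
```

Calling `missing_icdar2015_dirs("root")` returns an empty list when all four folders are present, and otherwise lists the missing ones.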

Training

  1. We train on DIOR-R with a single 2080ti and batch size 2.
python tools/train.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.py
  2. We train on DOTA-v1.0 with a single 2080ti and batch size 2.
python tools/train.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.0.py
  3. We train on DOTA-v1.5 with two 2080ti and batch size 4 (2 images per GPU).
bash tools/dist_train.sh projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.py 2
  4. We train on DOTA-v2.0 with two 2080ti and batch size 4 (2 images per GPU).
bash tools/dist_train.sh projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav2.0.py 2
  5. We train on ICDAR2015 with two 2080ti and batch size 4 (2 images per GPU).
bash tools/dist_train.sh projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_160e_icdar2015.py 2
  6. We also implement Oriented DDQ, adapted from DDQ. It is trained on DIOR-R with a single 2080ti and batch size 2.
python tools/train.py projects/RQFormer/configs/oriented_ddq_le90_r50_q300_layer2_1x_dior.py
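
The commands above follow one pattern: single-GPU runs call `tools/train.py` with the config, and multi-GPU runs call `tools/dist_train.sh` with the config and the number of GPUs. The sketch below only assembles those command lines; `build_train_command` is a hypothetical helper name and is not provided by this repository or by mmrotate.

```python
import shlex


def build_train_command(config, num_gpus=1):
    """Assemble the training command in the same form as the commands listed above."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        # Single-GPU runs call the training script directly.
        return ["python", "tools/train.py", config]
    # Multi-GPU runs go through the distributed launcher; the GPU count is the last argument.
    return ["bash", "tools/dist_train.sh", config, str(num_gpus)]


# Example: the DOTA-v1.5 config on two GPUs.
config = ("projects/RQFormer/configs/"
          "rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.py")
print(shlex.join(build_train_command(config, num_gpus=2)))
```

The returned list can be passed to `subprocess.run` from the mmrotate root directory if the run should be launched from Python.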

Testing

  1. Test on DIOR-R
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.pth
  2. Test on DOTA-v1.0
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.0.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.0.pth

Upload the results to the official DOTA website.

  3. Test on DOTA-v1.5
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.pth

Upload the results to the official DOTA website.

  4. Test on DOTA-v2.0
python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav2.0.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav2.0.pth

Upload the results to the official DOTA website.

  5. Test on ICDAR2015

(1) Generate the result file submit.zip.

python tools/test.py projects/RQFormer/configs/rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_160e_icdar2015.py rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_160e_icdar2015.pth

(2) Calculate precision, recall and F-measure. The script.py is adapted from the official website.

pip install Polygon3
python projects/icdar2015_evaluation/script.py -g=gt.zip -s=submit.zip
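
As a sanity check on the numbers printed by the evaluation, the F-measure can be recomputed from precision and recall. The sketch below assumes the F-measure is the harmonic mean of the two, which is how it is commonly reported for ICDAR2015; it does not reproduce script.py, and `f_measure` is an illustrative name.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of the F-measure)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the ICDAR2015 row in the Main Result section.
print(f_measure(0.850406504065, 0.7554164660568))
```

With the precision and recall from the ICDAR2015 row, this agrees with the F-measure reported there.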

Main Result

RQFormer :

| Dataset | mAP | Backbone | lr schd | batch | Angle | Query | Configs | Aug | Download |
|---|---|---|---|---|---|---|---|---|---|
| DIOR-R | 67.31 | R50 | 3x | 2 | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.85_3x_dior.py | - | model / log |
| DOTA-v1.0 | 75.04 | R50 | 2x | 2 | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.0.py | single scale | model / log / results |
| DOTA-v1.5 | 67.43 | R50 | 2x | 2gpu*2img | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav1.5.py | single scale | model / log / results |
| DOTA-v2.0 | 53.28 | R50 | 2x | 2gpu*2img | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_2x_dotav2.0.py | single scale | model / log / results |

| Dataset | P | R | F-measure | Backbone | lr schd | batch | Angle | Query | Configs | Download |
|---|---|---|---|---|---|---|---|---|---|---|
| ICDAR2015 | 0.850406504065 | 0.7554164660568 | 0.800101988781 | R50 | 160e | 2gpu*2img | le90 | 500 | rroiformer_le90_r50_q500_layer2_sq1_dq1_t0.9_160e_icdar2015.py | model / log / submit |

Oriented DDQ :

| Dataset | mAP | Backbone | lr schd | batch | Angle | Query | Configs | Download |
|---|---|---|---|---|---|---|---|---|
| DIOR-R | 61.66 | R50 | 1x | 2 | le90 | 300 | oriented_ddq_le90_r50_q300_layer2_1x_dior.py | model / log |

Oriented DDQ + RRoI Attention :

| Dataset | mAP | Backbone | lr schd | batch | Angle | Query | Configs | Download |
|---|---|---|---|---|---|---|---|---|
| DIOR-R | 61.67 | R50 | 1x | 2 | le90 | 300 | oriented_ddq_le90_r50_q300_layer2_rroiattn_1x_dior.py | model / log |

Visualization

Citing RQFormer

If you find RQFormer useful in your research, please consider citing:

@article{zhao2025rqformer,
  title={RQFormer: Rotated Query Transformer for end-to-end oriented object detection},
  author={Zhao, Jiaqi and Ding, Zeyu and Zhou, Yong and Zhu, Hancheng and Du, Wen-Liang and Yao, Rui and El Saddik, Abdulmotaleb},
  journal={Expert Systems with Applications},
  volume={266},
  pages={126034},
  year={2025},
  publisher={Elsevier}
}

Recommendation

Our code builds on:

@inproceedings{zhou2022mmrotate,
  title   = {MMRotate: A Rotated Object Detection Benchmark using PyTorch},
  author  = {Zhou, Yue and Yang, Xue and Zhang, Gefan and Wang, Jiabao and Liu, Yanyi and
             Hou, Liping and Jiang, Xue and Liu, Xingzhao and Yan, Junchi and Lyu, Chengqi and
             Zhang, Wenwei and Chen, Kai},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages = {7331--7334},
  numpages = {4},
  year={2022}
}

@inproceedings{zhang2023dense,
  title={Dense Distinct Query for End-to-End Object Detection},
  author={Zhang, Shilong and Wang, Xinjiang and Wang, Jiaqi and Pang, Jiangmiao and Lyu, Chengqi and Zhang, Wenwei and Luo, Ping and Chen, Kai},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7329--7338},
  year={2023}
}
