DOI: 10.1145/3573428.3573449
research-article

Multi-scale Fusion based Multi-stage Small Object Detection in Aerial Images

Published: 15 March 2023

Abstract

In aerial images, most objects are small. Objects are also numerous and vary widely in scale, which makes it difficult to extract features for objects of multiple scales at the same time. Objects are usually densely distributed, which makes them difficult to localize. These factors pose great challenges to feature extraction for aerial image objects and in turn reduce detection performance. This paper therefore proposes a multi-scale fusion based multi-stage small object detection method (MSMSD) for aerial images. MSMSD adopts EfficientNet as the feature extraction backbone and adds deformable convolution blocks to improve detection of objects with irregular shapes. NAS-FPN is used to fuse multi-scale features effectively. A cascade detection mechanism is also designed to reduce noisy detections and mis-detections in this task. In the experiments, the proposed MSMSD outperforms five benchmark object detection algorithms on two aerial image datasets. The experimental results demonstrate that MSMSD can handle the small object detection task in aerial images efficiently.
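The abstract does not describe how multi-scale features are combined beyond naming NAS-FPN, whose fusion topology is found by architecture search. As a rough illustration of the general idea only, the sketch below uses a plain top-down, FPN-style fusion instead (nearest-neighbour upsampling of the coarser map followed by element-wise addition). It is not the MSMSD implementation; the type `Grid`, the function names, and the toy single-channel maps are all hypothetical.

```python
from typing import List

# Hypothetical toy type: one single-channel feature map as nested lists.
Grid = List[List[float]]


def upsample2x(fmap: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling: each cell becomes a 2x2 block."""
    out: Grid = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # duplicate columns
        out.append(wide)
        out.append(list(wide))  # duplicate the row
    return out


def add(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two equally sized maps."""
    assert len(a) == len(b)
    assert all(len(ra) == len(rb) for ra, rb in zip(a, b))
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(pyramid: List[Grid]) -> List[Grid]:
    """Fuse a pyramid ordered fine -> coarse.

    Assumes each level has half the resolution of the previous one.
    This is plain FPN-style fusion, not the searched NAS-FPN topology.
    """
    fused = [pyramid[-1]]  # the coarsest level passes through unchanged
    for fine in reversed(pyramid[:-1]):
        # Inject coarse context into the next finer level.
        fused.insert(0, add(fine, upsample2x(fused[0])))
    return fused


# Toy input: a 4x4 fine map and a 2x2 coarse map.
fine = [[1.0] * 4 for _ in range(4)]
coarse = [[1.0, 2.0], [3.0, 4.0]]
fused_fine, fused_coarse = top_down_fuse([fine, coarse])
```

In this toy setup the fine map receives context from the coarse map, which is the property that tends to help small objects: high-resolution cells keep their spatial detail while gaining information from levels with a larger receptive field. A real detector would operate on multi-channel tensors and apply learned convolutions before and after the addition.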

Published In

EITCE '22: Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering
October 2022
1999 pages
ISBN: 9781450397148
DOI: 10.1145/3573428
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Computer vision
  2. Deep learning
  3. Image processing
  4. Multi-scale
  5. Small object detection

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EITCE 2022

Acceptance Rates

Overall Acceptance Rate 508 of 972 submissions, 52%


Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 39
  • Downloads (Last 12 months): 14
  • Downloads (Last 6 weeks): 1

Reflects downloads up to 06 Jan 2025
