MFRENet: efficient detection of drone image based on multiscale feature aggregation and receptive field expanded

Hao Chen¹,
Wenzhu Yang¹,²,
Guoyu Zhou¹,
Guodong Zhang¹ &
Zhaoyu Nian¹

Abstract

Object detection in drone-captured images is attracting growing research interest. However, most drone images contain many densely packed small objects, so detecting these objects efficiently and classifying them accurately remains a formidable challenge. To address these problems, we introduce MFRENet, an effective object detection network for drone images based on Multiscale Feature aggregation and Receptive field Expansion. First, we design the Receptive Field Expanded Feature Extraction Module (RFEFE), which improves the model's ability to perceive objects with irregular shapes and varying sizes. Next, we introduce the Multiscale Cross Stage Parallel Feature Fusion Module (MCSPFF), which integrates the RFEFE module, and add the Shuffle Attention module so that MCSPFF captures more semantic information. Then, we propose the Extended Simplified Spatial Pyramid Pooling-Fast and Feature Enhancement Module (ESimSPP2FE), which is inspired by the attention mechanism and enhances the features of small objects. Finally, we add a detection head dedicated to small targets, which further strengthens the model's detection ability. Comprehensive experiments on the VisDrone2021-DET dataset compare the proposed model with the YOLOv8m baseline. Relative to YOLOv8m, the proposed model improves mAP by 1.9% and AP50 by 2.7%. The code is available at https://github.com/chenhao-123-sudo/MFRENet-achive.
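As a rough illustration of why small objects are hard to score well under AP50, the sketch below computes the intersection over union (IoU) of a predicted and a ground-truth box and applies the 0.5 IoU threshold that AP50 conventionally uses to count a detection as correct. It is not taken from the MFRENet code; the function names `iou` and `is_true_positive` and the example boxes are hypothetical, chosen only for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold (0.5 for AP50)."""
    return iou(pred, gt) >= threshold


# Hypothetical boxes: the same 2-pixel horizontal shift applied to a
# 10x10 object and to a 4x4 object.
large_gt, large_pred = (0, 0, 10, 10), (2, 0, 12, 10)
small_gt, small_pred = (0, 0, 4, 4), (2, 0, 6, 4)

print(is_true_positive(large_pred, large_gt))  # IoU = 80 / 120
print(is_true_positive(small_pred, small_gt))  # IoU = 8 / 24
```

With these example boxes, the 2-pixel localization error leaves the larger object above the 0.5 threshold (IoU of 2/3) but pushes the smaller one below it (IoU of 1/3), which is one reason densely packed small objects depress detection metrics.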

Availability of data and materials

The VisDrone dataset that supports the findings of this study is available at https://gitcode.com/visdrone/visdrone-dataset/overview.

Funding

The authors thank the Natural Science Foundation of Hebei Province (F2024201012) and the Post-graduate’s Innovation Fund Project of Hebei University (HBU2024SS032) for their financial support and the support of the High-Performance Computing Center of Hebei University.

Author information

Authors and Affiliations

School of Cyber Security and Computer, Hebei University, Baoding, 071002, China
Hao Chen, Wenzhu Yang, Guoyu Zhou, Guodong Zhang & Zhaoyu Nian
Machine Vision Engineering Research Center, Hebei University, Baoding, 071002, China
Wenzhu Yang

Contributions

CH: Conceptualization, Methodology, Software, Visualization, Writing—original draft. WY: Supervision, Investigation, Writing—review and editing. GZ: Reviewing, Language Correction. GZ: Language Correction. ZN: Language Correction.

Corresponding author

Correspondence to Wenzhu Yang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, H., Yang, W., Zhou, G. et al. MFRENet: efficient detection of drone image based on multiscale feature aggregation and receptive field expanded. Pattern Anal Applic 27, 120 (2024). https://doi.org/10.1007/s10044-024-01337-1

Received: 10 February 2024
Accepted: 05 September 2024
Published: 21 September 2024
DOI: https://doi.org/10.1007/s10044-024-01337-1
