Integrating Weighted Feature Fusion and the Spatial Attention Module with Convolutional Neural Networks for Automatic Aircraft Detection from SAR Images
"> Figure 1
<p>The efficient framework for aircraft detection.</p> "> Figure 2
<p>The framework of the Efficient Weighted Feature Fusion and Attention Network (EWFAN) algorithm.</p> "> Figure 3
<p>The structure of weighted bi-directional feature pyramid network.</p> "> Figure 4
<p>Schematic diagram of Residual Spatial Attention Module.</p> "> Figure 5
<p>Schematic diagram of Adaptively Spatial Feature Fusion (Taking ASFF-2 as an example).</p> "> Figure 6
<p>Classification regression network.</p> "> Figure 7
<p>Aircraft target aspect ratio distribution.</p> "> Figure 8
<p>The problems of IoU Loss.</p> "> Figure 9
<p>The schematic diagram of Non-Maximum Suppression (NMS).</p> "> Figure 10
<p>Effectiveness of the proposed airport detection algorithm. The green box represents the correctly detected aircraft, and the red box represents false alarms.</p> "> Figure 11
<p>The experiment result for Airport I. (<b>a</b>) SAR image of Airport Ifrom Gaofen-3. (<b>b</b>) The ground truth of the airport. (<b>c</b>–<b>e</b>) the aircraft detection results of (<b>a</b>) by EfficientDet, YOLOv4, and EWFAN. The green box, the red box, and the yellow box represent the correctly detected aircrafts, the false alarms, and the missed alarms respectively.</p> "> Figure 12
<p>The experiment result for Airport I. (<b>a</b>) SAR image of Airport I from Gaofen-3. (<b>b</b>) The ground truth of the airport. (<b>c</b>–<b>e</b>) the aircraft detection results of (<b>a</b>) by EfficientDet, YOLOv4, and EWFAN. The green box, the red box and the yellow box represent the correctly detected aircrafts, the false alarms, and the missed alarms respectively.</p> "> Figure 13
<p>The experiment result for Airport I. (<b>a</b>) SAR image of Airport I from Gaofen-3. (<b>b</b>) The ground truth of the airport. (<b>c</b>–<b>e</b>) the aircraft detection results of (<b>a</b>) by EfficientDet, YOLOv4, and EWFAN. The green box, the red box, and the yellow box represent the correctly detected aircrafts, the false alarms, and the missed alarms respectively.</p> "> Figure 13 Cont.
<p>The experiment result for Airport I. (<b>a</b>) SAR image of Airport I from Gaofen-3. (<b>b</b>) The ground truth of the airport. (<b>c</b>–<b>e</b>) the aircraft detection results of (<b>a</b>) by EfficientDet, YOLOv4, and EWFAN. The green box, the red box, and the yellow box represent the correctly detected aircrafts, the false alarms, and the missed alarms respectively.</p> ">
Abstract
1. Introduction
- (1) An efficient and automatic aircraft detection framework is proposed to support efficient airport management in the civil field and, in the military field, real-time acquisition of battlefield intelligence and the formulation of combat plans. First, the airport runway areas are extracted; then, aircraft are detected using the Efficient Weighted Feature Fusion and Attention Network (EWFAN); finally, the runway area is employed to filter out false alarms. This framework provides a generalizable workflow for aircraft detection and achieves high-precision, rapid detection.
- (2) EWFAN is proposed to perform aircraft detection by integrating SAR image analytics and deep neural networks. It effectively combines the weighted feature fusion module and the spatial attention mechanism with the CIF loss function. The network is lightweight, offering both high detection accuracy and fast detection speed; it provides an important reference for other scholars and can also be extended to the detection of other dense targets in SAR images, such as vehicles and ships.
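The three-stage workflow of contribution (1) can be sketched as a simple pipeline. The stage functions below are hypothetical stubs standing in for airport runway extraction and EWFAN inference, not the authors' actual code; only the filtering logic of the final stage is spelled out.

```python
def extract_runway_mask(image):
    """Stage 1 (stub): return the runway area as a rectangle (x0, y0, x1, y1)."""
    return (100, 100, 900, 400)

def detect_aircraft_ewfan(image):
    """Stage 2 (stub): return candidate boxes as (x0, y0, x1, y1, score)."""
    return [(150, 200, 180, 230, 0.92),   # inside the runway area
            (500, 300, 540, 340, 0.88),   # inside the runway area
            (950, 700, 990, 740, 0.75)]   # outside -> likely false alarm

def inside_mask(box, mask):
    """Stage 3: keep a detection only if its center lies in the runway area."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return mask[0] <= cx <= mask[2] and mask[1] <= cy <= mask[3]

def detect(image):
    mask = extract_runway_mask(image)
    return [b for b in detect_aircraft_ewfan(image) if inside_mask(b, mask)]

print(len(detect(None)))  # 2 of the 3 candidates survive mask filtering
```

The point of the design is that false alarms from runway-like clutter outside the airport are removed cheaply, without retraining the detector.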
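The weighted feature fusion named in contribution (2) builds on BiFPN (Section 3.3.1), whose fast normalized fusion combines input feature maps with learned non-negative weights, O = Σᵢ wᵢ·Iᵢ / (ε + Σⱼ wⱼ). A minimal NumPy sketch under that assumption; the weight values here are purely illustrative:

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fast normalized fusion as used in BiFPN: scale each input feature
    map by a learned non-negative weight and normalize the weights so
    the fused output stays in a stable range (cheaper than softmax)."""
    w = np.maximum(weights, 0.0)   # ReLU keeps the learned weights non-negative
    w = w / (w.sum() + eps)        # normalize without softmax
    return sum(wi * f for wi, f in zip(w, features))

# Two 2x2 "feature maps" fused with weights favoring the first input.
f1 = np.ones((2, 2))
f2 = 3 * np.ones((2, 2))
fused = fast_normalized_fusion([f1, f2], np.array([2.0, 1.0]))
print(fused[0, 0])  # close to (2*1 + 1*3)/3 = 1.667
```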
2. State-of-the-Art
3. Methodology
3.1. Overall Detection Framework
3.2. EfficientDet
3.3. EWFAN for Aircraft Detection
3.3.1. Weighted Feature Fusion and Attention Module (WFAM)
- Weighted Bi-directional Feature Pyramid Network (BiFPN)
- Residual Spatial Attention Module (RSAM)
- Adaptively Spatial Feature Fusion (ASFF)
3.3.2. Classification Regression Network and Priori Boxes Generation
3.3.3. CIF Loss Function
3.3.4. Non-Maximum Suppression (NMS)
3.3.5. Using Airport Masks to Remove False Alarms
4. Experimental Results and Analyses
4.1. Data Usage
4.2. Hyperparameter Settings
4.3. Results Evaluation Method
4.4. Analysis of the Role of Airport Detection Algorithms
4.5. The Influence of Different Window Overlap Rates on Aircraft Detection Performance
4.6. Aircraft Detection Performance Analysis
4.6.1. Analysis of Aircraft Detection for Airport I
4.6.2. Analysis of Aircraft Detection for Airport II
4.6.3. Analysis of Aircraft Detection for Airport III
4.7. Analysis and Evaluation
5. Discussion
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
| Network | DR (%) | FAR (%) | MAR (%) | F (false alarms) | M (missed) |
|---|---|---|---|---|---|
| EfficientDet (raw) (512) | 97.7 | 30.6 | 2.6 | 130 | 8 |
| EfficientDet (mask) (512) | 96.4 | 11.8 | 3.6 | 39 | 11 |
| Network | DR (%) | FAR (%) | MAR (%) | F (false alarms) | M (missed) |
|---|---|---|---|---|---|
| EWFAN (raw) (512) | 97.4 | 7.2 | 2.6 | 23 | 8 |
| EWFAN (mask) (512) | 95.4 | 3.3 | 4.6 | 10 | 14 |
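The reported numbers are consistent with the standard definitions DR = true positives / ground truth, FAR = false alarms / total detections, and MAR = missed / ground truth, with F and M being the raw false-alarm and missed-alarm counts. A sketch reproducing the EWFAN (mask) row, assuming 304 ground-truth aircraft (implied by M = 14 at MAR = 4.6%; this count is an inference, not stated in the tables):

```python
def detection_metrics(tp, false_alarms, missed):
    """Return (DR, FAR, MAR) in percent from raw detection counts."""
    gt = tp + missed                                 # ground-truth aircraft
    dr = 100 * tp / gt                               # detection rate
    far = 100 * false_alarms / (tp + false_alarms)   # false alarm rate
    mar = 100 * missed / gt                          # missed alarm rate
    return round(dr, 1), round(far, 1), round(mar, 1)

# EWFAN (mask): F = 10, M = 14; 290 true positives implies 304 aircraft.
print(detection_metrics(290, 10, 14))  # (95.4, 3.3, 4.6)
```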
| Window Overlap Rate | DR (%) | FAR (%) | MAR (%) | F (false alarms) | M (missed) |
|---|---|---|---|---|---|
| EWFAN (10%) | 93.4 | 2.8 | 6.6 | 8 | 20 |
| EWFAN (20%) | 95.4 | 3.3 | 4.6 | 10 | 14 |
| EWFAN (30%) | 95.6 | 5.5 | 4.3 | 17 | 13 |
| Window Overlap Rate | Mean Time |
|---|---|
| EWFAN (10%) | 16.78 s |
| EWFAN (20%) | 18.58 s |
| EWFAN (30%) | 21.23 s |
| Network | Airport | DR (%) | FAR (%) | MAR (%) | F (false alarms) | M (missed) |
|---|---|---|---|---|---|---|
| EfficientDet (512) | Airport I | 98.4 | 6.7 | 1.6 | 9 | 2 |
| | Airport II | 100 | 2.9 | 0 | 1 | 0 |
| | Airport III | 93.0 | 17.8 | 6.3 | 29 | 9 |
| | Total | 96.4 | 11.8 | 3.6 | 39 | 11 |
| YOLOv4 (512) | Airport I | 98.4 | 4.6 | 1.6 | 6 | 2 |
| | Airport II | 100 | 5.7 | 0 | 2 | 0 |
| | Airport III | 92.3 | 10.8 | 7.7 | 16 | 11 |
| | Total | 95.7 | 7.6 | 4.3 | 24 | 13 |
| EWFAN (Ours) (512) | Airport I | 98.4 | 0.8 | 1.6 | 1 | 2 |
| | Airport II | 100 | 0 | 0 | 0 | 0 |
| | Airport III | 91.6 | 6.4 | 8.4 | 9 | 12 |
| | Total | 95.4 | 3.3 | 4.6 | 10 | 14 |
| Network | EfficientDet | YOLOv4 | EWFAN (Ours) |
|---|---|---|---|
| mAP (%) | 92.4 | 95.1 | 97.9 |
| Network | Airport I | Airport II | Airport III | Mean |
|---|---|---|---|---|
| EfficientDet | 17.97 s | 8.89 s | 27.75 s | 18.20 s |
| YOLOv4 | 15.61 s | 8.09 s | 24.04 s | 15.91 s |
| EWFAN (Ours) | 18.21 s | 9.03 s | 28.50 s | 18.58 s |
| Network | Airport I | Airport II | Airport III | Mean |
|---|---|---|---|---|
| EfficientDet | 12.79 s | 11.12 s | 21.85 s | 15.25 s |
| YOLOv4 | 12.40 s | 10.90 s | 20.68 s | 14.66 s |
| EWFAN (Ours) | 12.92 s | 11.19 s | 22.07 s | 15.40 s |
| | Airport I | Airport II | Airport III | Mean |
|---|---|---|---|---|
| Time | 7.60 s | 6.73 s | 9.33 s | 7.89 s |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Wang, J.; Xiao, H.; Chen, L.; Xing, J.; Pan, Z.; Luo, R.; Cai, X. Integrating Weighted Feature Fusion and the Spatial Attention Module with Convolutional Neural Networks for Automatic Aircraft Detection from SAR Images. Remote Sens. 2021, 13, 910. https://doi.org/10.3390/rs13050910