Learning to transfer attention in multi-level features for rotated ship detection

Zhenyu Cui¹ (ORCID: orcid.org/0000-0001-9857-2178),
Ying Liu²,³,
Wei Zhao² &
Cheng Wang⁴

Abstract

Multi-scale object detection is a central problem in object detection and is particularly important for ship detection. To handle objects of different sizes, most advanced detectors based on convolutional neural networks run inference over multiple feature maps of different resolutions. However, existing methods have two critical problems: (1) over-fitted anchor settings and scale-specific supervision restrict the generalization performance of the algorithm; (2) similar multi-resolution prediction branches isolate their feature spaces from one another and prevent each branch from learning from branches at other levels. Drawing on the human cognitive process, this paper proposes a novel structure for multi-scale rotated ship detection, the Feature Attention Transfer module, which generates attention in multi-level feature maps and transfers it between them, instructing each prediction branch to focus on features that are not well extracted by the other branches. Accordingly, a customized supervision method called “Inclusion–Exclusion Learning” is proposed for associative learning based on the prediction results of the multi-scale branches. We employ an anchor-free rotated ship detection framework to verify the proposed module. Extensive experiments on three optical remote sensing image datasets demonstrate the effectiveness of the proposed algorithm, called SKFat. The results show that the proposed modules improve the multi-resolution detection framework while introducing negligible inference overhead. In its best result, the proposed algorithm achieves state-of-the-art average precision while maintaining a high inference speed.
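
The idea of steering one branch toward what another branch has missed can be illustrated with a minimal sketch. This is not the paper's Feature Attention Transfer module, whose design is not described in the abstract. It assumes, purely for illustration, that a coarse branch produces a map of raw confidence scores, that its complement (one minus the sigmoid) is upsampled by nearest-neighbour repetition, and that the result multiplies a single-channel feature map of a finer branch. The names `transfer_attention`, `upsample_nearest`, `coarse_logits` and `fine_features` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a raw score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def transfer_attention(coarse_logits, fine_features, factor=2):
    """Reweight fine-level features by what the coarse branch did NOT respond to.

    coarse_logits: 2D list of raw scores from a coarse branch (assumed input).
    fine_features: 2D list, one channel, at `factor` times the resolution.
    """
    # Attention of the coarse branch: high where it is already confident.
    attention = [[sigmoid(v) for v in row] for row in coarse_logits]
    # Complementary attention: high where the coarse branch is unsure.
    complement = [[1.0 - a for a in row] for row in attention]
    # Bring the mask to the resolution of the finer branch.
    mask = upsample_nearest(complement, factor)
    # Element-wise reweighting of the finer features.
    return [[f * m for f, m in zip(f_row, m_row)]
            for f_row, m_row in zip(fine_features, mask)]


# Toy example: the coarse branch is confident on the left cell only.
coarse = [[4.0, -4.0]]
fine = [[1.0, 1.0, 1.0, 1.0],
        [1.0, 1.0, 1.0, 1.0]]
out = transfer_attention(coarse, fine)
```

In this toy case the left half of `out` is suppressed and the right half is kept almost unchanged, so the finer branch is pushed toward the region the coarse branch did not cover. A learned module would replace the fixed complement with trainable layers and operate on multi-channel tensors.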

Funding

This work was supported in part by the Natural Science Foundation of China under Grant 71671178 and in part by the Equipment Advance Research Fund 6142502180101. It was also supported by the Fundamental Research Funds for the Central Universities.

Author information

Authors and Affiliations

¹ Wangxuan Institute of Computer Technology, Peking University, Beijing, 100871, China
Zhenyu Cui
² School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 101400, China
Ying Liu & Wei Zhao
³ Lab of Data Mining and High Performance Computing, University of Chinese Academy of Sciences, Beijing, China
Ying Liu
⁴ Beijing Raying Technologies, Inc., Beijing, China
Cheng Wang

Corresponding author

Correspondence to Cheng Wang.

Ethics declarations

Conflict of interest

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work. There is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cui, Z., Liu, Y., Zhao, W. et al. Learning to transfer attention in multi-level features for rotated ship detection. Neural Comput & Applic 34, 19831–19844 (2022). https://doi.org/10.1007/s00521-022-07491-z

Received: 13 July 2021
Accepted: 31 May 2022
Published: 08 July 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s00521-022-07491-z
