Camouflage soldier object detection network based on the attention mechanism and pyramidal feature shrinking

Multimedia Tools and Applications

Abstract

Because camouflaged soldiers closely resemble their background, conventional deep learning-based object detection networks suffer from notable false detection and missed detection rates when attempting to detect them. To address these challenges, we propose a camouflage soldier object detection network (AFSNet) based on an attention mechanism and a multi-scale feature fusion strategy. We employ an attention module to enhance the network's capability for feature extraction. Furthermore, we propose a novel multi-scale feature fusion strategy based on pyramidal feature shrinking, which aims to mitigate the interference caused by interpolation and to prevent the information loss caused by pooling during feature fusion. Moreover, we introduce a novel information handle module that enhances the network's capability for feature fusion by regulating the information transmission pathway. Experiments demonstrate that our network achieves better camouflage object detection performance than state-of-the-art networks. Our network achieves 93% AP, an increase of 6.7% over YOLOv7, with almost no additional computational overhead.
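The abstract does not specify how the pyramidal feature shrinking fusion is implemented, so the following is only a minimal sketch of the general idea: adjacent pyramid levels are fused pairwise, so each stage removes one level and no fusion step bridges more than a 2x scale gap. It is not AFSNet. The single-channel list-of-lists feature maps, the nearest-neighbour upsampling, the averaging fusion rule, and the names `upsample2x`, `fuse_adjacent`, and `shrink_pyramid` are all assumptions made for illustration.

```python
from typing import List

# Hypothetical simplification: a feature map is a single-channel H x W grid.
FeatureMap = List[List[float]]


def upsample2x(fm: FeatureMap) -> FeatureMap:
    """Nearest-neighbour 2x upsampling: every value becomes a 2x2 block."""
    out: FeatureMap = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_adjacent(fine: FeatureMap, coarse: FeatureMap) -> FeatureMap:
    """Fuse two neighbouring levels by averaging (an assumed fusion rule).

    The coarse map must be exactly half the resolution of the fine map.
    """
    up = upsample2x(coarse)
    return [[(a + b) / 2.0 for a, b in zip(r1, r2)] for r1, r2 in zip(fine, up)]


def shrink_pyramid(levels: List[FeatureMap]) -> FeatureMap:
    """Fuse adjacent pairs repeatedly; each stage shrinks the pyramid by one level.

    `levels` is ordered from finest to coarsest, each half the size of the previous.
    """
    while len(levels) > 1:
        levels = [
            fuse_adjacent(levels[i], levels[i + 1])
            for i in range(len(levels) - 1)
        ]
    return levels[0]


# Toy three-level pyramid: 4x4, 2x2, and 1x1 maps with constant values.
pyramid = [
    [[1.0] * 4 for _ in range(4)],
    [[3.0] * 2 for _ in range(2)],
    [[5.0]],
]
fused = shrink_pyramid(pyramid)
print(fused[0])  # [3.0, 3.0, 3.0, 3.0]
```

In this toy run the three levels shrink to two (constant maps of 2.0 and 4.0) and then to one 4x4 map of 3.0. A real detector would use learned convolutions on multi-channel tensors rather than a plain average.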

Availability of data and materials

The datasets generated during the current study are not available.

Code Availability

The code generated during the current study is not available.

Funding

This research was supported by the Defense Industrial Technology Development Program (JCKY2021602B029).

Author information

Contributions

Conceptualization, Y.P. and J.W.; methodology, Y.P.; software, Y.P.; validation, Y.P. and Z.Y.; formal analysis, Y.P. and Y.Y.; investigation, Y.P. and Y.S.; resources, J.W.; data curation, Y.P. and Z.Y.; writing-original draft preparation, Y.P.; writing-review and editing, J.W.; visualization, J.W.; supervision, J.W.; project administration, J.W.; funding acquisition, J.W.

Corresponding author

Correspondence to Jianzhong Wang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethics approval

Not applicable

Consent to participate

Not applicable

Consent for publication

All authors approved the final manuscript and the submission to this journal.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Peng, Y., Wang, J., Yu, Z. et al. Camouflage soldier object detection network based on the attention mechanism and pyramidal feature shrinking. Multimed Tools Appl 83, 79917–79938 (2024). https://doi.org/10.1007/s11042-024-18618-w

  • DOI: https://doi.org/10.1007/s11042-024-18618-w
