Double parallel branches FCOS for human detection in a crowd

Qing Song¹, Hao Wang¹, Lu Yang¹, Xueshi Xin¹, Chun Liu¹ & Mengjie Hu¹

Abstract

Anchor-free detectors have become popular because they move prediction from the region level to the pixel level and need fewer hyper-parameters. Most anchor-free algorithms add a center-ness branch to suppress predictions at points far from the center of the target. On pedestrian datasets this indirectly weakens the features of the head, which are the more important ones: in a dense crowd, the head features of humans are critical to alleviating occlusion. To address this problem, we analyze the target-scale statistics of a dense pedestrian dataset and propose the Double Parallel Branches FCOS (DPB-FCOS) detector. Alongside the original prediction branch, we add a head branch that generates additional prediction boxes, and we redefine the positive-sample selection method of this branch so that it produces more prediction boxes at the head position of the human body. In addition, considering three factors (overlap area, distance, and aspect ratio), we design a regression loss that is better suited to anchor-free detectors. The center-point distance in DIoU is replaced by the distance between the upper-left and lower-right corner points, which significantly improves the model's performance. We verify our method on two popular models. Compared with the baselines, our method improves the accuracy of FCOS by 5.9% and of ATSS by 3.8% on the CrowdHuman dataset.
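
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. The `centerness` function follows the standard FCOS definition. The `corner_distance_loss` function is only an assumed reading of the abstract: it takes the DIoU loss and swaps the center-point term for the squared distances between matching upper-left and lower-right corners of the predicted and ground-truth boxes, normalized by the enclosing-box diagonal. The function names, the normalization, and the factor of 2 are illustrative assumptions, not the paper's published formula.

```python
import math


def centerness(l, t, r, b):
    """Standard FCOS center-ness of a location inside a box.

    l, t, r, b are the positive distances from the location to the
    left, top, right and bottom sides of the ground-truth box.
    """
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


def iou(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def corner_distance_loss(pred, gt):
    """ASSUMED corner-distance variant of the DIoU loss (hypothetical form).

    loss = 1 - IoU + (d_tl^2 + d_br^2) / (2 * c^2)
    where d_tl and d_br are the distances between the upper-left and
    lower-right corners of the two boxes, and c is the diagonal of the
    smallest box enclosing both (the same normalizer DIoU uses).
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    # Squared diagonal of the smallest enclosing box.
    enc_w = max(px2, gx2) - min(px1, gx1)
    enc_h = max(py2, gy2) - min(py1, gy1)
    c2 = enc_w * enc_w + enc_h * enc_h
    if c2 == 0:
        # Degenerate boxes: fall back to the plain IoU loss.
        return 1.0 - iou(pred, gt)
    d_tl2 = (px1 - gx1) ** 2 + (py1 - gy1) ** 2
    d_br2 = (px2 - gx2) ** 2 + (py2 - gy2) ** 2
    return 1.0 - iou(pred, gt) + (d_tl2 + d_br2) / (2.0 * c2)


if __name__ == "__main__":
    # A point at head height in a 100 x 300 pedestrian box.
    print(centerness(50, 30, 50, 270))
    # A prediction shifted sideways relative to the ground truth.
    print(corner_distance_loss((0, 0, 10, 10), (5, 0, 15, 10)))
```

For a 100 × 300 pedestrian box, a horizontally centered point 30 pixels below the top edge gets a center-ness of about 0.33, while the box center gets 1.0. This is the down-weighting of head locations that the added head branch is meant to counteract. Because the two corner distances change whenever the width or height of the prediction changes, a term of this kind also reacts to aspect-ratio mismatch, which is consistent with the three factors the abstract lists.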

Author information

Authors and Affiliations

Pattern Recognition and Intelligent Vision Lab, Beijing University of Posts and Telecommunications, Beijing, China
Qing Song, Hao Wang, Lu Yang, Xueshi Xin, Chun Liu & Mengjie Hu

Corresponding author

Correspondence to Qing Song.

Ethics declarations

Conflict of Interests

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Song, Q., Wang, H., Yang, L. et al. Double parallel branches FCOS for human detection in a crowd. Multimed Tools Appl 81, 15707–15723 (2022). https://doi.org/10.1007/s11042-022-12439-5

Received: 16 April 2021
Revised: 08 December 2021
Accepted: 25 January 2022
Published: 01 March 2022
Issue Date: May 2022
DOI: https://doi.org/10.1007/s11042-022-12439-5
