short-paper

NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection

Authors:

Yanwei Fu

ICMR '21: Proceedings of the 2021 International Conference on Multimedia Retrieval

Pages 481 - 485

https://doi.org/10.1145/3460426.3463588

Published: 01 September 2021

Abstract

Non-Maximum Suppression (NMS) is essential for object detection and affects evaluation results by introducing False Positives (FP) and False Negatives (FN), especially in crowded, occluded scenes. In this paper, we raise the problem of the weak connection between training targets and evaluation metrics caused by NMS, and propose a novel NMS-Loss that allows the NMS procedure to be trained end-to-end without any additional network parameters. Our NMS-Loss penalizes two cases: when an FP is not suppressed, and when a correct prediction is wrongly eliminated by NMS, producing an FN. Specifically, we propose a pull loss to pull predictions with the same target close to each other, and a push loss to push predictions with different targets away from each other. Experimental results show that with the help of NMS-Loss, our detector, namely NMS-Ped, achieves a Miss Rate of 5.92% on the Caltech dataset and 10.08% on the CityPersons dataset, both better than state-of-the-art competitors.
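
The two failure cases named in the abstract can be made concrete with a small sketch. The code below runs standard greedy NMS on three toy boxes and then computes a pull and a push penalty. The penalty function is only an illustration of the idea: the hinge-style terms, the name `pull_push_penalty`, the toy boxes, and the 0.5 threshold are all assumptions made here, and they are not the loss formulation from the paper, which the abstract does not spell out.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, thresh):
    """Standard greedy NMS: keep the best box, drop boxes overlapping it above thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep


def pull_push_penalty(boxes, targets, thresh):
    """Hypothetical hinge-style penalties (an assumption, not the paper's formula).

    pull: same-target pairs whose IoU is too low for NMS to merge them (FP risk).
    push: different-target pairs whose IoU is high enough for NMS to drop one (FN risk).
    """
    pull, push = 0.0, 0.0
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            overlap = iou(boxes[i], boxes[j])
            if targets[i] == targets[j] and overlap <= thresh:
                pull += thresh - overlap
            elif targets[i] != targets[j] and overlap > thresh:
                push += overlap - thresh
    return pull, push


# Toy crowded scene: boxes 0 and 1 predict pedestrian 0, box 2 predicts pedestrian 1.
boxes = [(0, 0, 10, 20), (6, 0, 16, 20), (2, 0, 12, 20)]
scores = [0.9, 0.8, 0.7]
targets = [0, 0, 1]

kept = greedy_nms(boxes, scores, thresh=0.5)
pull, push = pull_push_penalty(boxes, targets, thresh=0.5)
print(kept, pull, push)
```

In this toy scene NMS keeps both predictions for pedestrian 0 (a duplicate, hence an FP) and suppresses the only prediction for pedestrian 1 (an FN). The pull term is nonzero for the first pair and the push term is nonzero for the second, which is the behavior the abstract attributes to the two losses.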

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision problems → Object detection

Published In

ICMR '21: Proceedings of the 2021 International Conference on Multimedia Retrieval

August 2021

715 pages

ISBN: 9781450384636

DOI: 10.1145/3460426

General Chairs:
Wen-Huang Cheng, National Yang Ming Chiao Tung University, Taiwan
Mohan Kankanhalli, National University of Singapore, Singapore
Meng Wang, Hefei University of Technology, China

Program Chairs:
Wei-Ta Chu, National Cheng Kung University, Taiwan
Jiaying Liu, Peking University, China
Marcel Worring, University of Amsterdam, Netherlands

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

ICMR '21

Sponsor:

SIGMM

ICMR '21: International Conference on Multimedia Retrieval

August 21 - 24, 2021

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
