CODAN: Counting-driven Attention Network for Vehicle Detection in Congested Scenes

Published: 12 October 2020
DOI: 10.1145/3394171.3413945

Abstract

Although recent object detectors have shown excellent performance for vehicle detection, they perform poorly in scenes with a relatively large number of vehicles. In this paper, we explore dense vehicle detection given the number of vehicles. Existing crowd counting methods cannot be directly applied to dense vehicle detection because their density maps are insufficiently descriptive and because they lack an effective constraint for mining the spatial awareness of dense vehicles. Inspired by these observations, we propose a conceptually simple yet efficient framework, called CODAN, for dense vehicle detection. The proposed approach is composed of three major components: (i) an efficient strategy for generating multi-scale density maps (MDM), designed to represent the vehicle count and to capture the global semantics and spatial information of dense vehicles; (ii) a multi-branch attention module (MAM), proposed to bridge the gap between object counting and the vehicle detection framework; and (iii) an effective counting-awareness loss (C-Loss), which uses the well-designed density maps as explicit supervision and guides attention learning by building a pixel-level constraint. Extensive experiments conducted on four benchmark datasets demonstrate that the proposed method outperforms state-of-the-art methods. The results indicate that vehicle detection and counting can be mutually supportive, which is an important and meaningful finding.
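
The abstract does not give the construction of the density maps or the form of C-Loss, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes that each vehicle is annotated by a center point, that a density map is a sum of normalized Gaussians (a common convention in the counting literature), and that the pixel-level constraint is a mean squared error between an attention map and a density map. The function names, the strides, and the `sigma` value are all hypothetical.

```python
import numpy as np


def gaussian_density_map(centers, height, width, sigma):
    """Sum one normalized 2-D Gaussian per vehicle center (x, y).

    Assumption: centers lie inside the map, so each kernel has nonzero mass.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    density = np.zeros((height, width), dtype=np.float64)
    for cx, cy in centers:
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
        density += g / g.sum()  # each vehicle contributes exactly 1 to the sum
    return density


def multi_scale_density_maps(centers, height, width, strides=(4, 8, 16), sigma=4.0):
    """Build one density map per (hypothetical) feature stride."""
    maps = []
    for s in strides:
        h, w = height // s, width // s
        scaled = [(cx / s, cy / s) for cx, cy in centers]
        # Shrink the kernel with the map, but keep it from collapsing to zero.
        maps.append(gaussian_density_map(scaled, h, w, sigma=max(sigma / s, 0.5)))
    return maps


def pixel_level_loss(attention, density):
    """Assumed stand-in for a pixel-level constraint: mean squared error."""
    return float(np.mean((attention - density) ** 2))
```

Because every kernel is normalized, each map in this sketch sums to the number of annotated vehicles at every scale, which is what lets a density map encode a count and spatial layout at the same time.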

Supplementary Material

MP4 File (3394171.3413945.mp4)
In this video, we explore dense vehicle detection given the number of vehicles. Existing crowd counting methods cannot be directly applied to dense vehicle detection because their density maps are insufficiently descriptive and because they lack an effective constraint for mining the spatial awareness of dense vehicles. We propose a conceptually simple yet efficient framework, called CODAN, for dense vehicle detection. It consists of three major components: (i) an efficient strategy for generating multi-scale density maps (MDM), designed to represent the vehicle count; (ii) a multi-branch attention module (MAM), proposed to bridge the gap between object counting and vehicle detection; and (iii) an effective counting-awareness loss (C-Loss), which uses the well-designed density maps as explicit supervision and guides attention learning by building a pixel-level constraint. Extensive experiments conducted on four datasets demonstrate that the proposed method outperforms state-of-the-art methods.

Information

Published In

MM '20: Proceedings of the 28th ACM International Conference on Multimedia
October 2020
4889 pages
ISBN: 9781450379885
DOI: 10.1145/3394171

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deep learning
  2. intelligent transportation system
  3. vehicle counting
  4. vehicle detection

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Sichuan Science and Technology Program
  • Foundation for Department of Transportation of Henan Province
  • Fundamental Research Funds for the Central Universities

Conference

MM '20

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 31
  • Downloads (Last 6 weeks): 2

Reflects downloads up to 01 Mar 2025

Cited By

  • (2025) A survey of deep learning methods for density estimation and crowd counting. Vicinagearth 2:1. DOI: 10.1007/s44336-024-00011-8. Online publication date: 6-Feb-2025
  • (2024) Identification and counting of the reeling cocoon number per thread for the automatic silk reeling machine. Textile Research Journal. DOI: 10.1177/00405175241260398. Online publication date: 7-Oct-2024
  • (2024) A Novel Framework for Vehicle Detection and Tracking in Night Ware Surveillance Systems. IEEE Access 12 (88075-88085). DOI: 10.1109/ACCESS.2024.3417267. Online publication date: 2024
  • (2024) Eliminating and mining strategies for open-world object proposal. Neurocomputing 599 (128026). DOI: 10.1016/j.neucom.2024.128026. Online publication date: Sep-2024
  • (2024) Adaptive learning-enhanced lightweight network for real-time vehicle density estimation. The Visual Computer 41:4 (2857-2873). DOI: 10.1007/s00371-024-03572-3. Online publication date: 30-Jul-2024
  • (2023) A Smart Traffic Control System Based on Pixel-Labeling and SORT Tracker. IEEE Access 11 (80973-80985). DOI: 10.1109/ACCESS.2023.3299488. Online publication date: 2023
  • (2022) Dense Traffic Detection at Highway-Railroad Grade Crossings. IEEE Transactions on Intelligent Transportation Systems 23:9 (15498-15511). DOI: 10.1109/TITS.2022.3140948. Online publication date: Sep-2022
  • (2022) Learning Selective Assignment Network for Scene-Aware Vehicle Detection. 2022 IEEE International Conference on Image Processing (ICIP) (1366-1370). DOI: 10.1109/ICIP46576.2022.9897860. Online publication date: 16-Oct-2022
  • (2022) WSNet. Knowledge-Based Systems 255:C. DOI: 10.1016/j.knosys.2022.109727. Online publication date: 14-Nov-2022
  • (2022) Le-SKT: Lightweight traffic density estimation method based on structured knowledge transfer. Information Sciences 607 (947-960). DOI: 10.1016/j.ins.2022.06.047. Online publication date: Aug-2022
