DOI: 10.1145/3338533.3366687
research-article

Multi-Dilation Network for Crowd Counting

Published: 10 January 2020

Abstract

With the growth of the urban population, crowd analysis has become an important task in computer vision. Crowd counting, a subfield of crowd analysis, aims to count the number of people in an image or in a region of an image. Because of heavy occlusion, perspective distortion, and variations in illumination, crowd counting remains extremely challenging. Recent state-of-the-art approaches mainly use convolutional neural networks to generate density maps. In this work, the Multi-Dilation Network (MDNet) is proposed for crowd counting in congested scenes. MDNet consists of two parts: a VGG-16 based front end for feature extraction and a back end containing multi-dilation blocks that generate density maps. In particular, each multi-dilation block has four branches that collect features of different sizes. By using dilated convolutions, the multi-dilation block can capture varied features while the maximum kernel size remains 3 × 3. Experiments on two challenging crowd counting datasets, UCF_CC_50 and ShanghaiTech, show that the proposed MDNet outperforms other state-of-the-art methods, with lower mean absolute error and mean squared error. Compared with a network whose multi-scale blocks use larger kernels to extract features, MDNet still achieves competitive performance with fewer model parameters.
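The trade-off between dilated 3 × 3 kernels and larger dense kernels can be illustrated with a little arithmetic. The sketch below is not the authors' implementation: the abstract does not state the dilation rates or channel widths, so `ASSUMED_DILATIONS` and `ASSUMED_CHANNELS` are hypothetical values chosen only for illustration. It uses the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) pixels, and compares the weight count of a dilated 3 × 3 layer with that of a dense layer covering the same extent.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent spanned by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Number of weights in a 2D convolution layer, ignoring the bias term."""
    return kernel_size * kernel_size * in_channels * out_channels


# Hypothetical settings: the abstract says a block has four branches,
# but gives neither the dilation rates nor the channel widths.
ASSUMED_DILATIONS = (1, 2, 3, 4)
ASSUMED_CHANNELS = 64


def compare_branches(dilations=ASSUMED_DILATIONS, channels=ASSUMED_CHANNELS):
    """Return (dilation, extent, dilated_weights, dense_weights) per branch."""
    rows = []
    for d in dilations:
        extent = effective_kernel_size(3, d)
        # A dilated 3x3 layer always has 9 weights per channel pair.
        dilated = conv_weight_count(3, channels, channels)
        # A dense kernel covering the same extent needs extent^2 weights.
        dense = conv_weight_count(extent, channels, channels)
        rows.append((d, extent, dilated, dense))
    return rows


if __name__ == "__main__":
    for d, extent, dilated, dense in compare_branches():
        print(f"dilation={d}: spans {extent}x{extent}, "
              f"dilated 3x3 weights={dilated}, dense {extent}x{extent} weights={dense}")
```

Under these assumed settings, every branch has the same number of weights regardless of dilation, whereas a dense kernel spanning the same area grows quadratically with its side length. This is consistent with the abstract's claim of fewer parameters than multi-scale blocks built from larger kernels.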

Cited By

  • (2022) Densely Enhanced Semantic Network for Conversation System in Social Media. ACM Transactions on Multimedia Computing, Communications, and Applications 18:4 (1-24). DOI: 10.1145/3501799. Online publication date: 4-Mar-2022
  • (2022) Improving Crowd Density Estimation by Fusing Aerial Images and Radio Signals. ACM Transactions on Multimedia Computing, Communications, and Applications 18:3 (1-23). DOI: 10.1145/3492346. Online publication date: 4-Mar-2022
  • (2020) Lightweight solution to background noise in crowd counting. 2020 7th NAFOSTED Conference on Information and Computer Science (NICS) (185-190). DOI: 10.1109/NICS51282.2020.9335834. Online publication date: 26-Nov-2020

Information & Contributors

Information

Published In

MMAsia '19: Proceedings of the 1st ACM International Conference on Multimedia in Asia
December 2019
403 pages
ISBN:9781450368414
DOI:10.1145/3338533
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Crowd counting
  2. convolutional neural network
  3. dilated convolution

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • National Natural Science Foundation of China

Conference

MMAsia '19
Sponsor:
MMAsia '19: ACM Multimedia Asia
December 15 - 18, 2019
Beijing, China

Acceptance Rates

MMAsia '19 Paper Acceptance Rate 59 of 204 submissions, 29%;
Overall Acceptance Rate 59 of 204 submissions, 29%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 27 Jan 2025
