research-article

Multi-Dilation Network for Crowd Counting

Authors:

Qinyu Li

MMAsia '19: Proceedings of the 1st ACM International Conference on Multimedia in Asia

Article No.: 56, Pages 1 - 6

https://doi.org/10.1145/3338533.3366687

Published: 10 January 2020

Abstract

With the growth of urban populations, crowd analysis has become an important task in computer vision. Crowd counting, a subfield of crowd analysis, aims to count the number of people in an image or in a region of an image. Heavy occlusion and variations in perspective and illumination keep the task extremely challenging. Recent state-of-the-art approaches mainly use convolutional neural networks to generate density maps. In this work, the Multi-Dilation Network (MDNet) is proposed for crowd counting in congested scenes. MDNet consists of two parts: a VGG-16-based front end for feature extraction and a back end of multi-dilation blocks that generates density maps. Each multi-dilation block has four branches that collect features at different scales. By using dilated convolutions, the block captures features of varying sizes while the maximum kernel size remains 3 x 3. Experiments on two challenging crowd counting datasets, UCF_CC_50 and ShanghaiTech, show that the proposed MDNet outperforms other state-of-the-art methods, with lower mean absolute error and mean squared error. Compared with a network whose multi-scale blocks adopt larger kernels to extract features, MDNet still achieves competitive performance with fewer model parameters.
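The abstract's point that a dilated 3 x 3 kernel covers a wider area without adding parameters can be checked with a little arithmetic. The sketch below is only an illustration: the dilation rates (1 to 4) and the channel width (64) are assumptions chosen for the example, not values taken from the paper, and the helper names are hypothetical. A kernel of size k with dilation d spans k + (k - 1)(d - 1) pixels per side, while its weight count depends only on k.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length of the window a dilated kernel spans: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Number of weights in one 2D convolution layer, ignoring biases."""
    return kernel_size * kernel_size * in_channels * out_channels


# Assumed settings for illustration only; the abstract does not state them.
DILATION_RATES = (1, 2, 3, 4)
CHANNELS = 64

for d in DILATION_RATES:
    span = effective_kernel_size(3, d)
    dilated = conv_weight_count(3, CHANNELS, CHANNELS)
    dense = conv_weight_count(span, CHANNELS, CHANNELS)
    print(
        f"dilation={d}: 3x3 kernel spans {span}x{span}, "
        f"{dilated} weights vs {dense} for a dense {span}x{span} kernel"
    )
```

Under these assumed settings, every branch keeps the weight count of a plain 3 x 3 layer, whereas a dense kernel covering the same span grows quadratically with its side length. This is the trade-off the abstract refers to when it compares MDNet with multi-scale blocks built from larger kernels.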

Cited By

Li Y, Wang Z, Yu J (2022) Densely Enhanced Semantic Network for Conversation System in Social Media. ACM Transactions on Multimedia Computing, Communications, and Applications 18:4 (1-24). DOI: 10.1145/3501799. Online publication date: 4-Mar-2022
https://dl.acm.org/doi/10.1145/3501799
Yang K, Huang Y, Huang J, Hsu Y, Wan C, Shuai H, Wang L, Cheng W (2022) Improving Crowd Density Estimation by Fusing Aerial Images and Radio Signals. ACM Transactions on Multimedia Computing, Communications, and Applications 18:3 (1-23). DOI: 10.1145/3492346. Online publication date: 4-Mar-2022
https://dl.acm.org/doi/10.1145/3492346
Thai T, Ly N (2020) Lightweight solution to background noise in crowd counting. 2020 7th NAFOSTED Conference on Information and Computer Science (NICS) (185-190). DOI: 10.1109/NICS51282.2020.9335834. Online publication date: 26-Nov-2020
https://doi.org/10.1109/NICS51282.2020.9335834

Index Terms

Multi-Dilation Network for Crowd Counting
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Information & Contributors

Information

Published In

MMAsia '19: Proceedings of the 1st ACM International Conference on Multimedia in Asia

December 2019

403 pages

ISBN:9781450368414

DOI:10.1145/3338533

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China

Conference

MMAsia '19

Sponsor:

SIGMM

MMAsia '19: ACM Multimedia Asia

December 15 - 18, 2019

Beijing, China

Acceptance Rates

MMAsia '19 Paper Acceptance Rate 59 of 204 submissions, 29%;

Overall Acceptance Rate 59 of 204 submissions, 29%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 3
Total Downloads: 150
Downloads (Last 12 months): 6
Downloads (Last 6 weeks): 0

Reflects downloads up to 27 Jan 2025
