Article

Scale Aggregation Network for Accurate and Efficient Crowd Counting

Authors:

Fei Su

Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part V

Pages 757–773

https://doi.org/10.1007/978-3-030-01228-1_45

Published: 08 September 2018

Abstract

In this paper, we propose a novel encoder-decoder network, called Scale Aggregation Network (SANet), for accurate and efficient crowd counting. The encoder extracts multi-scale features with scale aggregation modules, and the decoder generates high-resolution density maps using a set of transposed convolutions. Moreover, we find that most existing works use only Euclidean loss, which assumes that pixels are independent and ignores the local correlation in density maps. Therefore, we propose a novel training loss that combines Euclidean loss with a local pattern consistency loss, which improves the performance of the model in our experiments. In addition, we use normalization layers to ease the training process and apply a patch-based test scheme to reduce the impact of the statistic shift problem. To demonstrate the effectiveness of the proposed method, we conduct extensive experiments on four major crowd counting datasets, and our method achieves superior performance to state-of-the-art methods with far fewer parameters.
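
The abstract does not spell out the local pattern consistency loss, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the consistency term is an SSIM-style similarity computed over small local windows (uniform 3x3 windows here, chosen for brevity) and that the two terms are summed with a weight `alpha`. The function names, the window size, the SSIM constants and the value of `alpha` are all illustrative choices, not values taken from this page.

```python
from typing import List

DensityMap = List[List[float]]  # row-major 2D density map


def euclidean_loss(pred: DensityMap, gt: DensityMap) -> float:
    """Squared per-pixel error averaged over all pixels (treats pixels independently)."""
    n = len(gt) * len(gt[0])
    return sum((p - g) ** 2
               for pred_row, gt_row in zip(pred, gt)
               for p, g in zip(pred_row, gt_row)) / n


def _patch(m: DensityMap, top: int, left: int, size: int) -> List[float]:
    """Flatten the size x size window whose top-left corner is (top, left)."""
    return [m[r][c] for r in range(top, top + size)
            for c in range(left, left + size)]


def mean_local_ssim(pred: DensityMap, gt: DensityMap, size: int = 3,
                    c1: float = 1e-4, c2: float = 9e-4) -> float:
    """Mean SSIM over every size x size window (uniform weights: an assumption)."""
    h, w = len(gt), len(gt[0])
    scores = []
    for top in range(h - size + 1):
        for left in range(w - size + 1):
            x = _patch(pred, top, left, size)
            y = _patch(gt, top, left, size)
            n = len(x)
            mx, my = sum(x) / n, sum(y) / n
            vx = sum((a - mx) ** 2 for a in x) / n
            vy = sum((b - my) ** 2 for b in y) / n
            cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
            scores.append(((2 * mx * my + c1) * (2 * cov + c2))
                          / ((mx * mx + my * my + c1) * (vx + vy + c2)))
    return sum(scores) / len(scores)


def combined_loss(pred: DensityMap, gt: DensityMap, alpha: float = 1e-3) -> float:
    """Euclidean term plus a weighted local-consistency term (alpha is illustrative)."""
    return euclidean_loss(pred, gt) + alpha * (1.0 - mean_local_ssim(pred, gt))


if __name__ == "__main__":
    gt = [[0.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0]]
    shifted = [[0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 0.0]]
    print(combined_loss(gt, gt), combined_loss(shifted, gt))
```

In this sketch the Euclidean term only sees per-pixel differences, while the window-based term also penalizes a prediction whose local structure (mean, variance, covariance within each window) disagrees with the ground truth.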



Published In

Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part V

Sep 2018

829 pages

ISBN: 978-3-030-01227-4

DOI: 10.1007/978-3-030-01228-1

Editors:
Vittorio Ferrari (Google Research, Zurich, Switzerland)
Martial Hebert (Carnegie Mellon University, Pittsburgh, PA, USA)
Cristian Sminchisescu (Google Research, Zurich, Switzerland)
Yair Weiss (Hebrew University of Jerusalem, Jerusalem, Israel)

© Springer Nature Switzerland AG 2018.

Publisher

Springer-Verlag

Berlin, Heidelberg

Bibliometrics & Citations

Bibliometrics

Total Citations: 52
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 02 Mar 2025

Citations

Cited By

Jiang S, Jie Q, Cheng F, Liu Y, Yao K, Li C (2025) Distillation-boosted heterogeneous architecture search for aphid counting. Expert Systems with Applications: An International Journal 265:C. DOI: 10.1016/j.eswa.2024.125936. Online publication date: 15-Mar-2025
https://dl.acm.org/doi/10.1016/j.eswa.2024.125936
Xu J, Zhang Z, Li X, Li W, Yu K, Jin C, He L, Song M, Wang R (2024) Attention Mixture Network for Crowd Counting via Binarization Transfer. Proceedings of the 2nd International Workshop on Multimedia Content Generation and Evaluation: New Methods and Practice, pp. 45-53. DOI: 10.1145/3688867.3690172. Online publication date: 28-Oct-2024
https://dl.acm.org/doi/10.1145/3688867.3690172
Zhou L, Rao S, Li W, Hu B, Sun B (2024) Multi-branch progressive embedding network for crowd counting. Image and Vision Computing 148:C. DOI: 10.1016/j.imavis.2024.105140. Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.1016/j.imavis.2024.105140
Ma C, Zeng J, Shao P, Qing A, Wang Y (2024) MetaUSACC. Expert Systems with Applications: An International Journal 247:C. DOI: 10.1016/j.eswa.2024.123228. Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.1016/j.eswa.2024.123228
Li Y, Jia R, Hu Y, Sun H (2024) A lightweight dense crowd density estimation network for efficient compression models. Expert Systems with Applications: An International Journal 238:PD. DOI: 10.1016/j.eswa.2023.122069. Online publication date: 15-Mar-2024
https://dl.acm.org/doi/10.1016/j.eswa.2023.122069
Zhang Z, Zhang L, Zhang H, Guo Y, Wang H, Lu X (2024) AMSA-CAFF Net. Expert Systems with Applications: An International Journal 237:PC. DOI: 10.1016/j.eswa.2023.121602. Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.1016/j.eswa.2023.121602
Kong W, Yu Z, Li H, Zhang J (2024) Cross-modal misalignment-robust feature fusion for crowd counting. Engineering Applications of Artificial Intelligence 136:PA. DOI: 10.1016/j.engappai.2024.108898. Online publication date: 1-Oct-2024
https://dl.acm.org/doi/10.1016/j.engappai.2024.108898
Hobley M, Prisacariu V (2024) ABC Easy as 123: A Blind Counter for Exemplar-Free Multi-Class Class-Agnostic Counting. Computer Vision – ECCV 2024, pp. 304-319. DOI: 10.1007/978-3-031-73247-8_18. Online publication date: 29-Sep-2024
https://dl.acm.org/doi/10.1007/978-3-031-73247-8_18
Mo H, Zhang X, Tan J, Yang C, Gu Q, Hang B, Ren W (2024) CountFormer: Multi-view Crowd Counting Transformer. Computer Vision – ECCV 2024, pp. 20-40. DOI: 10.1007/978-3-031-72943-0_2. Online publication date: 29-Sep-2024
https://dl.acm.org/doi/10.1007/978-3-031-72943-0_2
Zhang J, Ye L, Wu J, Sun D, Wu C (2023) A Fusion-Based Dense Crowd Counting Method for Multi-Imaging Systems. International Journal of Intelligent Systems 2023. DOI: 10.1155/2023/6677622. Online publication date: 1-Jan-2023
https://dl.acm.org/doi/10.1155/2023/6677622
