DOI: 10.1007/978-3-030-01228-1_45
Article

Scale Aggregation Network for Accurate and Efficient Crowd Counting

Published: 08 September 2018

Abstract

In this paper, we propose a novel encoder-decoder network, called Scale Aggregation Network (SANet), for accurate and efficient crowd counting. The encoder extracts multi-scale features with scale aggregation modules, and the decoder generates high-resolution density maps using a set of transposed convolutions. Moreover, we find that most existing works use only Euclidean loss, which assumes independence among pixels and ignores the local correlation in density maps. Therefore, we propose a novel training loss that combines Euclidean loss with a local pattern consistency loss, which improves the performance of the model in our experiments. In addition, we use normalization layers to ease the training process and apply a patch-based test scheme to reduce the impact of the statistical shift problem. To demonstrate the effectiveness of the proposed method, we conduct extensive experiments on four major crowd counting datasets; our method achieves superior performance to state-of-the-art methods with far fewer parameters.
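
The combined training loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything specific here is an assumption: the local pattern consistency term is modeled as one minus a windowed structural similarity (SSIM) score computed with a uniform 5×5 window, the two terms are summed with a hypothetical weight `alpha`, and the function names are illustrative rather than the authors' own.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def euclidean_loss(pred, target):
    """Mean squared pixel-wise error between two 2-D density maps."""
    return float(np.mean((pred - target) ** 2))


def local_consistency_loss(pred, target, win=5, c1=1e-4, c2=9e-4):
    """One minus the mean SSIM over all win x win windows.

    Assumptions: a uniform (box) window and the usual SSIM stabilizing
    constants for a dynamic range of 1; neither is stated in the abstract.
    """
    # Views of shape (H - win + 1, W - win + 1, win, win), one per window.
    p = sliding_window_view(pred, (win, win))
    t = sliding_window_view(target, (win, win))
    mu_p = p.mean(axis=(-2, -1))
    mu_t = t.mean(axis=(-2, -1))
    # Local variances and covariance via E[xy] - E[x]E[y].
    var_p = (p * p).mean(axis=(-2, -1)) - mu_p * mu_p
    var_t = (t * t).mean(axis=(-2, -1)) - mu_t * mu_t
    cov = (p * t).mean(axis=(-2, -1)) - mu_p * mu_t
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p * mu_p + mu_t * mu_t + c1) * (var_p + var_t + c2)
    )
    return float(1.0 - ssim.mean())


def combined_loss(pred, target, alpha=1e-3):
    """Euclidean term plus a weighted local-consistency term.

    alpha is a hypothetical weight chosen only for illustration.
    """
    return euclidean_loss(pred, target) + alpha * local_consistency_loss(pred, target)
```

Unlike the pixel-wise Euclidean term, the windowed term compares local means, variances, and covariance, so it penalizes a prediction whose neighborhood structure differs from the ground truth even when the average per-pixel error is small.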

Information

Published In

Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part V
Sep 2018
829 pages
ISBN: 978-3-030-01227-4
DOI: 10.1007/978-3-030-01228-1

Publisher

Springer-Verlag, Berlin, Heidelberg

Author Tags

1. Crowd counting
2. Crowd density estimation
3. Scale Aggregation Network
4. Local pattern consistency

Cited By

• (2025) Distillation-boosted heterogeneous architecture search for aphid counting. Expert Systems with Applications: An International Journal, 265:C. DOI: 10.1016/j.eswa.2024.125936. Online publication date: 15-Mar-2025
• (2024) Attention Mixture Network for Crowd Counting via Binarization Transfer. Proceedings of the 2nd International Workshop on Multimedia Content Generation and Evaluation: New Methods and Practice, pp. 45-53. DOI: 10.1145/3688867.3690172. Online publication date: 28-Oct-2024
• (2024) Multi-branch progressive embedding network for crowd counting. Image and Vision Computing, 148:C. DOI: 10.1016/j.imavis.2024.105140. Online publication date: 1-Aug-2024
• (2024) MetaUSACC. Expert Systems with Applications: An International Journal, 247:C. DOI: 10.1016/j.eswa.2024.123228. Online publication date: 1-Aug-2024
• (2024) A lightweight dense crowd density estimation network for efficient compression models. Expert Systems with Applications: An International Journal, 238:PD. DOI: 10.1016/j.eswa.2023.122069. Online publication date: 15-Mar-2024
• (2024) AMSA-CAFF Net. Expert Systems with Applications: An International Journal, 237:PC. DOI: 10.1016/j.eswa.2023.121602. Online publication date: 1-Mar-2024
• (2024) Cross-modal misalignment-robust feature fusion for crowd counting. Engineering Applications of Artificial Intelligence, 136:PA. DOI: 10.1016/j.engappai.2024.108898. Online publication date: 1-Oct-2024
• (2024) ABC Easy as 123: A Blind Counter for Exemplar-Free Multi-Class Class-Agnostic Counting. Computer Vision – ECCV 2024, pp. 304-319. DOI: 10.1007/978-3-031-73247-8_18. Online publication date: 29-Sep-2024
• (2024) CountFormer: Multi-view Crowd Counting Transformer. Computer Vision – ECCV 2024, pp. 20-40. DOI: 10.1007/978-3-031-72943-0_2. Online publication date: 29-Sep-2024
• (2023) A Fusion-Based Dense Crowd Counting Method for Multi-Imaging Systems. International Journal of Intelligent Systems, 2023. DOI: 10.1155/2023/6677622. Online publication date: 1-Jan-2023
