Computer Science > Computer Vision and Pattern Recognition

arXiv:2402.17514 (cs)

[Submitted on 27 Feb 2024 (v1), last revised 15 Aug 2024 (this version, v2)]

Title: Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM

Authors: Jia Wan, Qiangqiang Wu, Wei Lin, Antoni B. Chan

Abstract: Existing crowd counting models require extensive training data, which is time-consuming to annotate. To tackle this issue, we propose a simple yet effective crowd counting method that uses the Segment-Everything-Everywhere Model (SEEM), an adaptation of the Segment Anything Model (SAM), to generate pseudo-labels for training crowd counting models. However, our initial investigation reveals that SEEM's performance in dense crowd scenes is limited, primarily because it misses many people in high-density areas. To overcome this limitation, we propose an adaptive resolution SEEM to handle the scale variations, occlusions, and overlapping of people within crowd scenes. Alongside this, we introduce a robust localization method, based on Gaussian Mixture Models, for predicting head positions within the predicted people masks. Given the mask and point pseudo-labels, we propose a robust loss function that excludes uncertain regions based on SEEM's predictions, thereby improving the training of the counting networks. Finally, we propose an iterative method for generating pseudo-labels. This method aims to improve the quality of the segmentation masks by identifying more of the tiny people in high-density regions, who are often missed in the first pseudo-labeling stage. Overall, our proposed method achieves the best unsupervised performance in crowd counting, while also achieving results comparable to some supervised methods. This makes it a highly effective and versatile tool for crowd counting, especially in situations where labeled data is not available.

Comments:	Accepted to ECCV 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2402.17514 [cs.CV]
	(or arXiv:2402.17514v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.17514

Submission history

From: Jia Wan
[v1] Tue, 27 Feb 2024 13:55:17 UTC (23,275 KB)
[v2] Thu, 15 Aug 2024 09:38:42 UTC (8,406 KB)
