Abstract
Face clustering has applications in organizing personal photo albums, video understanding, and automatic labeling of data for semi-supervised learning. Many existing methods cannot cluster millions of faces: they are either too slow, too inaccurate, or require too much memory. In this paper, we propose a two-stage unsupervised clustering algorithm that can cluster millions of faces in minutes. A rough clustering pass using a greedy Transitive Closure (TC) algorithm first separates the easy-to-locate clusters; a more precise non-greedy clustering algorithm then splits these clusters into smaller ones. We also develop a set of omni-supervised transformations that produce multiple embeddings from a single trained model, as if multiple models had been trained. These embeddings are combined using simple averaging and normalization. We carry out extensive experiments on multiple datasets of different sizes, comparing against existing state-of-the-art clustering algorithms, and show that our clustering algorithm is robust to differences between datasets, efficient, and outperforms existing methods. We further analyze the number of singleton clusters and variations of our model that use different non-greedy clustering algorithms. Finally, we train a semi-supervised model using the cluster labels and show that our clustering algorithm is effective for semi-supervised learning.
Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request
References
Radosavovic I, Dollár P, Girshick R, Gkioxari G, He K (2018) Data distillation: towards omni-supervised learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4119–4128
Sarfraz S, Sharma V, Stiefelhagen R (2019) Efficient parameter-free clustering using first neighbor relations. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 8934–8943
Yang L, Zhan X, Chen D, Yan J, Loy CC, Lin D (2019) Learning to cluster faces on an affinity graph. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2298–2306
Wang Z, Zheng L, Li Y, Wang S (2019) Linkage based face clustering via graph convolution network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1117–1125
He K, Gkioxari G, Dollár P, Girshick R (2017) Mask R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969
Iscen A, Tolias G, Avrithis Y, Chum O (2019) Label propagation for deep semi-supervised learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5070–5079
Wang S, Meng J, Yuan J, Tan Y-P (2019) Joint representative selection and feature learning: a semi-supervised approach. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6005–6013
Wu S, Li J, Liu C, Yu Z, Wong H-S (2019) Mutual learning of complementary networks via residual correction for improving semi-supervised classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6500–6509
Li Q, Wu X-M, Liu H, Zhang X, Guan Z (2019) Label efficient semi-supervised learning via graph filtering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9582–9591
Wu S, Deng G, Li J, Li R, Yu Z, Wong H-S (2019) Enhancing TripleGAN for semi-supervised conditional instance synthesis and classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 10091–10100
Yu B, Wu J, Ma J, Zhu Z (2019) Tangent-normal adversarial regularization for semi-supervised learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 10676–10684
Jiang B, Zhang Z, Lin D, Tang J, Luo B (2019) Semi-supervised learning with graph learning-convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 11313–11320
Qiao S, Shen W, Zhang Z, Wang B, Yuille A (2018) Deep co-training for semi-supervised image recognition. In: Proceedings of the European conference on computer vision (ECCV), pp 135–152
Robert T, Thome N, Cord M (2018) Hybridnet: classification and reconstruction cooperation for semi-supervised learning. In: Proceedings of the European conference on computer vision (ECCV), pp 153–169
Chen Y, Zhu X, Gong S (2018) Semi-supervised deep learning with memory. In: Proceedings of the European conference on computer vision (ECCV), pp 268–283
Shi W, Gong Y, Ding C, Ma Z, Tao X, Zheng N (2018) Transductive semi-supervised deep learning using min-max features. In: Proceedings of the European conference on computer vision (ECCV), pp 299–315
Cicek S, Fawzi A, Soatto S (2018) SaaS: speed as a supervisor for semi-supervised learning. In: Proceedings of the European conference on computer vision (ECCV), pp 149–163
Liu Y, Song G, Shao J, Jin X, Wang X (2018) Transductive centroid projection for semi-supervised large-scale recognition. In: Proceedings of the European conference on computer vision (ECCV), pp 70–86
Coelho de Castro D, Nowozin S (2018) From face recognition to models of identity: a Bayesian approach to learning about unknown identities from unsupervised data. In: Proceedings of the European conference on computer vision (ECCV), pp 745–761
Kumar V, Namboodiri A, Jawahar C (2018) Semi-supervised annotation of faces in image collection. Signal Image Video Process 12(1):141–149
Sharma V, Tapaswi M, Sarfraz MS, Stiefelhagen R (2019) Self-supervised learning of face representations for video face clustering. arXiv preprint arXiv:1903.01000
Zhan X, Liu Z, Yan J, Lin D, Change Loy C (2018) Consensus-driven propagation in massive unlabeled data for face recognition. In: Proceedings of the European conference on computer vision (ECCV), pp 568–583
Shen S, Li W, Zhu Z, Huang G, Du D, Lu J, Zhou J (2021) Structure-aware face clustering on a large-scale graph with \(10^7\) nodes. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, IEEE, pp 9085–9094
Nguyen XB, Bui DT, Duong CN, Bui TD, Luu K (2021) Clusformer: a transformer based clustering approach to unsupervised large-scale face and visual landmark recognition. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, IEEE, pp 10847–10856
Zhang K, Zhang Z, Li Z, Qiao Y (2016) Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process Lett 23(10):1499–1503
Whitelam C, Taborsky E, Blanton A, Maze B, Adams J, Miller T, Kalka N, Jain AK, Duncan JA, Allen K, et al. (2017) IARPA Janus Benchmark-B face dataset. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 90–98
Guo Y, Zhang L, Hu Y, He X, Gao J (2016) MS-Celeb-1M: a dataset and benchmark for large-scale face recognition. In: European conference on computer vision, Springer, pp 87–102
Amigó E, Gonzalo J, Artiles J, Verdejo F (2009) A comparison of extrinsic clustering evaluation metrics based on formal constraints. Inf Retriev 12(4):461–486
Zhan X (2019) Implementation of “Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition” (CDP). GitHub
Yang L (2019) Learning to cluster faces on an affinity graph (CVPR 2019). GitHub
Wang Z (2019) Linkage-based face clustering via graph convolution network. GitHub
Yang L, Chen D, Zhan X, Zhao R, Loy CC, Lin D (2020) Learning to cluster faces via confidence and connectivity estimation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13369–13378
Liu Y, Zhang G, Wang H, Zhao W, Zhang M, Qin H (2019) An efficient super-resolution network based on aggregated residual transformations. Electronics 8(3):339
Funding
The authors did not receive support from any organization for the submitted work
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare they have no financial interests
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1: Implementation details of our clustering algorithm
We use the TC clustering algorithm as the backbone of our clustering algorithm. Our method first performs a greedy clustering (TC clustering), then a non-greedy clustering, and lastly propagates cluster labels from labeled embeddings to unlabeled embeddings (a step we developed). The whole algorithm is written entirely in Python, and the code is released at https://github.com/singkuangtan/face-clustering. We used the data from the papers [3] and [4]. All clustering experiments are run on a desktop machine with an Intel Core i7-7700 CPU @ 3.60 GHz, using only one core (without any parallel processing). Our clustering algorithm is simple and can run on most CPUs using only a single core.
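As a rough illustration of the two clustering stages, the following is a minimal sketch in plain Python/NumPy/scikit-learn, not the released code: the function names, the distance threshold, the split size, and the use of k-means as a stand-in for the non-greedy stage are all illustrative assumptions, and the final label-propagation step is omitted.

import numpy as np
from sklearn.cluster import KMeans


def tc_clustering(embeddings, threshold=0.7):
    # Greedy Transitive Closure (TC) clustering: link every pair of embeddings
    # closer than `threshold`, then take connected components via union-find.
    n = len(embeddings)
    parent = np.arange(n)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        dists = np.linalg.norm(embeddings[i + 1:] - embeddings[i], axis=1)
        for j in np.nonzero(dists < threshold)[0] + i + 1:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri
    return np.array([find(i) for i in range(n)])


def two_stage_clustering(embeddings, threshold=0.7, split_size=200, k_split=2):
    # Stage 1: coarse greedy TC clusters.  Stage 2: split every cluster larger
    # than `split_size` with k-means, standing in for the non-greedy stage.
    labels = tc_clustering(embeddings, threshold)
    next_label = labels.max() + 1
    for c in np.unique(labels):
        idx = np.nonzero(labels == c)[0]
        if len(idx) > split_size:
            sub = KMeans(n_clusters=k_split, n_init=10).fit_predict(embeddings[idx])
            for s in range(1, k_split):
                labels[idx[sub == s]] = next_label
                next_label += 1
    return labels


if __name__ == "__main__":
    # Toy check on two synthetic 2-D blobs; we expect two clusters.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(m, 0.1, size=(100, 2)) for m in (0.0, 1.0)])
    print(np.unique(two_stage_clustering(X, threshold=0.5)).size)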
For Table 6, we use embeddings trained with a softmax loss function; for semi-supervised learning (Table 12), we use embeddings trained with the ArcFace loss function.
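The omni-supervised embeddings described in the abstract are combined by simple averaging followed by normalization before clustering; a minimal NumPy sketch of that combination step (the array layout and function name are our illustrative assumptions) is:

import numpy as np


def combine_embeddings(per_transform_embeddings):
    # per_transform_embeddings: shape (num_transforms, num_faces, dim),
    # one embedding per face per omni-supervised transformation.
    mean = per_transform_embeddings.mean(axis=0)          # simple averaging
    norms = np.linalg.norm(mean, axis=1, keepdims=True)   # per-face L2 norm
    return mean / np.clip(norms, 1e-12, None)             # normalization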
Appendix 2: Python functions and packages
Table 13 lists the Python functions and packages used in our experiments.
Appendix 3: Relationship of the distances in four cases
We begin by describing a set of properties,
where \(C_0\) is the set of embedding indices for cluster 0 and, likewise, \(C_1\) is the set of embedding indices for cluster 1. \(n(i,j)=1\) if embeddings i and j are neighbors and 0 otherwise. \(e_i\) (respectively \(e_j\)) is the embedding with index i (respectively j).
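For reference in the four cases below, one plausible way to formalize the relevant distances (our illustrative notation; the exact quantities are assumed rather than quoted) is
\[
d(i,j) = \lVert e_i - e_j \rVert, \qquad
d_{\mathrm{intra}}(C_k) = \max_{i,j \in C_k,\, n(i,j)=1} d(i,j), \qquad
d_{\mathrm{inter}}(C_0,C_1) = \min_{i \in C_0,\, j \in C_1} d(i,j),
\]
where \(d_{\mathrm{intra}}(C_k)\) is the largest distance between neighboring embeddings within cluster k and \(d_{\mathrm{inter}}(C_0,C_1)\) is the smallest distance between the two clusters.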
For Case 0, a greedy clustering algorithm with a large threshold can separate the two clusters. Mathematically, it is
where max denotes the maximum of its two arguments and \(\gg\) means much greater than (by a few times).
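In the illustrative notation introduced above, a plausible form of the Case 0 condition (our reconstruction, not necessarily the paper's exact inequality) is
\[
d_{\mathrm{inter}}(C_0,C_1) \gg \max\bigl(d_{\mathrm{intra}}(C_0),\, d_{\mathrm{intra}}(C_1)\bigr),
\]
so any threshold between the two sides separates the clusters.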
For Case 1, a greedy clustering algorithm can still separate the two clusters, but the gap between the clusters is smaller, and therefore a smaller threshold is used. Mathematically, it is
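A plausible form for Case 1, in the same illustrative notation, is the weaker condition
\[
d_{\mathrm{inter}}(C_0,C_1) > \max\bigl(d_{\mathrm{intra}}(C_0),\, d_{\mathrm{intra}}(C_1)\bigr),
\]
with the two sides now close in value, so only a correspondingly smaller threshold still separates the clusters.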
For Case 2, there is a bridge connecting nearest-neighbor embeddings from the two clusters, so no threshold used with a greedy algorithm can separate them. However, the mean interclass distance is still larger than the mean intraclass distance, which enables the clusters to be separated by a non-greedy clustering algorithm such as k-means. Mathematically, it is
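A plausible formalization of Case 2 (our reconstruction) is
\[
d_{\mathrm{inter}}(C_0,C_1) \le \max\bigl(d_{\mathrm{intra}}(C_0),\, d_{\mathrm{intra}}(C_1)\bigr)
\quad \text{and} \quad
\frac{1}{|C_0|\,|C_1|}\sum_{i \in C_0}\sum_{j \in C_1} d(i,j) \;>\; \frac{1}{|C_k|(|C_k|-1)}\sum_{\substack{i,j \in C_k \\ i \neq j}} d(i,j), \quad k = 0, 1,
\]
i.e., the bridge makes the minimum interclass distance as small as an intraclass neighbor distance, yet the mean interclass distance still exceeds the mean intraclass distance.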
For Case 3, although the singleton cluster 0 is separated from the main cluster 1 by a threshold in the greedy clustering algorithm, the ground-truth class of cluster 0 is the same as that of cluster 1 because the singleton is a random outlier. So the singleton cluster 0 should be merged with cluster 1. Mathematically, it is
where \(\gg\) means much greater (a few times greater).
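With \(C_0 = \{i_0\}\) a singleton, a plausible form of the Case 3 condition (our reconstruction) is
\[
\min_{j \in C_1} d(i_0, j) \gg d_{\mathrm{intra}}(C_1)
\quad \text{while} \quad
\mathrm{class}(i_0) = \mathrm{class}(C_1),
\]
so although the thresholded greedy stage isolates \(e_{i_0}\), the singleton should be merged back into cluster 1.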
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tan, S.K., Wang, X. A novel two-stage omni-supervised face clustering algorithm. Pattern Anal Applic 27, 83 (2024). https://doi.org/10.1007/s10044-024-01298-5
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10044-024-01298-5