Improved local search algorithms for Bregman k-means and its variants

Xiaoyun Tian¹,
Dachuan Xu²,
Longkun Guo ORCID: orcid.org/0000-0003-2891-4253³ &
…
Dan Wu⁴

244 Accesses
Explore all metrics

Abstract

In this paper, we consider the Bregman k-means problem (BKM) which is a variant of the classical k-means problem. For an n-point set \({\mathcal {S}}\) and \(k \le n\) with respect to \(\mu \)-similar Bregman divergence, the BKM problem aims first to find a center subset \(C \subseteq {\mathcal {S}}\) with \( \mid C \mid = k\) and then separate \({\mathcal {S}}\) into k clusters according to C, such that the sum of \(\mu \)-similar Bregman divergence from each point in \({\mathcal {S}}\) to its nearest center is minimized. We propose a \(\mu \)-similar BregMeans++ algorithm by employing the local search scheme, and prove that the algorithm deserves a constant approximation guarantee. Moreover, we extend our algorithm to solve a variant of BKM called noisy \(\mu \)-similar Bregman k-means++ (noisy \(\mu \)-BKM++) which is BKM in the noisy scenario. For the same instance and purpose as BKM, we consider the case of sampling a point with an imprecise probability by a factor between \(1-\varepsilon _1\) and \(1+ \varepsilon _2\) for \(\varepsilon _1 \in [0,1)\) and \(\varepsilon _2 \ge 0\), and obtain an approximation ratio of \(O(\log ^2 k)\) in expectation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Institutional subscriptions

An Improved Bregman k-means++ Algorithm via Local Search

The spherical k-means++ algorithm via local search scheme

Article 20 April 2021

A Semi Brute-Force Search Approach for (Balanced) Clustering

Article 01 August 2023

References

Ackermann M Blömer J (2009) Coresets and approximate clustering for Bregman divergences. In: Proceedings of SODA, pp 1088–1097
Ackermann M, Blömer J (2010) Bregman clustering for separable instances. In: Proceedings of SWAT, pp 212–223
Ackermann M, Blömer J, Sohler C (2010) Clustering for metric and non-metric distance measures. ACM Trans Algorithms 6(4):1–59
Article MathSciNet Google Scholar
Arthur D, Vassilvitskii S (2007) \(k\)-means++: the advantages of careful seeding. In: Proceedings of SODA, pp 1027–1035
Banerjee A, Guo X, Wang H (2005) On the optimality of conditional expectation as a Bregman predictor. IEEE Trans Inf Theory 51(7):2664–2669
Article MathSciNet Google Scholar
Banerjee A, Merugu S, Dhillon I, Ghosh J (2005) Clustering with Bregman divergences. J Mach Learn Res 6:1705–1749
MathSciNet MATH Google Scholar
Belhassena A, Wang H (2019) Trajectory big data processing based on frequent activity. Tsinghua Sci Technol 24(3):317–332
Article Google Scholar
Bhattacharya A, Eube J, Heiko Röglin H, Schmidt MN (2020) Greedy and Not So Greedy \(k\)-means++. In: Proceedings of ESA, pp 18:1-18:21
Bregman L (1967) The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming. USSR Comput Math Math Phys 7:200–217
Article MathSciNet Google Scholar
Censor Y, Lent A (1981) An iterative rowaction method for interval convex programming. J Optim Theory Appl 34(3):321–353
Article MathSciNet Google Scholar
Choo D, Grunau C, Portmann J, Rozhon V (2020) \(k\)-means++: few more steps yield constant approximation In: Proceedings of the 37th International Conference on Machine Learning (ICML), pp 1909–1917
Censor Y, Zenios S (1997) Parallel optimization: theory, algorithms, and applications. Oxford University Press, Oxford
MATH Google Scholar
Feldman D, Schmidt M, Sohler C (2020) Turning big data into tiny data: constant-size coresets for \(k\)-means, PCA and projective clustering. SIAM J Comput 49(3):601–657
Article MathSciNet Google Scholar
Jain A, Dubes R (1988) Algorithms for clustering data. Prentice Hall, New Jersey
MATH Google Scholar
Jain A, Murty M, Flynn P (1999) Data clustering: a review. ACM Comput Surveys 31:264–323
Article Google Scholar
Kanungo T, Mount D, Netanyahu N, Piatko C, Silverman R, Wu A (2002) A local search approximation algorithm for \(k\)-means clustering. In: Proceedings of SoCG, pp 10–18
Kumar A, Sabharwal Y, Sen S. A simple linear time \((1+\varepsilon )\)-approximation algorithm for \(k\)-means clustering in any dimensions. In: Proceedings of FOCS, pp 454–462
Lattanzi S, Sohler C (2019) A better \(k\)-means++ algorithm via local search. In: Proceedings of ICML, pp 3662–3671
Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137
Article MathSciNet Google Scholar
Wang N, Guo G, Wang B, Wang C (2020) Traffic clustering algorithm of urban data brain based on a hybrid-augmented architecture of quantum annealing and brain-inspired cognitive computing. Tsinghua Sci Technol 25(6):813–825
Article Google Scholar

Download references

Acknowledgements

The first two authors are supported by National Natural Science Foundation of China (No. 11871081) and Beijing Natural Science Foundation Project No. Z200002. The third author is supported by National Natural Science Foundation of China (No. 61772005) and Outstanding Youth Innovation Team Project for Universities of Shandong Province (No. 2020KJN008). The fourth author is supported by National Natural Science Foundation of China (No.11701150).

Author information

Authors and Affiliations

Department of Operations Research and Information Engineering, Beijing University of Technology, Beijing, 100124, People’s Republic of China
Xiaoyun Tian
Beijing Institute for Scientific and Engineering Computing, Beijing University of Technology, Beijing, 100124, People’s Republic of China
Dachuan Xu
School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250353, People’s Republic of China
Longkun Guo
School of Mathematics and Statistics, Henan University of Science and Technology, Luoyang, 471023, People’s Republic of China
Dan Wu

Authors

Xiaoyun Tian
View author publications
You can also search for this author in PubMed Google Scholar
Dachuan Xu
View author publications
You can also search for this author in PubMed Google Scholar
Longkun Guo
View author publications
You can also search for this author in PubMed Google Scholar
Dan Wu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Longkun Guo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this paper appeared in Proceedings of the 26th International Computing and Combinatorics Conference, pp. 532–541, 2020.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Tian, X., Xu, D., Guo, L. et al. Improved local search algorithms for Bregman k-means and its variants. J Comb Optim 44, 2533–2550 (2022). https://doi.org/10.1007/s10878-021-00771-9

Download citation

Accepted: 09 June 2021
Published: 28 June 2021
Issue Date: November 2022
DOI: https://doi.org/10.1007/s10878-021-00771-9

Improved local search algorithms for Bregman k-means and its variants

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

An Improved Bregman k-means++ Algorithm via Local Search

The spherical k-means++ algorithm via local search scheme

A Semi Brute-Force Search Approach for (Balanced) Clustering

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Improved local search algorithms for Bregman k-means and its variants

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

An Improved Bregman k-means++ Algorithm via Local Search

The spherical k-means++ algorithm via local search scheme

A Semi Brute-Force Search Approach for (Balanced) Clustering

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now