[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ Skip to main content
Log in

Improved local search algorithms for Bregman k-means and its variants

  • Published:
Journal of Combinatorial Optimization Aims and scope Submit manuscript

Abstract

In this paper, we consider the Bregman k-means problem (BKM) which is a variant of the classical k-means problem. For an n-point set \({\mathcal {S}}\) and \(k \le n\) with respect to \(\mu \)-similar Bregman divergence, the BKM problem aims first to find a center subset \(C \subseteq {\mathcal {S}}\) with \( \mid C \mid = k\) and then separate \({\mathcal {S}}\) into k clusters according to C, such that the sum of \(\mu \)-similar Bregman divergence from each point in \({\mathcal {S}}\) to its nearest center is minimized. We propose a \(\mu \)-similar BregMeans++ algorithm by employing the local search scheme, and prove that the algorithm deserves a constant approximation guarantee. Moreover, we extend our algorithm to solve a variant of BKM called noisy \(\mu \)-similar Bregman k-means++ (noisy \(\mu \)-BKM++) which is BKM in the noisy scenario. For the same instance and purpose as BKM, we consider the case of sampling a point with an imprecise probability by a factor between \(1-\varepsilon _1\) and \(1+ \varepsilon _2\) for \(\varepsilon _1 \in [0,1)\) and \(\varepsilon _2 \ge 0\), and obtain an approximation ratio of \(O(\log ^2 k)\) in expectation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
£29.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Ackermann M Blömer J (2009) Coresets and approximate clustering for Bregman divergences. In: Proceedings of SODA, pp 1088–1097

  • Ackermann M, Blömer J (2010) Bregman clustering for separable instances. In: Proceedings of SWAT, pp 212–223

  • Ackermann M, Blömer J, Sohler C (2010) Clustering for metric and non-metric distance measures. ACM Trans Algorithms 6(4):1–59

    Article  MathSciNet  Google Scholar 

  • Arthur D, Vassilvitskii S (2007) \(k\)-means++: the advantages of careful seeding. In: Proceedings of SODA, pp 1027–1035

  • Banerjee A, Guo X, Wang H (2005) On the optimality of conditional expectation as a Bregman predictor. IEEE Trans Inf Theory 51(7):2664–2669

    Article  MathSciNet  Google Scholar 

  • Banerjee A, Merugu S, Dhillon I, Ghosh J (2005) Clustering with Bregman divergences. J Mach Learn Res 6:1705–1749

    MathSciNet  MATH  Google Scholar 

  • Belhassena A, Wang H (2019) Trajectory big data processing based on frequent activity. Tsinghua Sci Technol 24(3):317–332

    Article  Google Scholar 

  • Bhattacharya A, Eube J, Heiko Röglin H, Schmidt MN (2020) Greedy and Not So Greedy \(k\)-means++. In: Proceedings of ESA, pp 18:1-18:21

  • Bregman L (1967) The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming. USSR Comput Math Math Phys 7:200–217

    Article  MathSciNet  Google Scholar 

  • Censor Y, Lent A (1981) An iterative rowaction method for interval convex programming. J Optim Theory Appl 34(3):321–353

    Article  MathSciNet  Google Scholar 

  • Choo D, Grunau C, Portmann J, Rozhon V (2020) \(k\)-means++: few more steps yield constant approximation In: Proceedings of the 37th International Conference on Machine Learning (ICML), pp 1909–1917

  • Censor Y, Zenios S (1997) Parallel optimization: theory, algorithms, and applications. Oxford University Press, Oxford

    MATH  Google Scholar 

  • Feldman D, Schmidt M, Sohler C (2020) Turning big data into tiny data: constant-size coresets for \(k\)-means, PCA and projective clustering. SIAM J Comput 49(3):601–657

    Article  MathSciNet  Google Scholar 

  • Jain A, Dubes R (1988) Algorithms for clustering data. Prentice Hall, New Jersey

    MATH  Google Scholar 

  • Jain A, Murty M, Flynn P (1999) Data clustering: a review. ACM Comput Surveys 31:264–323

    Article  Google Scholar 

  • Kanungo T, Mount D, Netanyahu N, Piatko C, Silverman R, Wu A (2002) A local search approximation algorithm for \(k\)-means clustering. In: Proceedings of SoCG, pp 10–18

  • Kumar A, Sabharwal Y, Sen S. A simple linear time \((1+\varepsilon )\)-approximation algorithm for \(k\)-means clustering in any dimensions. In: Proceedings of FOCS, pp 454–462

  • Lattanzi S, Sohler C (2019) A better \(k\)-means++ algorithm via local search. In: Proceedings of ICML, pp 3662–3671

  • Lloyd S (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28(2):129–137

    Article  MathSciNet  Google Scholar 

  • Wang N, Guo G, Wang B, Wang C (2020) Traffic clustering algorithm of urban data brain based on a hybrid-augmented architecture of quantum annealing and brain-inspired cognitive computing. Tsinghua Sci Technol 25(6):813–825

    Article  Google Scholar 

Download references

Acknowledgements

The first two authors are supported by National Natural Science Foundation of China (No. 11871081) and Beijing Natural Science Foundation Project No. Z200002. The third author is supported by National Natural Science Foundation of China (No. 61772005) and Outstanding Youth Innovation Team Project for Universities of Shandong Province (No. 2020KJN008). The fourth author is supported by National Natural Science Foundation of China (No.11701150).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Longkun Guo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A preliminary version of this paper appeared in Proceedings of the 26th International Computing and Combinatorics Conference, pp. 532–541, 2020.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Tian, X., Xu, D., Guo, L. et al. Improved local search algorithms for Bregman k-means and its variants. J Comb Optim 44, 2533–2550 (2022). https://doi.org/10.1007/s10878-021-00771-9

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10878-021-00771-9

Keywords

Navigation