
Stochastic Backward Euler: An Implicit Gradient Descent Algorithm for k-Means Clustering

Journal of Scientific Computing

Abstract

In this paper, we propose an implicit gradient descent algorithm for the classic k-means problem. The implicit gradient step, or backward Euler step, is solved via a stochastic fixed-point iteration in which a mini-batch gradient is sampled at random in every iteration. The average of the fixed-point trajectory is then carried over to the next gradient step. We draw connections between the proposed stochastic backward Euler and the recent entropy stochastic gradient descent (Entropy-SGD) for improving the training of deep neural networks. Numerical experiments on various synthetic and real datasets show that the proposed algorithm yields better clustering results than standard k-means algorithms, in the sense that it attains lower values of the objective function (the clustering energy) and is much more robust to initialization.
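As a rough illustration of the idea described above, and not the authors' implementation, the Python sketch below approximates one implicit (backward Euler) gradient step for the k-means energy by a stochastic fixed-point iteration and carries over the average of the inner trajectory. All function names and the values of step, n_inner, and batch_size are illustrative assumptions.

```python
import numpy as np

def minibatch_kmeans_grad(centroids, batch):
    """Mini-batch gradient of the k-means energy with respect to the centroids.

    Each point in the batch contributes to the centroid it is currently
    closest to, i.e. the gradient of (1/|B|) * sum_i ||x_i - c_{a(i)}||^2.
    """
    k = centroids.shape[0]
    # Squared distances between batch points and centroids: shape (|B|, k).
    dists = ((batch[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    assign = dists.argmin(axis=1)
    grad = np.zeros_like(centroids)
    for j in range(k):
        members = batch[assign == j]
        if len(members) > 0:
            grad[j] = 2.0 * (len(members) * centroids[j] - members.sum(axis=0)) / len(batch)
    return grad

def stochastic_backward_euler_step(data, centroids, step=1.0, n_inner=20,
                                   batch_size=64, rng=None):
    """One implicit (backward Euler) step c_new = c - step * grad(c_new),
    solved approximately by a stochastic fixed-point iteration.

    The inner loop applies y <- c - step * grad(y) with a fresh mini-batch
    each time; the average of the inner trajectory is returned as the new
    centroid iterate, mirroring the averaging described in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    c = centroids
    y = c.copy()
    y_avg = np.zeros_like(c)
    for _ in range(n_inner):
        idx = rng.choice(len(data), size=min(batch_size, len(data)), replace=False)
        y = c - step * minibatch_kmeans_grad(y, data[idx])
        y_avg += y
    return y_avg / n_inner

# Illustrative usage: a few outer steps on synthetic 2-D data.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = np.vstack([rng.normal(m, 0.3, size=(200, 2)) for m in (-2.0, 0.0, 2.0)])
    centroids = data[rng.choice(len(data), size=3, replace=False)]
    for _ in range(50):
        centroids = stochastic_backward_euler_step(data, centroids, rng=rng)
    print(centroids)
```

The inner map y -> c - step * grad(y) is the fixed-point map of the implicit update; replacing the full gradient by a mini-batch gradient and averaging the iterates is what the abstract refers to as the stochastic fixed-point iteration.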




Acknowledgements

This work was partially supported by AFOSR Grant FA9550-15-1-0073 and ONR Grant N00014-16-1-2157. We would like to thank Dr. Bao Wang for helpful discussions. We also thank the anonymous reviewers for their constructive comments.

Author information


Corresponding author

Correspondence to Penghang Yin.

About this article


Cite this article

Yin, P., Pham, M., Oberman, A. et al. Stochastic Backward Euler: An Implicit Gradient Descent Algorithm for k-Means Clustering. J Sci Comput 77, 1133–1146 (2018). https://doi.org/10.1007/s10915-018-0744-4


