Abstract
Clustering is an important problem in data mining. It can be formulated as a nonsmooth, nonconvex optimization problem. For the most global optimization techniques this problem is challenging even in medium size data sets. In this paper, we propose an approach that allows one to apply local methods of smooth optimization to solve the clustering problems. We apply an incremental approach to generate starting points for cluster centers which enables us to deal with nonconvexity of the problem. The hyperbolic smoothing technique is applied to handle nonsmoothness of the clustering problems and to make it possible application of smooth optimization algorithms to solve them. Results of numerical experiments with eleven real-world data sets and the comparison with state-of-the-art incremental clustering algorithms demonstrate that the smooth optimization algorithms in combination with the incremental approach are powerful alternative to existing clustering algorithms.
Similar content being viewed by others
References
Al-Sultan, K.S.: A tabu search approach to the clustering problem. Pattern Recognit. 28(9), 1443–1451 (1995)
Bache, K., Lichman, M.: UCI Machine Learning Repository. School of Information and Computer Science, University of California, Irvine, CA (2013). http://archive.ics.uci.edu/ml.
Bagirov, A.M.: Modified global \(k\)-means algorithm for sum-of-squares clustering problems. Pattern Recognit. 41(10), 3192–3199 (2008)
Bagirov, A.M., Karasozen, B., Sezer, M.: Discrete gradient method: A derivative free method for nonsmooth optimization. J. Optim. Theory Appl. 137, 317–334 (2008)
Bagirov, A.M., Rubinov, A.M., Yearwood, J.: A global optimisation approach to classification. Optim. Eng. 3(2), 129–155 (2002)
Bagirov, A.M., Rubinov, A.M., Soukhoroukova, N.V., Yearwood, J.: Supervised and unsupervised data classification via nonsmooth and global optimization. TOP: Span. Oper. Res. J. 11(1), 1–93 (2003)
Bagirov, A.M., Ugon, J.: An algorithm for minimizing clustering functions. Optimization 54(4–5), 351–368 (2005)
Bagirov, A.M., Ugon, J., Webb, D.: Fast modified global \(k\)-means algorithm for sum-of-squares clustering problems. Pattern Recognit. 44, 866–876 (2011)
Bagirov, A.M., Yearwood, J.: A new nonsmooth optimization algorithm for minimum sum-of-squares clustering problems. Eur. J. Oper. Res. 170, 578–596 (2006)
Bagirov, A.M., Al Nuaimat, A., Sultanova, N.: Hyperbolic smoothing method for minimax problems. Optimization 62(6), 759–782 (2013)
Bock, H.H.: Clustering and neural networks. In: Rizzi, A., Vichi, M., Bock, H.H. (eds.) Advances in Data Science and Classification, pp. 265–277. Springer, Berlin (1998)
Brown, D.E., Entail, C.L.: A practical application of simulated annealing to the clustering problem. Pattern Recognit. 25, 401–412 (1992)
Diehr, G.: Evaluation of a branch and bound algorithm for clustering. SIAM J. Sci. Stat. Comput. 6, 268–284 (1985)
Dubes, R., Jain, A.K.: Clustering techniques: the user’s dilemma. Pattern Recognit. 8, 247–260 (1976)
GAMS: The Solver Manuals. GAMS Development Corporation, Washington, D.C. (2012)
Hansen, P., Jaumard, B.: Cluster analysis and mathematical programming. Math. Programm. 79(1–3), 191–215 (1997)
Hansen, P., Mladenovic, N.: \(J\)-means: a new heuristic for minimum sum-of-squares clustering. Pattern Recognit. 4, 405–413 (2001)
Hansen, P., Mladenovic, N.: Variable neighborhood decomposition search. J. Heuristic. 7, 335–350 (2001)
Koontz, W.L.G., Narendra, P.M., Fukunaga, K.: A branch and bound clustering algorithm. IEEE Trans. Comput. 24, 908–915 (1975)
Jensen, R.E.: A dynamic programming algorithm for cluster analysis. Appl. Stat. 28, 1034–1057 (1969)
Likas, A., Vlassis, M., Verbeek, J.: The global \(k\)-means clustering algorithm. Pattern Recognit. 36, 451–461 (2003)
du Merle, O., Hansen, P., Jaumard, B., Mladenovic, N.: An interior point method for minimum sum-of-squares clustering. SIAM J. Sci. Comput. 21, 1485–1505 (2001)
Reinelt, G.: TSP-LIB-A traveling salesman library. ORSA J. Comput. 3, 319–350 (1991)
Selim, S.Z., Al-Sultan, K.S.: A simulated annealing algorithm for the clustering. Pattern Recognit. 24(10), 1003–1008 (1991)
Spath, H.: Cluster Analysis Algorithms. Ellis Horwood Limited, Chichester (1980)
Sun, L.X., Xie, Y.L., Song, X.H., Wang, J.H., Yu, R.Q.: Cluster analysis by simulated annealing. Comput. Chem. 18, 103–108 (1994)
Teboulle, M.: A unified continuous optimization framework for center-based clustering methods. J. Mach. Learn. Res. 8, 65–102 (2007)
Xavier, A.E.: The hyperbolic smoothing clustering method. Pattern Recognit. 43(3), 731–737 (2010)
Xavier, A.E., Oliveira, A.A.F.D.: Optimal covering of plane domains by circles via hyperbolic smoothing. J. Glob. Optim. 31(3), 493–504 (2005)
Xavier, A.E., Xavier, V.L.: Solving the minimum sum-of-squares clustering problem by hyperbolic smoothing and partition into boundary and gravitational regions. Pattern Recognit. 44(1), 70–77 (2011)
Acknowledgments
Dr. Burak Ordin acknowledges TUBITAK for its support of his visit to the University of Ballarat, Australia. This research by A. M. Bagirov was supported under Australian Research Council’s Discovery Projects funding scheme (Project No. DP140103213). We are grateful to two anonymous referees for their comments and criticism that helped the authors to significantly improve the quality of the paper.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Bagirov, A.M., Ordin, B., Ozturk, G. et al. An incremental clustering algorithm based on hyperbolic smoothing. Comput Optim Appl 61, 219–241 (2015). https://doi.org/10.1007/s10589-014-9711-7
Received:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10589-014-9711-7