Abstract
Owing to sparseness, directly clustering high-dimensional data remains a challenging problem. Obtaining a compact low-dimensional representation through dimensionality reduction is therefore an effective approach to clustering such data. Most existing dimensionality reduction methods, however, were originally developed for classification (such as Linear Discriminant Analysis) or for recovering the geometric structure, i.e., the manifold, of high-dimensional data (such as Locally Linear Embedding) rather than for clustering. Hence, a novel nonlinear discriminant clustering method based on dimensionality reduction with spectral regularization is proposed. The contributions of the proposed method are twofold: (1) it obtains a nonlinear low-dimensional representation that recovers the intrinsic manifold structure and enhances the cluster structure of the original high-dimensional data; (2) the clustering results are obtained within the dimensionality reduction procedure itself. First, the desired low-dimensional coordinates are represented as linear combinations of predefined smooth vectors with respect to the data manifold, which is characterized by a weighted graph. Then, the optimal combination coefficients and the optimal cluster assignment matrix are computed simultaneously by maximizing the ratio between the between-cluster scatter and the total scatter while preserving the smoothness of the cluster assignment matrix with respect to the data manifold. Finally, the optimization problem is solved by an iterative procedure, which is proved to be convergent. Experiments on UCI data sets and real-world data sets demonstrate the effectiveness of the proposed method for both clustering and visualizing high-dimensional data.
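For concreteness, the following is a minimal Python sketch of the pipeline the abstract describes, under several assumptions not stated here: a k-nearest-neighbor heat-kernel graph characterizes the data manifold, the eigenvectors of its normalized Laplacian with the smallest eigenvalues serve as the predefined smooth vectors, and a simple generalized-eigenvalue step alternated with k-means stands in for the paper's joint trace-ratio optimization with the spectral-regularization term. All function names and parameters (build_smooth_basis, n_basis, sigma, and so on) are illustrative, not taken from the paper.

```python
# Schematic sketch only; hyperparameters and helper names are hypothetical.
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import KMeans

def build_smooth_basis(X, n_neighbors=10, n_basis=20, sigma=1.0):
    """Smooth vectors w.r.t. the data manifold: eigenvectors of the
    normalized graph Laplacian with the smallest eigenvalues."""
    W = kneighbors_graph(X, n_neighbors, mode='distance', include_self=False).toarray()
    W[W > 0] = np.exp(-W[W > 0] ** 2 / (2 * sigma ** 2))  # heat-kernel weights
    W = np.maximum(W, W.T)                                 # symmetrize the graph
    L = laplacian(W, normed=True)
    _, eigvecs = np.linalg.eigh(L)                         # ascending eigenvalues
    return eigvecs[:, :n_basis]                            # columns = smooth vectors

def discriminant_clustering(X, n_clusters, n_dims=5, n_iter=20):
    """Alternate between (1) fitting combination coefficients that maximize a
    between-cluster / total scatter ratio for the current labels and
    (2) re-clustering in the resulting low-dimensional embedding."""
    U = build_smooth_basis(X)                              # n x m smooth basis
    labels = KMeans(n_clusters, n_init=10).fit_predict(U[:, :n_dims])
    for _ in range(n_iter):
        mean_all = U.mean(axis=0)
        Sb = np.zeros((U.shape[1], U.shape[1]))            # between-cluster scatter
        for c in range(n_clusters):
            Uc = U[labels == c]
            d = (Uc.mean(axis=0) - mean_all)[:, None]
            Sb += Uc.shape[0] * d @ d.T
        St = (U - mean_all).T @ (U - mean_all) + 1e-6 * np.eye(U.shape[1])  # total scatter
        # Combination coefficients: leading generalized eigenvectors of (Sb, St).
        eigvals, A = np.linalg.eig(np.linalg.solve(St, Sb))
        A = np.real(A[:, np.argsort(-np.real(eigvals))[:n_dims]])
        Y = U @ A                                          # nonlinear low-dim embedding
        new_labels = KMeans(n_clusters, n_init=10).fit_predict(Y)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return Y, labels

# Example usage on synthetic data (purely illustrative):
# X = np.random.rand(300, 50)
# Y, labels = discriminant_clustering(X, n_clusters=3)
```

Because the embedding is restricted to the span of the smooth basis, the number of basis vectors trades off smoothness with respect to the manifold against discriminative flexibility; the paper's actual objective additionally regularizes the cluster assignment matrix itself, which this sketch omits.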
Acknowledgments
This work is supported by the Natural Science Foundation of China (Grants 60970034 and 60603015) and the Foundation for the Author of National Excellent Doctoral Dissertation (Grant No. 2007B4).
Cite this article
Zhan, Y., Yin, J. & Liu, X. Nonlinear discriminant clustering based on spectral regularization. Neural Comput & Applic 22, 1599–1608 (2013). https://doi.org/10.1007/s00521-012-0929-y