Abstract
Data generated by human activity are growing rapidly in volume, and grouping them into appropriate clusters is an important yet challenging problem. Matrix factorization techniques such as nonnegative matrix factorization and concept factorization have proven effective for this task. A recent state-of-the-art method, locality-constrained concept factorization, imposes a locality constraint that only requires each concept to be as close to the original data points as possible, and therefore does not fully reveal the intrinsic data structure. To address this issue, we present a graph-based local concept coordinate factorization (GLCF) method, which respects the intrinsic structure of the data through manifold kernel learning in the warped Reproducing Kernel Hilbert Space. In addition, a generalized update algorithm is developed to handle data matrices containing both positive and negative entries. Because GLCF builds on local coordinate coding and concept factorization, it inherits their advantageous properties, such as locality and sparsity of the data representation. Moreover, it better encodes the local geometric structure via the graph Laplacian in the manifold adaptive kernel. As a result, a more compact and better structured representation can be obtained in the low-dimensional data space. Extensive experiments on several image and gene expression databases suggest that the proposed method compares favorably with several alternative methods.
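To make the ingredients named in the abstract concrete, the following is a minimal sketch and not the GLCF algorithm itself. It assumes a linear base kernel warped by a k-nearest-neighbor graph Laplacian in the standard closed form K - K(I + γLK)^{-1}γLK, and it runs plain concept factorization X ≈ XWV^T with sign-splitting multiplicative updates in the style of convex NMF, so that a Gram matrix with negative entries can be handled. The local coordinate constraint of GLCF is omitted, and all function names, the neighborhood size `k`, and the weight `gamma` are illustrative assumptions.

```python
import numpy as np


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - S of a symmetrized 0/1 k-NN graph.

    X holds one sample per column (features x samples).
    """
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(dist, np.inf)          # exclude self-neighbours
    nbrs = np.argsort(dist, axis=1)[:, :k]  # k nearest neighbours per sample
    S = np.zeros((n, n))
    S[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    S = np.maximum(S, S.T)                  # symmetrize the adjacency
    return np.diag(S.sum(axis=1)) - S


def warp_kernel(K, L, gamma=0.1):
    """Graph-warped Gram matrix K - K (I + gamma L K)^{-1} gamma L K.

    gamma is an assumed trade-off weight, not a value from the paper.
    """
    M = gamma * L
    return K - K @ np.linalg.solve(np.eye(K.shape[0]) + M @ K, M @ K)


def cf_objective(K, W, V):
    """||Phi - Phi W V^T||_F^2 expressed through the Gram matrix K."""
    return (np.trace(K) - 2.0 * np.trace(V.T @ K @ W)
            + np.trace(W.T @ K @ W @ V.T @ V))


def concept_factorization(K, n_concepts, n_iter=200, eps=1e-10, seed=0):
    """Multiplicative updates for X ~ X W V^T with W, V >= 0.

    K may contain negative entries; it is split as K = Kp - Kn with
    Kp, Kn >= 0 so that every factor in the updates stays nonnegative.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    W = rng.random((n, n_concepts))
    V = rng.random((n, n_concepts))
    Kp = (np.abs(K) + K) / 2.0
    Kn = (np.abs(K) - K) / 2.0
    history = [cf_objective(K, W, V)]
    for _ in range(n_iter):
        VtV = V.T @ V
        W *= np.sqrt((Kp @ V + Kn @ W @ VtV + eps)
                     / (Kn @ V + Kp @ W @ VtV + eps))
        V *= np.sqrt((Kp @ W + V @ (W.T @ Kn @ W) + eps)
                     / (Kn @ W + V @ (W.T @ Kp @ W) + eps))
        history.append(cf_objective(K, W, V))
    return W, V, history
```

In this sketch, a cluster label for sample i could be read off as the index of the largest entry in row i of `V`. Working on the Gram matrix rather than on the data matrix is what lets the graph-warped kernel replace the plain inner product without changing the update rules.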
Acknowledgments
We thank the anonymous reviewers for their valuable comments and suggestions, which greatly improved the quality of the paper. This work was supported in part by the National Natural Science Foundation of China under Grants 91120302, 61222207, 61173185, and 61173186, the National Basic Research Program of China (973 Program) under Grant 2013CB336500, the Fundamental Research Funds for the Central Universities under Grant 2012FZA5017, and the Zhejiang Province Key S&T Innovation Group Project under Grant 2009R50009.
Cite this article
Li, P., Bu, J., Zhang, L. et al. Graph-based local concept coordinate factorization. Knowl Inf Syst 43, 103–126 (2015). https://doi.org/10.1007/s10115-013-0715-x
DOI: https://doi.org/10.1007/s10115-013-0715-x