Abstract
Unsupervised feature selection aims to select a subset of informative features from unlabeled data, which mitigates the curse of dimensionality and makes computation and storage more efficient. In this paper, we integrate the feature-level self-representation property, a low-rank constraint, a hypergraph regularizer, and a sparsity-inducing regularizer (i.e., an \(\ell _{2,1}\)-norm regularizer) in a unified framework to conduct unsupervised feature selection. Specifically, we represent each feature by the other features, so that the feature-level self-representation property can be used to rank the importance of features. We then embed a low-rank constraint to consider the relations among features, and a hypergraph regularizer to consider both the high-order relations and the local structure of the samples. Finally, we use an \(\ell _{2,1}\)-norm regularizer to induce row sparsity, so that the model outputs the informative features that satisfy the above constraints. The resulting feature selection model thus takes into account both the global structure of the samples (via the low-rank constraint) and the local structure of the data (via the hypergraph regularizer), whereas previous studies considered only one of the two. This makes the proposed model more robust than previous models, because the resulting feature selection is more stable. Experimental results on benchmark datasets showed that, compared to state-of-the-art methods, the proposed method effectively selected the most informative features by removing the adverse effect of redundant and noisy features.
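To make the ranking step concrete, the following is a minimal sketch of feature-level self-representation with an \(\ell _{2,1}\)-norm regularizer only. It is not the model proposed in this paper: the low-rank constraint and the hypergraph regularizer are omitted, and the solver (iteratively reweighted least squares), the function names, and the parameters `lam`, `n_iter`, and `eps` are assumptions made for illustration. Each feature is reconstructed from all features through a coefficient matrix `W`, and a feature is scored by the Euclidean norm of its row of `W`.

```python
import numpy as np


def l21_norm(W):
    """l2,1-norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def self_representation_scores(X, lam=1.0, n_iter=30, eps=1e-8):
    """Approximately minimise ||X - X W||_F^2 + lam * ||W||_{2,1}.

    X has shape (n_samples, n_features). Returns (scores, W), where
    scores[i] is the Euclidean norm of row i of W. Illustrative only:
    no low-rank or hypergraph term is included.
    """
    n_features = X.shape[1]
    gram = X.T @ X
    d = np.ones(n_features)  # row weights of the reweighted problem
    for _ in range(n_iter):
        # For fixed weights the minimiser solves (X^T X + lam * D) W = X^T X.
        W = np.linalg.solve(gram + lam * np.diag(d), gram)
        row_norms = np.linalg.norm(W, axis=1)
        # Standard reweighting for the l2,1-norm: d_i = 1 / (2 * ||w_i||_2).
        d = 1.0 / (2.0 * np.maximum(row_norms, eps))
    return np.linalg.norm(W, axis=1), W


# Usage on synthetic data (hypothetical, not one of the benchmark datasets).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
scores, W = self_representation_scores(X, lam=0.5)
top_k = np.argsort(scores)[::-1][:3]  # indices of the 3 highest-scoring features
print(top_k)
```

Rows of `W` that are driven towards zero correspond to features that contribute little to reconstructing the others, which is why the row norm can serve as an importance score in this simplified setting.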
Notes
1. Available at http://www.csie.ntu.edu.tw/~cjlin/libsvm/.
2. Available at http://see.xidian.edu.cn/vipsl/database_Face.html.
3. Available at http://archive.ics.uci.edu/ml/.
Acknowledgement
This work was supported in part by the China “1000-Plan” National Distinguished Professorship; the National Natural Science Foundation of China (Grant Nos. 61263035, 61573270, and 61672177); the China 973 Program (Grant No. 2013CB329404); the China Key Research Program (Grant No. 2016YFB1000905); the Guangxi Natural Science Foundation (Grant No. 2015GXNSFCB139011); the China Postdoctoral Science Foundation (Grant No. 2015M570837); the Innovation Project of Guangxi Graduate Education (Grant No. YCSZ2016046); the Guangxi High Institutions’ Program of Introducing 100 High-Level Overseas Talents; the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing; the Guangxi “Bagui” Teams for Innovation and Research; and the project “Application and Research of Big Data Fusion in Inter-City Traffic Integration of the Xijiang River - Pearl River Economic Belt” (da shu jv rong he zai xijiang zhujiang jing ji dai cheng ji jiao tong yi ti hua zhong de ying yong yu yan jiu).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
He, W., Zhu, X., Li, Y., Hu, R., Zhu, Y., Zhang, S. (2016). Unsupervised Hypergraph Feature Selection with Low-Rank and Self-Representation Constraints. In: Li, J., Li, X., Wang, S., Li, J., Sheng, Q. (eds) Advanced Data Mining and Applications. ADMA 2016. Lecture Notes in Computer Science, vol 10086. Springer, Cham. https://doi.org/10.1007/978-3-319-49586-6_12
DOI: https://doi.org/10.1007/978-3-319-49586-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49585-9
Online ISBN: 978-3-319-49586-6
eBook Packages: Computer Science, Computer Science (R0)