Abstract
The massive growth of data in recent years has led to challenges in data mining and machine learning tasks. One of the major challenges is selecting, from the original set of available features, the relevant features that maximally improve learning performance over that of the original feature set. This issue has attracted researchers’ attention, resulting in a variety of successful feature selection approaches in the literature. Although several surveys on unsupervised learning (e.g., clustering) exist, they omit many works on unsupervised feature selection (e.g., evolutionary computation based feature selection for clustering), so the strengths and weaknesses of those approaches have not been identified. In this paper, we present a comprehensive survey of feature selection approaches for clustering, reflecting the advantages and disadvantages of current approaches from different perspectives and identifying promising trends for future research.
Cite this article
Hancer, E., Xue, B. & Zhang, M. A survey on feature selection approaches for clustering. Artif Intell Rev 53, 4519–4545 (2020). https://doi.org/10.1007/s10462-019-09800-w
DOI: https://doi.org/10.1007/s10462-019-09800-w