A survey on feature selection approaches for clustering

Artificial Intelligence Review

Abstract

The massive growth of data in recent years has led to challenges in data mining and machine learning tasks. One of the major challenges is selecting, from the original set of available features, the relevant features that maximally improve learning performance over that of the original feature set. This issue has attracted researchers’ attention, resulting in a variety of successful feature selection approaches in the literature. Although several surveys on unsupervised learning (e.g., clustering) exist, they omit many works on unsupervised feature selection (e.g., evolutionary computation based feature selection for clustering) and so do not identify the strengths and weaknesses of those approaches. In this paper, we present a comprehensive survey on feature selection approaches for clustering, reflecting the advantages and disadvantages of current approaches from different perspectives and identifying promising trends for future research.

Author information

Corresponding author

Correspondence to Emrah Hancer.

About this article

Cite this article

Hancer, E., Xue, B. & Zhang, M. A survey on feature selection approaches for clustering. Artif Intell Rev 53, 4519–4545 (2020). https://doi.org/10.1007/s10462-019-09800-w

  • DOI: https://doi.org/10.1007/s10462-019-09800-w
