Abstract
Feature selection is an important issue in classification, but it is a difficult task because of the large search space and interactions between features. Statistical clustering methods, which take feature interaction into account, group features into feature clusters. This paper investigates the use of statistical clustering information in particle swarm optimisation (PSO) for feature selection. Two PSO-based feature selection algorithms are proposed that select a feature subset using the statistical clustering information. The new algorithms are examined and compared with a greedy forward feature selection algorithm on seven benchmark datasets. The results show that the two algorithms can select far fewer features while achieving similar or better classification performance than using all features. The new algorithm that introduces more stochasticity achieves the best results and outperforms all other methods, especially on the datasets with a relatively large number of features.
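The abstract does not specify how the two algorithms use the clustering information, so the following is only a minimal sketch of one plausible way to combine feature clusters with PSO, not the paper's method. It assumes, purely for illustration, that exactly one feature is chosen from each cluster and that each particle holds one value in [0, 1) per cluster. The names `decode`, `pso_select`, `toy_fitness`, the cluster contents, and all parameter values are hypothetical; a real wrapper would use a classifier's accuracy as the fitness.

```python
import random

def decode(position, clusters):
    """Map one value in [0, 1) per cluster to one feature index from that cluster."""
    chosen = []
    for x, cluster in zip(position, clusters):
        idx = min(int(x * len(cluster)), len(cluster) - 1)
        chosen.append(cluster[idx])
    return chosen

def pso_select(clusters, fitness, n_particles=10, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard continuous PSO with inertia weight w; higher fitness is better."""
    rng = random.Random(seed)
    dim = len(clusters)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(decode(p, clusters)) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the position inside [0, 1) so it always decodes to a feature.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], 0.0), 0.999999)
            fit = fitness(decode(pos[i], clusters))
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return decode(gbest, clusters), gbest_fit

# Hypothetical clusters of feature indices (in practice, from statistical clustering).
clusters = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]

# Toy stand-in for classification accuracy: counts features from a made-up "useful" set.
USEFUL = {0, 4, 7}
def toy_fitness(features):
    return sum(1 for f in features if f in USEFUL)

selected, score = pso_select(clusters, toy_fitness)
print(selected, score)
```

Encoding one dimension per cluster rather than one per feature shrinks the search space from the number of features to the number of clusters, which is one way clustering information could help; whether the proposed algorithms do this is not stated in the abstract.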
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Lane, M.C., Xue, B., Liu, I., Zhang, M. (2013). Particle Swarm Optimisation and Statistical Clustering for Feature Selection. In: Cranefield, S., Nayak, A. (eds) AI 2013: Advances in Artificial Intelligence. AI 2013. Lecture Notes in Computer Science, vol 8272. Springer, Cham. https://doi.org/10.1007/978-3-319-03680-9_23
DOI: https://doi.org/10.1007/978-3-319-03680-9_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03679-3
Online ISBN: 978-3-319-03680-9
eBook Packages: Computer Science, Computer Science (R0)