Abstract
This paper addresses the clustering of binary data with feature selection within the maximum likelihood (ML) and classification maximum likelihood (CML) frameworks. To perform clustering with feature selection efficiently, we propose an appropriate Bernoulli mixture model and derive two algorithms with embedded feature selection: Expectation-Maximization (EM) and Classification EM (CEM). Without requiring the number of clusters to be known in advance, both algorithms optimize two approximations of the minimum message length (MML) criterion. To exploit the advantages of EM for clustering and of CEM for fast convergence, we combine the two algorithms. We rigorously validate the approach with Monte Carlo simulations over varying model parameters, and we illustrate our contribution on real datasets commonly used in document clustering.
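To make the setting concrete, the sketch below shows EM for a plain Bernoulli mixture on binary data. It deliberately omits the paper's feature-selection weights, the MML penalty, and the CEM classification step; `bernoulli_mixture_em` and all of its parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def bernoulli_mixture_em(X, K, n_iter=50, seed=0, eps=1e-10):
    """EM for a K-component Bernoulli mixture on binary data X (n x d).

    Returns mixing proportions pi, Bernoulli parameters mu (K x d),
    and a hard cluster assignment per row.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                    # mixing proportions
    mu = rng.uniform(0.25, 0.75, size=(K, d))   # Bernoulli parameters
    for _ in range(n_iter):
        # E-step: posterior responsibilities, computed in log space
        log_p = (X @ np.log(mu + eps).T
                 + (1.0 - X) @ np.log(1.0 - mu + eps).T
                 + np.log(pi + eps))
        log_p -= log_p.max(axis=1, keepdims=True)   # stabilize before exp
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update proportions and per-cluster Bernoulli means
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = (resp.T @ X) / (nk[:, None] + eps)
    return pi, mu, resp.argmax(axis=1)
```

A CEM variant would replace the soft responsibilities with a hard argmax assignment after each E-step, which is what gives CEM its faster convergence at the cost of biased parameter estimates.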
Copyright information
© 2014 Springer International Publishing Switzerland
Cite this paper
Laclau, C., Nadif, M. (2014). Fast Simultaneous Clustering and Feature Selection for Binary Data. In: Blockeel, H., van Leeuwen, M., Vinciotti, V. (eds) Advances in Intelligent Data Analysis XIII. IDA 2014. Lecture Notes in Computer Science, vol 8819. Springer, Cham. https://doi.org/10.1007/978-3-319-12571-8_17
DOI: https://doi.org/10.1007/978-3-319-12571-8_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12570-1
Online ISBN: 978-3-319-12571-8
eBook Packages: Computer Science (R0)