Abstract
In recent years, Naive Bayes has experienced a renaissance in machine learning, particularly in the area of information retrieval. This classifier rests on the not always realistic assumption that class-conditional distributions can be factorized as the product of their marginal densities. On the other hand, one of the most common ways of estimating the Independent Component Analysis (ICA) representation of a random vector is to minimize the Kullback-Leibler divergence between the joint density and the product of the marginal densities (the mutual information). It follows that ICA provides a representation in which the independence assumption can be held on stronger grounds. In this paper we propose class-conditional ICA as a method that provides an adequate representation in which Naive Bayes is the classifier of choice. Experiments on two public databases are performed in order to confirm this hypothesis.
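To make the idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes Python with scikit-learn's FastICA, square ICA bases (as many components as features), and Gaussian marginals for the sources; any one-dimensional density estimator could be substituted. One ICA basis is estimated per class, Naive Bayes is applied in each class's source space, and the change-of-variables Jacobian makes the class-conditional likelihoods comparable in the original space.

```python
# Illustrative sketch of class-conditional ICA + Naive Bayes (assumptions:
# square unmixing matrices and Gaussian source marginals; NOT the paper's
# exact estimator).
import numpy as np
from sklearn.decomposition import FastICA


class ClassConditionalICANB:
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = {}
        for c in self.classes_:
            Xc = X[y == c]
            ica = FastICA(random_state=0)   # square unmixing when n_samples >= n_features
            S = ica.fit_transform(Xc)       # class-conditional sources
            # Factorized ("naive") likelihood: one Gaussian per source component.
            mu, sigma = S.mean(axis=0), S.std(axis=0) + 1e-9
            # Change of variables: p_x(x | c) = |det W_c| * prod_i p_i(s_i),
            # with s = W_c (x - mean_c); valid only for a square W_c.
            logdet = np.linalg.slogdet(ica.components_)[1]
            log_prior = np.log(len(Xc) / len(X))
            self.models_[c] = (ica, mu, sigma, logdet, log_prior)
        return self

    def predict(self, X):
        scores = []
        for c in self.classes_:
            ica, mu, sigma, logdet, log_prior = self.models_[c]
            S = ica.transform(X)
            # Sum of marginal log-densities = log of the factorized likelihood.
            ll = -0.5 * (((S - mu) / sigma) ** 2
                         + np.log(2 * np.pi * sigma ** 2)).sum(axis=1)
            scores.append(log_prior + logdet + ll)
        return self.classes_[np.argmax(np.vstack(scores), axis=0)]
```

In practice one would typically reduce dimensionality before the per-class ICA step (e.g. on image data such as MNIST); the Jacobian terms then keep the class scores comparable only if all class subspaces have the same dimension.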
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Bressan, M., Vitrià, J. (2002). Improving Naive Bayes Using Class-Conditional ICA. In: Garijo, F.J., Riquelme, J.C., Toro, M. (eds) Advances in Artificial Intelligence - IBERAMIA 2002. Lecture Notes in Computer Science, vol 2527. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36131-6_1
DOI: https://doi.org/10.1007/3-540-36131-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00131-7
Online ISBN: 978-3-540-36131-2