Abstract
Feature and prototype selection are two major problems in data mining, especially for machine learning algorithms. The goal of both selections is to reduce storage complexity, and thus computational cost, without sacrificing accuracy. In this article, we present two incremental algorithms that use geometrical neighborhood graphs and a new statistical test to select, step by step, relevant features and prototypes for supervised learning problems. The feature selection procedure we present can be applied before any machine learning algorithm is used.
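The geometrical neighborhood graphs mentioned in the abstract can be illustrated with the Gabriel graph, one standard such structure: two points are neighbors iff no third point lies inside the ball whose diameter is the segment joining them. The sketch below is a minimal brute-force construction for illustration only; the function name `gabriel_edges` and the O(n³) approach are assumptions, not the authors' implementation.

```python
import numpy as np

def gabriel_edges(X):
    """Return Gabriel-graph edges of point set X (n x d array).

    Points i and j are Gabriel neighbours iff no other point k lies
    strictly inside the ball with diameter [x_i, x_j], i.e.
    d(i,k)^2 + d(j,k)^2 >= d(i,j)^2 for every k != i, j.
    """
    n = len(X)
    # pairwise squared Euclidean distances via broadcasting
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if all(d2[i, k] + d2[j, k] >= d2[i, j]
                   for k in range(n) if k not in (i, j)):
                edges.append((i, j))
    return edges

# Three points: the third lies between the first two, so it blocks
# the long edge (0, 1) and only the two short edges survive.
pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 0.1]])
print(gabriel_edges(pts))  # [(0, 2), (1, 2)]
```

In a selection setting of the kind the abstract describes, such a graph restricts attention to geometrically close pairs, so a point whose graph neighbors all share its class label is a candidate for removal from the prototype set.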
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Sebban, M., Zighed, D.A., Di Palma, S. (1999). Selection and Statistical Validation of Features and Prototypes. In: Żytkow, J.M., Rauch, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66490-1
Online ISBN: 978-3-540-48247-5