Abstract
This work presents GD, a measure for attribute selection. GD is defined between a set of attributes and a class, and it generalizes the Mántaras distance so that interdependencies between attributes can be detected. The measure also allows the attributes to be ranked by their importance in defining the concept, and it shows no noticeable bias in favor of attributes with many values. The quality of the attributes selected with GD is evaluated through comparisons with two other attribute selection methods on 19 datasets.
This work was supported in part by the Spanish Ministry of Education under project TAP95-0288.
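As an illustration of the quantity that GD generalizes, the following minimal sketch computes the normalized Mántaras distance between a single discrete attribute A and the class C, d(A, C) = (H(A|C) + H(C|A)) / H(A, C). It is not the GD measure itself, whose definition over attribute sets is not given in this abstract. The function names `entropy` and `mantaras_distance` and the toy data are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mantaras_distance(attribute, labels):
    """Normalized Mántaras distance (H(A|C) + H(C|A)) / H(A, C).

    Both arguments are equal-length sequences of discrete (hashable) values.
    The result lies in [0, 1]; smaller means the attribute's partition is
    closer to the class partition.
    """
    joint = entropy(list(zip(attribute, labels)))
    if joint == 0.0:
        # Attribute and class are both constant: the partitions coincide.
        return 0.0
    # Identity used: H(A|C) + H(C|A) = 2 H(A,C) - H(A) - H(C)
    return (2 * joint - entropy(attribute) - entropy(labels)) / joint


# Hypothetical toy data: one attribute that determines the class, one that
# is independent of it.
labels = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]
irrelevant = ["x", "y", "x", "y"]

d_informative = mantaras_distance(informative, labels)
d_irrelevant = mantaras_distance(irrelevant, labels)
```

Ranking single attributes by increasing distance places `informative` before `irrelevant` on this toy data. Scoring attributes one at a time in this way cannot capture interdependencies between attributes, which is the limitation the abstract says GD addresses.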
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lorenzo, J., Hernández, M., Méndez, J. (1998). GD: A Measure Based on Information Theory for Attribute Selection. In: Coelho, H. (ed.) Progress in Artificial Intelligence — IBERAMIA 98. IBERAMIA 1998. Lecture Notes in Computer Science, vol 1484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49795-1_11
DOI: https://doi.org/10.1007/3-540-49795-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64992-2
Online ISBN: 978-3-540-49795-0
eBook Packages: Springer Book Archive