Abstract
Knowledge Discovery in Databases (KDD) is an active and important research area that promises a high payoff in many business and scientific applications. One of the main tasks in KDD is classification, for which decision tree induction is a particularly efficient method. The selection of the attribute used at each node of the tree to split the data (the split criterion) is crucial for correctly classifying objects. Several split criteria have been proposed in the literature (Information Gain, Gini Index, etc.), and it is not obvious which of them will produce the best decision tree for a given data set. A large number of empirical tests have been conducted to answer this question, but no conclusive results were found. In this paper we introduce a formal methodology that allows us to compare multiple split criteria. This permits us to present fundamental insights into the decision process. Furthermore, we are able to give a formal description of how to select between split criteria for a given data set. As an illustration, we apply the methodology to two widely used split criteria: Gini Index and Information Gain.
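To make the two criteria concrete, the following minimal Python sketch (our illustration, not code from the paper) evaluates a candidate split under both measures using their standard definitions: the Gini Index 1 − Σ p_i² and the Shannon entropy −Σ p_i log₂ p_i, where Information Gain is the entropy decrease from parent to children. The toy labels and the two-way split are assumptions chosen only for demonstration.

import math
from collections import Counter

def gini(labels):
    # Gini index: 1 - sum of squared class proportions.
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    # Shannon entropy in bits.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def impurity_decrease(parent, children, impurity):
    # Score of a split: parent impurity minus the size-weighted
    # average impurity of the child nodes.
    n = len(parent)
    weighted = sum(len(ch) / n * impurity(ch) for ch in children)
    return impurity(parent) - weighted

# Hypothetical split of 10 labelled objects into two child nodes.
parent = list("AAAAABBBBB")
left, right = list("AAAAB"), list("ABBBB")

print(impurity_decrease(parent, [left, right], gini))     # Gini decrease: 0.18
print(impurity_decrease(parent, [left, right], entropy))  # Information Gain: ~0.278

As the paper's question suggests, the two scores need not rank competing splits identically, which is why a formal comparison of the criteria is of interest.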
Cite this article
Raileanu, L.E., Stoffel, K. Theoretical Comparison between the Gini Index and Information Gain Criteria. Annals of Mathematics and Artificial Intelligence 41, 77–93 (2004). https://doi.org/10.1023/B:AMAI.0000018580.96245.c6