Abstract
During the last decade, research on Genetic-Based Machine Learning has resulted in several proposals of supervised learning methodologies that use evolutionary algorithms to evolve rule-based classification models. Usually, these new GBML approaches are accompanied by little experimentation and there is a lack of comparisons among different proposals. Besides, the competitiveness of GBML systems with respect to non-evolutionary, highly-used machine learning techniques has only been partially studied. This paper reviews the state of the art in GBML, selects some of the best representatives of different families, and compares the accuracy and the interpretability of their models. The paper also analyzes the behavior of the GBML approaches with respect to some of the most influential machine learning techniques that belong to different learning paradigms such as decision trees, support vector machines, instance-based classifiers, and probabilistic classifiers. The experimental observations emphasize the suitability of GBML systems for performing classification tasks. Moreover, the analysis points out the strengths of the different systems, which can be used as recommendation guidelines on which systems should be employed depending on whether the user prefers to maximize the accuracy or the interpretability of the models.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Aggarwal C (ed) (2007) Data streams: models and algorithms. Springer, Heidelberg
Aguilar-Ruiz JS, Giraldez R, Riquelme JC (2007) Natural encoding for evolutionary supervised learning. IEEE Trans Evol Comput 11(4):466–479
Aguilar-Ruiz JS, Riquelme JC, Toro M (2003) Evolutionary learning of hierarchical decision rules. IEEE Trans Syst Man Cybern B 33(2):324–331
Aha DW, Kibler DF, Albert MK (1991) Instance-based learning algorithms. Mach Learn 6(1):37–66
Alcalá-Fdez J, Sánchez L, García S, del Jesus MJ, Ventura S, Garrell JM, Otero J, Romero C, Bacardit J, Rivas VM, Fernández JC, Herrera F (2008) KEEL: a software tool to assess evolutionary algorithms to data mining problems. Soft Comput (forthcoming)
Anikow CZ (1993) A knowledge-intensive genetic algorithm for supervised learning. Mach Learn 13(2–3):189–228
Asuncion A, Newman DJ (2007) UCI Machine learning repository: http://www.ics.uci.edu/∼mlearn/MLRepository.html. University of California
Bacardit J (2004) Pittsburgh genetic-based machine learning in the data mining era: representations, generalization and run-time. PhD thesis, Ramon Llull University, Barcelona, Catalonia, Spain
Bacardit J, Krasnogor N (2008) Empirical evaluation of ensemble techniques for a pittsburgh learning classifier system. In: Learning classifier systems, revised selected papers of the international workshop on learning classifier systems 2006–2007. Springer, Heidelberg
Bernadó-Mansilla E, Garrell JM (2003) Accuracy-based learning classifier systems: models, analysis and applications to classification tasks. Evol Comput 11(3):209–238
Bernadó-Mansilla E, Ho TK (2005) Domain of competence of XCS classifier system in complexity measurement space. IEEE Trans Evol Comput 9(1):1–23
Bernadó-Mansilla E, Llorà X, Garrell JM (2002) XCS and GALE: a comparative study of two learning classifier systems on data mining. In: Advances in learning classifier systems volume 2321 of LNAI. Springer, Heidelberg, pp 115–132
Bonelli P, Parodi A (1991) An efficient classifier system and its experimental comparison with two representative learning methods on three medical domains. In 4th international conference on genetic algorithms, pp 288–295
Bull L, Studley M, Bagnall A, Whittley I (2007) Learning classifier system ensembles with rule-sharing. IEEE Trans Evol Comput 11(4):496–502
Butz MV (2006) Rule-based evolutionary online learning systems: a principled approach to LCS analysis and design volume 109 of studies in Fuzziness and Soft Computing. Springer, Heidelberg
Butz MV, Lanzi PL, Wilson SW (2008) Function approximation with XCS: hyperellipsoidal conditions, recursive least squares, and compaction. IEEE Trans Evol Comput. doi:10.1109/TEVC.2007.903551 (forthcoming)
Cantú-Paz E (2001) Efficient and accurate parallel genetic algorithms. Kluwer, Dordrecht
Castillo L, González A, Pérez R (2001) Including a simplicity criterion in the selection of the best rule in a genetic fuzzy learning algorithm. Fuzzy Sets Syst 120:309–321
Corcoran AL, Sen S (1994) Using real-valued genetic algorithms to evolve rule sets for classification. In: International conference on evolutionary computation, pp 120–124
O. Cordón, Herrera F, Hoffmann F, Magdalena L (2001) Genetic fuzzy systems: evolutionary tuning and learning of fuzzy knowledge bases volume 19 of advances in fuzzy systems—aplications and theory. World Scientific
Dam HH, Lokan C, Abbass HA (2007) Evolutionary online data mining: an investigation in a dynamic environment. In: Evolutionary computation in dynamic and uncertain environments volume 51/2007 of studies in computational intelligence. Springer Berlin/Heidelberg, pp 153–178
de Jong KA, Spears W (1991) Learning concept classification rules using genetic algorithms. In: Proceedings of the international joint conference on artificial intelligence. Sidney, pp 651–656
de Jong KA, Spears WM, Gordon DF (1993) Using genetic algorithms for concept learning. Genetic algorithms for machine learning. In: John J, Grefenstette (eds) A special issue of machine learning, vol 13, 2–3, pp 161–188
Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197
del Jesús MJ, Hoffmann F, Navascués LJ, Sánchez L (2004) Induction of fuzzy-rule-based classifiers with evolutionary boosting algorithms. IEEE Trans Fuzzy Syst 12(3):296–308
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1924
Dixon PW, Corne DW, Oates MJ (2004) A ruleset reduction algorithm for the XCSI learning classifier system. In: Lecture Notes in Computer Science, vol 2661/2003. Springer, Heidelberg, pp 20–29
Drugowitsch J, Barry AM (2008) A formal framework and extensions for function approximation in learning classifier systems. Mach Learn 70(1):45–88
Dunn OJ (1961) Multiple comparisons among means. J Am Stat Assoc 56:52–64
Eiben AE, Smith JE (2003) Introduction to evolutionary computing. Springer, Berlin
Fisher RA (1959) Statistical methods and scientific inference. 2nd edn. Hafner Publishing Co, New York
Frank E, Witten IH (1998) Generating accurate rule sets without global optimization. In: Proceedings of the 15th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 144–151
Freitas A (2002) Data mining and knowledge discovery with evolutionary algorithms. Springer, Heidelberg
Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: International conference on machine learning, pp 148–156
Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting. Ann Stat 32(2):337–374
Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32:675–701
Friedman M (1940) A comparison of alternative tests of significance for the problem of m rankings. Ann Math Stat 11:86–92
Fu C, Davis L (2002) A modified classifier system compaction algorithm. In: GECCO’02: Proceedings of the 2002 genetic and evolutionary computation conference. Morgan Kaufmann Publishers Inc., San Francisco, pp 920–925
Fürnkranz J (1999) Separate-and-conquer rule learning. Artif Intell Rev 13(1):3–54
Giordana A, Neri F (1995) Search-intensive concept induction. Evol Comput 3(4):375–419
Giráldez R, Aguilar-Ruiz JS, Riquelme JC (2002) Discretization oriented to decision rules generation. In: Knowledge-based intelligent information engineering systems and allied technologies (KES’02). IOS Press, pp 275–279
Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. 1st edn, Addison Wesley, Reading
Goldberg DE (2002) The design of innovation: lessons from and for competent genetic algorithms. 1st edn. Kluwer, Dordrecht
González A, Pérez R (1998) Completeness and consistency conditions for learning fuzzy rules. Fuzzy Sets Syst 96:37–51
González A, Pérez R (1999) SLAVE: a genetic learning system based on an iterative approach. IEEE Trans Fuzzy Syst 7(2):176–191
Greene DP, Smith SE (1993) Competition-based induction of decision models from examples. Mach Learn 13:229–257
Herrera F (2008) Genetic fuzzy systems: taxonomy and current research trends and prospects. Evol Intell 1(1):27–46. doi:10.1007/s12065-007-0001-5
Hochberg Y (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75:800–802
Holland JH (1975) Adaptation in natural and artificial systems. The University of Michigan Press, Ann Arbor
Holland JH (1976) Adaptation. In: Rosen R, Snell F (eds) Progress in theoretical biology, vol 4. Academic Press, New York, pp 263–293
Holland JH, Reitman JS (1978) Cognitive systems based on adaptive algorithms. In: Waterman DA, Hayes-Roth F (eds) Pattern-directed inference systems. Academic Press, San Diego, pp 313–329
Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6:65–70
Ishibuchi H, Nojima Y (2007) Analysis of interpretability-accuracy tradeoff of fuzzy systems by multiobjective fuzzy genetics-based machine learning. Int J Approx Reason 44(1):4–31
Ishibuchi H, Yamamoto T (2005) Rule weight specification in fuzzy rule-based classification systems. IEEE Trans Fuzzy Syst 13(4):428–435
Ishibuchi H, Yamamoto T, Murata T (2005) Hybridization of fuzzy GBML approaches for pattern classification problems. IEEE Trans Syst Man Cybern B Cybern 35(2):359–365
John GH, Langley P (1995) Estimating continuous distributions in bayesian classifiers. In: 11th conference on uncertainty in artificial intelligence. Morgan Kaufmann, San Mateo, pp 338–345
Llorà X, Garrell JM (2001) Knowledge-independent data mining with fine-grained parallel evolutionary algorithms. In: GECCO’01: Proceedings of the 2th annual conference on genetic and evolutionary computation. Morgan Kaufmann Publishers, San Mateo, pp 461?‘468
Llorà X, Reddy R, Matesic B, Bhargava R (2007) Towards better than human capability in diagnosing prostate cancer using infrared spectroscopic imaging. In GECCO’07: Proceedings of the 9th annual conference on genetic and evolutionary computation. ACM, New York, pp 2098–2105
Llorà X, Sastry K (2006) Fast rule matching for learning classifier systems via vector instructions. In: GECCO’06: Proceedings of the 8th annual conference on Genetic and evolutionary computation. ACM, New York, pp 1513–1520
Llorà X, Sastry K, Yu T-L, Goldberg DE (2007) Do not match, inherit: fitness surrogates for genetics-based machine learning techniques. In: GECCO’07: Proceedings of the 9th annual conference on Genetic and evolutionary computation. ACM, New York, pp 1798–1805
Llorà X, Wilson SW (2004) Mixed decision trees: minimizing knowledge representation bias in lcs. In: GECCO’04: Proceedings of the genetic and evolutionary computation conference. Springer, LNCS, vol 3103, pp 797–809
Michalewicz Z (1999) Genetic algorithms + data structures = evolution programs. 3rd edn. Springer, Heidelberg
Nojima Y, Ishibuchi H, Kuwajima I (2008) Parallel distributed genetic fuzzy rule selection. Soft Comput (forthcomming)
Nurnberger A, Borgelt C, Klose A (1999) Improving naive Bayes classifiers using neuro-fuzzy learning. In: Proceedings of the 1999 conference on neural information processing, vol 1, Perth, pp 154–159
Núñez M, Fidalgo R, Morales R (2007) Learning in environments with unknown dynamics: towards more robust concept learners. J Mach Learn Res 8:2595–2628
Orriols-Puig A, Bernadó-Mansilla E (2004) Analysis of reduction algorithms for XCS classifier system. In: Recent advances in artificial intelligence research and development number 113 in 1. IOS Press, pp 383–390
Orriols-Puig A, Bernadó-Mansilla E (2006) Bounding XCS parameters for unbalanced datasets. In: GECCO’06: Proceedings of the 2006 genetic and evolutionary computation conference. ACM Press, New York, pp 1561–1568
Orriols-Puig A, Bernadó-Mansilla E (2008) Evolutionary rule-based systems for imbalanced datasets. Soft Comput J. doi:10.1007/s00500-008-0319-7
Orriols-Puig A, Bernadó-Mansilla E (2008) Revisiting UCS: description, fitness sharing and comparison with XCS. In: Advances at the Frontier of LCSs. Springer, Heidelberg
Otero J, Sánchez L (2006) Induction of descriptive fuzzy classifiers with the logitboost algorithm. Soft Comput 10(9):825–835
Platt J (1998) Fast training of support vector machines using sequential minimal optimization. In: Advances in Kernel methods—support vector learning. MIT Press, Cambridge
Quinlan JR (1995) C4.5: programs for machine learning. Morgan Kaufmann Publishers, San Mateo
Rissanen J (1978) Modeling by shortest data description. Automatica 14:465–471
Rivest RL (1987) Learning decision lists. Mach Learn 2(3):229–246
Sánchez L, Couso I (2007) Advocating the use of imprecisely observed data in genetic fuzzy systems. IEEE Trans Fuzzy Syst 15(4):551–562
Sánchez L, Couso I, Casillas J (2007) Modeling vague data with genetic fuzzy systems under a combination of crisp and imprecise criteria. In: Proceedings of the 2007 IEEE symposium on computational intelligence in multicriteria decision making, pp 346–353
Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictions. Mach Learn 37(3):297–336
Sheskin DJ (2000) Handbook of parametric and nonparametric statistical procedures. Chapman & Hall, London
Smith SF (1980) A learning system based on genetic adaptive algorithms. PhD thesis. University of Pittsburgh, USA
Stone C, Bull L (2003) For real! XCS with continuous-valued inputs. Evol Comput 11(3):299–336
Tammee K, Bull L, Ouen P (2007) Ycsc: a modified clustering technique based on lcs. J Digit Inf Manage 5(3):160–167
Theodoridis S, Koutroumbas K (2006) Pattern Recognition, 3rd edn. Elsevier, Amsterdam
Vapnik V (1995) The nature of statistical learning theory. Springer, New York
Venturini G (1993) SIA: a supervised inductive algorithm with genetic search for learning attributes based concepts. In: Brazdil PB (eds) Machine learning: ECML-93 - Proc. of the European conference on machine learning. Springer, Berlin, pp 280–296
Wilcoxon F (1945) Individual comparisons by ranking methods. Biometrics 1:80–83
Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175
Wilson SW (1998) Generalization in the XCS classifier system. In: 3rd annual conf. on genetic programming. Morgan Kaufmann, San Mateo, pp 665–674
Wilson SW (2000) Get real! XCS with continuous-valued inputs. In: Learning classifier systems. From foundations to applications LNAI, Springer, Berlin, pp 209–219
Wilson SW (2002) Classifiers that approximate functions. J Nat Comput 1(2):211–234
Wilson SW (2002) Compact rulesets from XCSI. In: Advances in learning classifier systems, 4th international workshop, Lecture Notes in Artificial Intelligence, vol 2321. Springer, Heidelberg, pp 197–210
Wilson SW (2008) Classifier conditions using gene expression programming. Technical report, IlliGAL Report No. 2008001, Urbana-Champaign IL 61801, USA
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou ZH, Steinbach M, Hand DJ, Steinberg D (2007) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37
Acknowledgments
The authors would like to warmly thank Raúl Giráldez (Pablo de Olavide University), Yusuke Nojima (Osaka Prefecture University), and Raúl Pérez (University of Granada) for providing us with the experimental results of HIDER, HMOF, and SLAVE, respectively. The authors also wish to acknowledge the Ministerio de Educación y Ciencia for its support under projects TIN2005-08386-C05-01 and TIN2005-08386-C05-04, and Generalitat de Catalunya for its support under grants 2005FI-00252 and 2005SGR-00302.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Orriols-Puig, A., Casillas, J. & Bernadó-Mansilla, E. Genetic-based machine learning systems are competitive for pattern recognition. Evol. Intel. 1, 209–232 (2008). https://doi.org/10.1007/s12065-008-0013-9
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12065-008-0013-9