Abstract
The paper proposes a new batch ensemble classifier based on gene expression programming (GEP) and built on the concept of stacked generalization. In our approach, base classifiers are combined by evolving a meta-gene from genes that GEP induces from randomly generated combinations of instances with randomly selected subsets of attributes. The main property of the classifier is its scalability, which allows it to adapt to the size of the dataset under consideration. To validate the proposed classifier, we carried out a computational experiment on a number of publicly available benchmark datasets. The experimental results show that the approach offers good performance, scalability, and robustness.
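The stacking scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: one-attribute threshold rules stand in for GEP-induced genes, and a small logistic-regression combiner stands in for the evolved meta-gene. All function names, the synthetic data, and the parameter values are assumptions made for illustration only.

```python
import math
import random


def train_stump(X, y, attrs):
    """Return the one-attribute threshold rule with the fewest errors on (X, y).

    Stand-in for a GEP-induced gene (assumption: labels are -1 or +1).
    """
    best = None
    for a in attrs:
        for t in sorted({row[a] for row in X}):
            for pol in (1, -1):
                errs = sum((pol if row[a] > t else -pol) != lab
                           for row, lab in zip(X, y))
                if best is None or errs < best[0]:
                    best = (errs, a, t, pol)
    _, a, t, pol = best
    return lambda row: pol if row[a] > t else -pol


def train_base_learners(X, y, n_learners, n_rows, n_attrs, rng):
    """Train each base learner on a random sample of rows and attributes."""
    learners = []
    for _ in range(n_learners):
        rows = rng.sample(range(len(X)), n_rows)
        attrs = rng.sample(range(len(X[0])), n_attrs)
        learners.append(
            train_stump([X[i] for i in rows], [y[i] for i in rows], attrs))
    return learners


def train_meta(learners, X, y, iters=200, lr=0.1):
    """Fit a logistic-regression combiner on the base learners' outputs.

    Stand-in for the evolved meta-gene; trained on held-out data, as in
    stacked generalization.
    """
    Z = [[h(row) for h in learners] + [1] for row in X]  # last column: bias
    w = [0.0] * (len(learners) + 1)
    for _ in range(iters):
        grad = [0.0] * len(w)
        for z, lab in zip(Z, y):
            margin = lab * sum(wi * zi for wi, zi in zip(w, z))
            coef = lab / (1.0 + math.exp(margin))
            for j, zj in enumerate(z):
                grad[j] += coef * zj
        w = [wi + lr * g / len(Z) for wi, g in zip(w, grad)]

    def predict(row):
        z = [h(row) for h in learners] + [1]
        return 1 if sum(wi * zi for wi, zi in zip(w, z)) > 0 else -1

    return predict


def make_data(n, rng):
    """Synthetic data: only attributes 0 and 1 carry signal."""
    X = [[rng.random() for _ in range(4)] for _ in range(n)]
    y = [1 if row[0] + row[1] > 1.0 else -1 for row in X]
    return X, y


def demo(seed=0):
    """Train the two-level ensemble and return its test accuracy."""
    rng = random.Random(seed)
    X, y = make_data(600, rng)
    learners = train_base_learners(X[:200], y[:200], 15, 60, 2, rng)
    ensemble = train_meta(learners, X[200:400], y[200:400])
    hits = sum(ensemble(r) == lab for r, lab in zip(X[400:], y[400:]))
    return hits / 200


if __name__ == "__main__":
    print(f"test accuracy: {demo():.3f}")
```

In this sketch the base learners never see the full dataset: each is fitted on `n_rows` sampled instances and `n_attrs` sampled attributes, so the cost per learner is set by those two parameters rather than by the dataset size. The combiner is trained on a separate slice so that it learns from base-learner outputs on data the base learners were not fitted to.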
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Jȩdrzejowicz, J., Jȩdrzejowicz, P. (2017). Gene Expression Programming Ensemble for Classifying Big Datasets. In: Nguyen, N., Papadopoulos, G., Jędrzejowicz, P., Trawiński, B., Vossen, G. (eds) Computational Collective Intelligence. ICCCI 2017. Lecture Notes in Computer Science, vol 10449. Springer, Cham. https://doi.org/10.1007/978-3-319-67077-5_1
DOI: https://doi.org/10.1007/978-3-319-67077-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67076-8
Online ISBN: 978-3-319-67077-5
eBook Packages: Computer Science, Computer Science (R0)