DOI: 10.1145/3071178.3071300
research-article

How noisy data affects geometric semantic genetic programming

Published: 01 July 2017

Abstract

Noise is a consequence of acquiring and pre-processing data from the environment, and it arises from different sources, e.g., sensors, signal processing technology, or even human error. As a machine learning technique, Genetic Programming (GP) is not immune to this problem, which the field has frequently addressed. Recently, Geometric Semantic Genetic Programming (GSGP), a semantic-aware branch of GP, has shown robustness and high generalization capability. Researchers believe these characteristics may be associated with a lower sensitivity to noisy data. However, there is no systematic study on this matter. This paper presents an in-depth analysis of GSGP performance in the presence of noise. Using 15 synthetic datasets where noise can be controlled, we added different ratios of noise to the data and compared the results with those of a canonical GP. The results show that, as the percentage of noisy instances increases, generalization performance degrades more markedly in GSGP than in GP. In general, however, GSGP is more robust to noise than GP with up to 10% noise, and shows no statistically significant difference from GP for higher noise levels in the test bed.
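The idea of adding a controlled ratio of noise to a synthetic dataset can be illustrated with a minimal sketch. The abstract does not state the noise model, so everything here is an assumption for illustration only: the function name `add_target_noise`, the choice of zero-mean Gaussian noise on the target values, the `noise_std` parameter, and the toy target y = x^2 are all hypothetical and are not taken from the paper.

```python
import random


def add_target_noise(targets, noise_ratio, noise_std=1.0, seed=0):
    """Return (noisy_targets, noisy_indices).

    A fraction `noise_ratio` of the instances is chosen at random and
    zero-mean Gaussian noise is added to their target values.
    Assumption: this noise model is illustrative, not the paper's.
    """
    if not 0.0 <= noise_ratio <= 1.0:
        raise ValueError("noise_ratio must be in [0, 1]")
    rng = random.Random(seed)
    n_noisy = round(noise_ratio * len(targets))
    # Pick which instances become noisy, without replacement.
    noisy_idx = set(rng.sample(range(len(targets)), n_noisy))
    noisy = [
        y + rng.gauss(0.0, noise_std) if i in noisy_idx else y
        for i, y in enumerate(targets)
    ]
    return noisy, sorted(noisy_idx)


# Hypothetical synthetic target: y = x^2 on 100 evenly spaced points.
xs = [i / 10 for i in range(100)]
ys = [x * x for x in xs]

for ratio in (0.0, 0.05, 0.10, 0.20):
    noisy_ys, idx = add_target_noise(ys, ratio)
    print(f"ratio={ratio:.2f} -> {len(idx)} noisy instances")
```

In a study of this kind, each noisy training set would then be given to both learners (here, GP and GSGP) and their errors compared on a test set; that comparison step is outside the scope of this sketch.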

Supplementary Material

ZIP File (p985-miranda.zip)
Supplemental material.




Published In

GECCO '17: Proceedings of the Genetic and Evolutionary Computation Conference
July 2017
1427 pages
ISBN: 9781450349208
DOI: 10.1145/3071178
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. geometric semantic genetic programming
  2. noise impact
  3. symbolic regression

Acceptance Rates

GECCO '17 Paper Acceptance Rate 178 of 462 submissions, 39%;
Overall Acceptance Rate 1,669 of 4,410 submissions, 38%


Cited By

  • (2023) Towards Evolutionary Control Laws for Viability Problems. Proceedings of the Genetic and Evolutionary Computation Conference, 1464-1472. DOI: 10.1145/3583131.3590415. Online publication date: 15-Jul-2023
  • (2020) Choosing function sets with better generalisation performance for symbolic regression models. Genetic Programming and Evolvable Machines. DOI: 10.1007/s10710-020-09391-4. Online publication date: 12-May-2020
  • (2019) Untapped Potential of Genetic Programming: Transfer Learning and Outlier Removal. Genetic Programming Theory and Practice XVI, 193-207. DOI: 10.1007/978-3-030-04735-1_10. Online publication date: 24-Jan-2019
  • (2018) Analysing symbolic regression benchmarks under a meta-learning approach. Proceedings of the Genetic and Evolutionary Computation Conference Companion, 1342-1349. DOI: 10.1145/3205651.3208293. Online publication date: 6-Jul-2018
  • (2018) Filtering Outliers in One Step with Genetic Programming. Parallel Problem Solving from Nature – PPSN XV, 209-222. DOI: 10.1007/978-3-319-99253-2_17. Online publication date: 22-Aug-2018
