research-article

Genetic programming for natural language processing

Published: 01 June 2020

Abstract

This work surveys the literature on applications of genetic programming to problems in natural language processing. The purpose of natural language processing is to allow us to communicate with computers in natural language. Among the problems addressed in the area is, for example, information extraction, which draws relevant data from unstructured texts written in natural language. Some application domains are of particular relevance, either because the corresponding documents are difficult to handle, as in opinion mining in social networks, or because the extracted information must be highly precise, as in the biomedical domain. Genetic programming techniques have been proposed in several of these areas. This tour allows us to observe the potential, not yet fully exploited, of such applications. We also review some cases in which genetic programming can provide information that is absent from other approaches, revealing its ability to provide easy-to-interpret results in the form of programs or functions. Finally, we identify some important challenges in the area.

Cited By

  • (2021) Efficiency improvement of genetic network programming by tasks decomposition in different types of environments. Genetic Programming and Evolvable Machines 22:2 (229-266). doi:10.1007/s10710-021-09402-y. Online publication date: 1-Jun-2021
  • (2021) Discovering novel memory cell designs for sentiment analysis on tweets. Genetic Programming and Evolvable Machines 22:2 (147-187). doi:10.1007/s10710-020-09395-0. Online publication date: 1-Jun-2021

Published In

Genetic Programming and Evolvable Machines  Volume 21, Issue 1-2
Jun 2020
275 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 June 2020
Revision received: 03 April 2019
Received: 15 October 2018

Author Tags

  1. Genetic programming
  2. Grammatical evolution
  3. Natural language processing
  4. Applications
  5. Challenges

Qualifiers

  • Research-article
