Abstract
Information Extraction (IE) is concerned with automatically extracting facts from text and storing them in a database for easy use and management. In this work, the first on IE from Afan Oromo text, we design a model for the infrastructure news domain in the Oromo language. The proposed model has document preprocessing, learning and extraction, and post-processing as its main components. Recall, precision, and F-measure are used as evaluation metrics for Afan Oromo Text Information Extraction (AOTIE). Trained and tested on a dataset of 3169 tokens, AOTIE achieved 79.5% precision, 80.5% recall, and 80% F-measure. These results serve as a baseline for two main experimental scenarios on AOTIE. The first scenario develops a gazetteer. The second observes the influence of Afan Oromo grammatical structure. Both scenarios showed that the performance of AOTIE depends mostly on the grammatical structure of Afan Oromo.
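The reported figures can be checked against each other with a short calculation. The sketch below assumes that the F-measure in the abstract is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` is only illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: equal weighting of precision and recall (beta = 1).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract, written as fractions.
precision, recall = 0.795, 0.805
f1 = f_measure(precision, recall)
print(f"F-measure: {f1:.1%}")  # prints "F-measure: 80.0%"
```

Under this assumption, the computed value agrees with the 80% F-measure stated in the abstract.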
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Abera, S., Tegegne, T. (2019). Information Extraction Model for Afan Oromo News Text. In: Mekuria, F., Nigussie, E., Tegegne, T. (eds) Information and Communication Technology for Development for Africa. ICT4DA 2019. Communications in Computer and Information Science, vol 1026. Springer, Cham. https://doi.org/10.1007/978-3-030-26630-1_28
DOI: https://doi.org/10.1007/978-3-030-26630-1_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26629-5
Online ISBN: 978-3-030-26630-1
eBook Packages: Computer Science, Computer Science (R0)