Information Extraction Model for Afan Oromo News Text

Sisay Abera & Tesfa Tegegne

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1026)

Included in the following conference series:

International Conference on Information and Communication Technology for Development for Africa

Abstract

Information Extraction (IE) is concerned with automatically extracting facts from text and storing them in a database so that the data is easy to use and manage. As the first research work on IE from Afan Oromo text, we designed a model that deals with the infrastructure news domain in the Oromo language. The proposed model has three main components: document preprocessing, learning and extraction, and post-processing. Recall, precision and F-measure are used as evaluation metrics for Afan Oromo Text Information Extraction (AOTIE). Trained and tested on a dataset of 3169 tokens, AOTIE achieved 79.5% precision, 80.5% recall and 80% F-measure. These results serve as the baseline for two main experimentation scenarios. The first scenario is conducted by developing a gazetteer. The second scenario is aimed at observing the influence of Afan Oromo grammatical structure. Both scenarios showed that the performance of AOTIE depends mostly on the grammatical structure of Afan Oromo.
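The three evaluation metrics named in the abstract can be illustrated with a short sketch. It assumes the balanced F-measure (beta = 1, the harmonic mean of precision and recall); the abstract does not state which variant was used, although the reported figures are consistent with it. The function names and the raw counts in the second example are hypothetical and are not taken from the paper.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta = 1.0 gives the balanced F-measure (an assumption here;
    the abstract does not say which variant was used).
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def precision_recall_f(true_pos, false_pos, false_neg):
    """Compute (precision, recall, F-measure) from raw extraction counts."""
    extracted = true_pos + false_pos   # everything the system extracted
    relevant = true_pos + false_neg    # everything it should have extracted
    precision = true_pos / extracted if extracted else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall, f_measure(precision, recall)


# Precision and recall as reported in the abstract.
reported = f_measure(0.795, 0.805)
print(f"F-measure from reported P and R: {reported * 100:.1f}%")

# Hypothetical counts, for illustration only.
p, r, f = precision_recall_f(true_pos=8, false_pos=2, false_neg=2)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")
```

Plugging in the reported 79.5% precision and 80.5% recall gives an F-measure that rounds to the 80% stated in the abstract.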

Author information

Authors and Affiliations

College of Engineering and Technology, Adigrat University, POBOX 50, Adigrat, Ethiopia
Sisay Abera
Faculty of Computing, Bahir Dar Institute of Technology, Bahir Dar University, POBOX 26, Bahir Dar, Ethiopia
Tesfa Tegegne

Corresponding authors

Correspondence to Sisay Abera or Tesfa Tegegne.

Editor information

Editors and Affiliations

Council for Scientific and Industrial Research, Meraka ICT Institute, Pretoria, South Africa
Fisseha Mekuria
Department of Future Technologies, University of Turku, Turku, Finland
Ethiopia Nigussie
ICT4D Research Center, Bahir Dar University, Bahir Dar, Ethiopia
Tesfa Tegegne

About this paper

Cite this paper

Abera, S., Tegegne, T. (2019). Information Extraction Model for Afan Oromo News Text. In: Mekuria, F., Nigussie, E., Tegegne, T. (eds) Information and Communication Technology for Development for Africa. ICT4DA 2019. Communications in Computer and Information Science, vol 1026. Springer, Cham. https://doi.org/10.1007/978-3-030-26630-1_28

DOI: https://doi.org/10.1007/978-3-030-26630-1_28
Published: 02 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26629-5
Online ISBN: 978-3-030-26630-1
eBook Packages: Computer Science, Computer Science (R0)
