Analyzing Tagging Accuracy of Part-of-Speech Taggers

Nyein Pyae Pyae Khin & Than Nwe Aung

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 388)

Included in the following conference series:

International Conference on Genetic and Evolutionary Computing

Abstract

Automated part-of-speech (POS) tagging has been a very active research area for many years and is the foundation of natural language processing systems. The Natural Language Toolkit (NLTK) library for Python provides the necessary tools for tagging but does not indicate which methods work best. This work therefore analyzes the performance of several part-of-speech taggers, namely the NLTK Default tagger, the Regex tagger, and the N-gram taggers (Unigram, Bigram, and Trigram), on particular corpora. The corpora used for the analysis are Brown, Penn Treebank, and CoNLL2000. We applied all taggers to these three corpora. The results show that, although the Unigram tagger gives the best tagging on all three corpora, a combination of taggers does better if it is correctly ordered.
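
The idea of combining taggers in a chosen order can be illustrated with a minimal, pure-Python sketch. It is not the authors' code and not NLTK's implementation; the class names only mirror the roles of NLTK's `DefaultTagger`, `RegexpTagger`, and `UnigramTagger`, which accept a `backoff` argument for the same purpose. The training and test sentences, the single regex pattern, and the `accuracy` helper are all assumptions made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Toy tagged sentences invented for this sketch (hypothetical data,
# not drawn from Brown, Penn Treebank, or CoNLL2000).
TRAIN = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("runs", "VBZ")],
]
TEST = [
    [("the", "DT"), ("cat", "NN"), ("runs", "VBZ")],
    [("a", "DT"), ("bird", "NN"), ("sings", "VBZ")],
]


class DefaultTagger:
    """Assigns the same tag to every word."""

    def __init__(self, tag):
        self.tag = tag

    def tag_word(self, word):
        return self.tag


class RegexTagger:
    """Tags a word by the first matching pattern, else defers to the backoff."""

    def __init__(self, patterns, backoff=None):
        self.patterns = [(re.compile(p), t) for p, t in patterns]
        self.backoff = backoff

    def tag_word(self, word):
        for regex, tag in self.patterns:
            if regex.fullmatch(word):
                return tag
        return self.backoff.tag_word(word) if self.backoff else None


class UnigramTagger:
    """Tags a word with its most frequent training tag, else defers to the backoff."""

    def __init__(self, train_sents, backoff=None):
        counts = defaultdict(Counter)
        for sent in train_sents:
            for word, tag in sent:
                counts[word][tag] += 1
        self.table = {w: c.most_common(1)[0][0] for w, c in counts.items()}
        self.backoff = backoff

    def tag_word(self, word):
        if word in self.table:
            return self.table[word]
        return self.backoff.tag_word(word) if self.backoff else None


def accuracy(tagger, tagged_sents):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    pairs = [(w, t) for sent in tagged_sents for w, t in sent]
    correct = sum(tagger.tag_word(w) == t for w, t in pairs)
    return correct / len(pairs)


# Build the taggers; the chain order is unigram -> regex -> default.
default = DefaultTagger("NN")
regex = RegexTagger([(r".*s", "VBZ")], backoff=default)
unigram_alone = UnigramTagger(TRAIN)
combined = UnigramTagger(TRAIN, backoff=regex)

for name, tagger in [("default", default),
                     ("unigram alone", unigram_alone),
                     ("unigram + regex + default", combined)]:
    print(f"{name}: {accuracy(tagger, TEST):.2f}")
```

In this toy setting the most specific tagger is consulted first and the most general one last, so words unseen in training still receive a tag. Placing the default tagger at the front of the chain would mean the others are never consulted, which is one way an ordering can go wrong.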

Author information

Authors and Affiliations

University of Computer Studies, Mandalay, Myanmar
Nyein Pyae Pyae Khin & Than Nwe Aung

Corresponding authors

Correspondence to Nyein Pyae Pyae Khin or Than Nwe Aung.

Editor information

Editors and Affiliations

Faculty of Engineering, University of Miyazaki, Miyazaki, Japan
Thi Thi Zin
School of Computer Science and Tech..., Harbin Institute of Technology, Shenzhen Graduate School, Shenzhen, China
Jerry Chun-Wei Lin
College of Information Science and Engg, Fujian University of Technology, Fuzhou, China
Jeng-Shyang Pan
Faculty of Engineering, University of Miyazaki, Miyazaki, Japan
Pyke Tin
Faculty of Engineering, University of Miyazaki, Miyazaki, Japan
Mitsuhiro Yokota

About this paper

Cite this paper

Khin, N.P.P., Aung, T.N. (2016). Analyzing Tagging Accuracy of Part-of-Speech Taggers. In: Zin, T., Lin, JW., Pan, JS., Tin, P., Yokota, M. (eds) Genetic and Evolutionary Computing. GEC 2015. Advances in Intelligent Systems and Computing, vol 388. Springer, Cham. https://doi.org/10.1007/978-3-319-23207-2_35

DOI: https://doi.org/10.1007/978-3-319-23207-2_35
Published: 04 September 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23206-5
Online ISBN: 978-3-319-23207-2
eBook Packages: Engineering, Engineering (R0)
