Part-of-speech tagging for Portuguese texts

  • Conference paper
Advances in Artificial Intelligence (SBIA 1995)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 991))

Abstract

This paper describes work carried out cooperatively by researchers in Portugal and Brazil on statistical methods for natural language processing, focusing on the problem of part-of-speech (POS) tagging. POS tagging is a recent and successful technique for assigning each word in a sentence its correct POS tag. It can achieve more than 96% accuracy, even on previously unseen, untagged texts. We describe every step involved in the process, together with the problems encountered. We also present the stochastic approach to POS tagging, which treats the generation of tag alignments as a probabilistic problem. Finally, we report the results achieved by applying these techniques to Portuguese texts.
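
To make the stochastic approach concrete, the following is a minimal sketch of a bigram hidden Markov model tagger decoded with the Viterbi algorithm, which picks the most probable tag sequence for a sentence. It is not the authors' system: the three-tag tagset, the tiny Portuguese lexicon, the probability tables, and the names `viterbi` and `UNSEEN` are all assumptions chosen for illustration. In practice the probabilities would be estimated from a tagged corpus.

```python
import math

# Toy bigram HMM. Every tag, word, and probability below is made up
# for illustration; a real tagger would estimate them from a corpus.
TAGS = ["DET", "N", "V"]
START = {"DET": 0.6, "N": 0.3, "V": 0.1}             # P(first tag)
TRANS = {                                            # P(tag | previous tag)
    "DET": {"DET": 0.05, "N": 0.90, "V": 0.05},
    "N":   {"DET": 0.10, "N": 0.20, "V": 0.70},
    "V":   {"DET": 0.60, "N": 0.30, "V": 0.10},
}
EMIT = {                                             # P(word | tag)
    "DET": {"o": 0.5, "a": 0.5},
    "N":   {"menino": 0.4, "casa": 0.4, "canto": 0.2},
    "V":   {"canta": 0.5, "casa": 0.2, "canto": 0.3},
}
UNSEEN = 1e-6  # crude floor for word/tag pairs missing from EMIT


def viterbi(words):
    """Return the most probable tag sequence for `words` under the toy HMM."""
    if not words:
        return []

    def emit(tag, word):
        return math.log(EMIT[tag].get(word, UNSEEN))

    # scores[t]: best log-probability of any tag sequence ending in tag t.
    scores = {t: math.log(START[t]) + emit(t, words[0]) for t in TAGS}
    backpointers = []
    for word in words[1:]:
        new_scores, pointers = {}, {}
        for t in TAGS:
            # Best previous tag for reaching tag t at this position.
            prev = max(TAGS, key=lambda p: scores[p] + math.log(TRANS[p][t]))
            new_scores[t] = scores[prev] + math.log(TRANS[prev][t]) + emit(t, word)
            pointers[t] = prev
        scores = new_scores
        backpointers.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(TAGS, key=scores.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    for sentence in (["a", "casa"], ["o", "menino", "casa"]):
        print(sentence, viterbi(sentence))
```

In this toy model the word "casa" is ambiguous between noun ("house") and verb ("marries"). The tagger resolves it from context: after a determiner the noun reading wins, while after a noun the verb reading wins.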

Work partially supported by a PhD scholarship from JNICT-PRAXIS XXI/BD/2909/94.

Editor information

Jacques Wainer, Ariadne Carvalho

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Villavicencio, A., Lopes, J.G.P., Marques, N.M.C., Villavicencio, F. (1995). Part-of-speech tagging for Portuguese texts. In: Wainer, J., Carvalho, A. (eds) Advances in Artificial Intelligence. SBIA 1995. Lecture Notes in Computer Science, vol 991. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0034825

  • DOI: https://doi.org/10.1007/BFb0034825

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60436-5

  • Online ISBN: 978-3-540-47467-8

  • eBook Packages: Springer Book Archive
