
Four Methods for Supervised Word Sense Disambiguation

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2007)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4592)

Abstract

Word sense disambiguation, the task of identifying the intended meaning of an ambiguous word in a given context, is one of the central problems in natural language processing. This paper describes four novel supervised disambiguation methods that adapt familiar algorithms. They build on the Vector Space Model, using an automatically generated stop list and two different statistical methods of finding index terms. These procedures allow fully automated, language-independent disambiguation. The first method is based on Latent Semantic Analysis, an automatic indexing method employed for text retrieval. The second disambiguates via co-occurrence vectors of the target word. The third uses the Naive Bayes classifier, and the fourth relies on SenseClusters, which uses an unsupervised word sense discrimination technique. The methods were implemented and evaluated to assess their performance, to compare the different approaches, and to draw conclusions about the main characteristics of supervised disambiguation. The results show that the classification approach using Naive Bayes is the most efficient, scalable and successful method.
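The abstract does not give implementation details, so the following is only a minimal sketch of how a Naive Bayes sense classifier over bag-of-words contexts can work in general. It is not the paper's implementation. The toy training contexts for the target word "bank", the sense labels, and the function names `train_naive_bayes` and `disambiguate` are all assumptions made for illustration, as is the choice of add-one (Laplace) smoothing.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (context words, sense label) for the target word "bank".
TRAIN = [
    ("deposit money account loan".split(), "finance"),
    ("loan interest money teller".split(), "finance"),
    ("river water shore fishing".split(), "river"),
    ("muddy river grass water".split(), "river"),
]

def train_naive_bayes(examples):
    """Count sense frequencies and per-sense context-word frequencies."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def disambiguate(context, model):
    """Return the sense with the highest log posterior for the context words."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense in sorted(sense_counts):
        # Log prior: relative frequency of the sense in the training data.
        score = math.log(sense_counts[sense] / total)
        # Denominator for add-one smoothing (an assumption of this sketch).
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

model = train_naive_bayes(TRAIN)
print(disambiguate("money loan".split(), model))
print(disambiguate("water river".split(), model))
```

In a setting like the one the abstract describes, the context words would presumably first be filtered through the automatically generated stop list and restricted to the selected index terms; that preprocessing is left out here.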



Editor information

Zoubida Kedad, Nadira Lammari, Elisabeth Métais, Farid Meziane, Yacine Rezgui

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schumacher, K. (2007). Four Methods for Supervised Word Sense Disambiguation. In: Kedad, Z., Lammari, N., Métais, E., Meziane, F., Rezgui, Y. (eds) Natural Language Processing and Information Systems. NLDB 2007. Lecture Notes in Computer Science, vol 4592. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73351-5_28

  • DOI: https://doi.org/10.1007/978-3-540-73351-5_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73350-8

  • Online ISBN: 978-3-540-73351-5

  • eBook Packages: Computer Science, Computer Science (R0)
