Automatically Enriching a Thesaurus with Information from Dictionaries

  • Conference paper
Progress in Artificial Intelligence (EPIA 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7026)

Abstract

Since information in broad-coverage knowledge bases, such as thesauri, is usually incomplete, merging information from different sources is a good way to increase coverage. We propose a method for enriching a thesaurus with information acquired automatically from dictionaries: pairs of synonyms are assigned to candidate synsets, and the pairs whose elements are not in the thesaurus are clustered to identify new synsets. This method was used to enrich a Brazilian Portuguese thesaurus with synonyms from a European Portuguese dictionary, and resulted in a larger and broader thesaurus with new words and new concepts. The assignments and the obtained synsets were manually evaluated and yielded correctness scores higher than 71% and 85%, respectively.
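The two steps named in the abstract (assigning synonym pairs to candidate synsets, then clustering the pairs that lie outside the thesaurus) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the assignment criterion or the clustering algorithm, so the sketch assumes a simple overlap score for assignment and connected components for clustering. The function name `enrich_thesaurus` and the English example words are hypothetical.

```python
from collections import defaultdict


def enrich_thesaurus(synsets, pairs):
    """Return (enriched_synsets, new_synsets) for a list of synsets and synonym pairs.

    Assumption: a pair is assigned to the original synset that contains the
    most of its two words (ties go to the earliest synset). Pairs with no word
    in the thesaurus are grouped by connected components.
    """
    original = [frozenset(s) for s in synsets]
    enriched = [set(s) for s in synsets]
    vocabulary = set().union(*original)

    graph = defaultdict(set)  # synonymy graph over out-of-thesaurus words
    for a, b in pairs:
        if a in vocabulary or b in vocabulary:
            # Candidate synsets are scored by overlap with the pair.
            best = max(range(len(original)),
                       key=lambda i: len(original[i] & {a, b}))
            enriched[best].update((a, b))
        else:
            graph[a].add(b)
            graph[b].add(a)

    # Stand-in clustering: each connected component becomes a new synset.
    new_synsets, seen = [], set()
    for word in graph:
        if word in seen:
            continue
        component, stack = set(), [word]
        while stack:
            w = stack.pop()
            if w not in component:
                component.add(w)
                stack.extend(graph[w] - component)
        seen |= component
        new_synsets.append(component)
    return enriched, new_synsets


# Illustrative data only (not taken from the paper's resources).
thesaurus = [{"car", "automobile"}, {"bank", "shore"}]
synonym_pairs = [("car", "auto"), ("shore", "coast"),
                 ("glad", "happy"), ("happy", "joyful")]
enriched, new_synsets = enrich_thesaurus(thesaurus, synonym_pairs)
```

Here the first two pairs extend existing synsets, while the last two share no word with the thesaurus and are grouped into one new synset. A real system would likely need a stricter assignment threshold and a clustering method that tolerates ambiguous words, which this sketch ignores.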

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oliveira, H.G., Gomes, P. (2011). Automatically Enriching a Thesaurus with Information from Dictionaries. In: Antunes, L., Pinto, H.S. (eds) Progress in Artificial Intelligence. EPIA 2011. Lecture Notes in Computer Science, vol 7026. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24769-9_34

  • DOI: https://doi.org/10.1007/978-3-642-24769-9_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24768-2

  • Online ISBN: 978-3-642-24769-9

  • eBook Packages: Computer Science, Computer Science (R0)
