Exploiting Disambiguated Thesauri for Information Retrieval in Metadata Catalogs

  • Conference paper
Current Topics in Artificial Intelligence (TTIA 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3040)

Abstract

Information in Digital Libraries is explicitly organized, described, and managed. The content of their data resources is summarized in short descriptions, usually called metadata, which can be entered manually or generated automatically. In this context, specialized thesauri are frequently used to provide accurate content for subject or keyword metadata elements. However, if a Digital Library aims to provide access for the general public, it is not reasonable to assume that casual users will use the same terms as the keywords in metadata records. As an initial step toward bridging the semantic gap between user queries and metadata records, the authors of this paper previously developed a method for the semantic disambiguation of thesauri with respect to an upper-level ontology (WordNet). This paper now presents the integration of this disambiguation into an information retrieval system, in this case by adapting the vector-space retrieval model. Thanks to the disambiguation, both metadata records and queries can be represented homogeneously as collections of WordNet synsets, which enables the computation of a similarity value that is used to rank the results.
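The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a hypothetical toy table (`TERM_TO_SYNSETS`) that maps already-disambiguated terms to made-up synset identifiers, uses raw counts as weights, and ranks records by cosine similarity, the usual similarity measure in the vector-space model. The record names and keywords are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical toy mapping from terms to synset identifiers.
# A real system would obtain these from a thesaurus disambiguated
# against WordNet; the identifiers here are invented.
TERM_TO_SYNSETS = {
    "river": ["syn:river"],
    "stream": ["syn:watercourse"],
    "watercourse": ["syn:watercourse"],
    "forest": ["syn:forest"],
    "woodland": ["syn:forest"],
    "road": ["syn:road"],
}


def to_synset_vector(terms):
    """Represent a list of terms as a sparse vector of synset counts."""
    counts = Counter()
    for term in terms:
        for synset in TERM_TO_SYNSETS.get(term.lower(), []):
            counts[synset] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(weight * b[synset] for synset, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_terms, records):
    """Return (record_id, similarity) pairs, most similar first."""
    query_vector = to_synset_vector(query_terms)
    scored = [
        (record_id, cosine(query_vector, to_synset_vector(keywords)))
        for record_id, keywords in records.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented metadata records: record id -> keyword list.
RECORDS = {
    "rec-hydro": ["watercourse", "forest"],
    "rec-roads": ["road"],
    "rec-woods": ["forest"],
}

# The query shares no literal term with any record, only synsets.
ranking = rank(["stream", "woodland"], RECORDS)
for record_id, score in ranking:
    print(record_id, round(score, 3))
```

In this sketch the query terms "stream" and "woodland" never appear in the records, yet the records still match because queries and keywords are compared in synset space rather than term space. A fuller treatment would likely replace raw counts with a term-weighting scheme.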

The basic technology of this work has been partially supported by the Spanish Ministry of Science and Technology through the projects TIC2000-1568-C03-01 from the National Plan for Scientific Research, Development and Technology Innovation and FIT-150500-2003-519 from the National Plan for Information Society. The work of J. Lacasta has been partially supported by a grant from the Aragón Government and the European Social Fund (ref. B139/2003).

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nogueras-Iso, J., Lacasta, J., Bañares, J.Á., Muro-Medrano, P.R., Zarazaga-Soria, F.J. (2004). Exploiting Disambiguated Thesauri for Information Retrieval in Metadata Catalogs. In: Conejo, R., Urretavizcaya, M., Pérez-de-la-Cruz, JL. (eds) Current Topics in Artificial Intelligence. TTIA 2003. Lecture Notes in Computer Science, vol 3040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25945-9_32

  • DOI: https://doi.org/10.1007/978-3-540-25945-9_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22218-7

  • Online ISBN: 978-3-540-25945-9

  • eBook Packages: Springer Book Archive
