Word Sense Language Model for Information Retrieval

Liqi Gao, Yu Zhang, Ting Liu & Guiping Liu

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4182)

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

This paper proposes an information retrieval method based on a word sense language model. Unlike most traditional methods, it combines word senses defined in a thesaurus with a classic statistical model. The word sense language model treats word senses as a form of linguistic knowledge, which helps handle the mismatch caused by synonyms and the data sparseness caused by limited data. Experimental results on the TREC-Mandarin corpus show that this method improves MAP by 12.5% over the traditional tf-idf retrieval method but decreases MAP by 5.82% compared with a classic language model. Combining this method with the language model yields increases of 8.92% and 7.93% over either, respectively. We analyze and discuss these not-so-exciting results and conclude that higher performance of the word sense language model will depend on highly accurate word sense labeling. We believe that linguistic knowledge such as the word senses of a thesaurus will ultimately help improve IR in many ways.
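
The abstract does not give the model's formulas, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a unigram query-likelihood model with Jelinek-Mercer smoothing, a toy hypothetical thesaurus (`THESAURUS`) standing in for a real sense tagger, and a simple weighted sum of log scores (`alpha`) as the way the word-level and sense-level models are combined. All names, sense codes, and parameter values are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus: word -> sense code (illustrative entries only).
THESAURUS = {
    "car": "S_VEHICLE", "automobile": "S_VEHICLE",
    "fast": "S_QUICK", "quick": "S_QUICK",
}

def to_senses(tokens):
    """Replace each token by its sense code; unknown words stay as they are."""
    return [THESAURUS.get(t, t) for t in tokens]

def query_log_likelihood(query, doc, collection, lam=0.5):
    """log P(query | doc) under a unigram model, smoothed with the collection."""
    doc_tf, coll_tf = Counter(doc), Counter(collection)
    score = 0.0
    for unit in query:
        p_doc = doc_tf[unit] / len(doc) if doc else 0.0
        p_coll = coll_tf[unit] / len(collection)
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:  # unit unseen in the whole collection
            return float("-inf")
        score += math.log(p)
    return score

def combined_score(query, doc, collection, alpha=0.5):
    """Assumed combination: weighted sum of sense-level and word-level scores."""
    word = query_log_likelihood(query, doc, collection)
    sense = query_log_likelihood(
        to_senses(query), to_senses(doc), to_senses(collection)
    )
    return alpha * sense + (1 - alpha) * word

docs = [
    ["automobile", "quick", "trip"],  # synonyms of the query words
    ["bicycle", "slow", "trip"],      # unrelated to the query
    ["car", "fast", "race"],          # exact query words
]
collection = [t for d in docs for t in d]
query = ["car", "fast"]

word_scores = [query_log_likelihood(query, d, collection) for d in docs]
sense_scores = [
    query_log_likelihood(to_senses(query), to_senses(d), to_senses(collection))
    for d in docs
]
combined = [combined_score(query, d, collection) for d in docs]
```

In this toy setup the word-level model cannot tell the first two documents apart, because neither contains a query word, while the sense-level model ranks the synonym document above the unrelated one. That is the synonym-mismatch effect the abstract refers to. The sense-level model also scores the synonym document the same as the exact-match document, which hints at why combining it with the word-level model can help.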

Author information

Authors and Affiliations

Information Retrieval Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, Box 321, Harbin, 150001, P.R. China
Liqi Gao, Yu Zhang, Ting Liu & Guiping Liu

Editor information

Editors and Affiliations

Department of Computer Science, National University of Singapore, 3 Science Drive 2, 117543, Singapore
Hwee Tou Ng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Mun-Kew Leong
Department of Computer Science, School of Computing, National University of Singapore, 117543, Singapore
Min-Yen Kan
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, P.O. Box, 119613, Singapore
Donghong Ji

About this paper

Cite this paper

Gao, L., Zhang, Y., Liu, T., Liu, G. (2006). Word Sense Language Model for Information Retrieval. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_13

DOI: https://doi.org/10.1007/11880592_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45780-0
Online ISBN: 978-3-540-46237-8
eBook Packages: Computer Science, Computer Science (R0)
