Article
DOI: 10.1007/978-3-642-36973-5_36

Two-Stage learning to rank for information retrieval

Published: 24 March 2013

Abstract

Current learning to rank approaches commonly focus on learning the best possible ranking function given a small fixed set of documents. This document set is often retrieved from the collection using a simple unsupervised bag-of-words method, e.g., BM25. This can lead to learning a sub-optimal ranking, since many relevant documents may be excluded from the initially retrieved set. In this paper, we propose a novel two-stage learning framework to address this problem. We first learn a ranking function over the entire retrieval collection using a limited set of textual features, including weighted phrases, proximities, and expansion terms. This function is then used to retrieve the best possible subset of documents, over which the final model is trained using a larger set of query- and document-dependent features. Empirical evaluation using two web collections unequivocally demonstrates that our proposed two-stage framework, being able to learn its model from more relevant documents, outperforms current learning to rank approaches.
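
The retrieve-then-rerank flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes both stages are simple linear scorers whose weights have already been learned, and the names `Doc`, `retrieve_top_k`, `rerank`, the feature vectors, and the weights are all hypothetical toy values. The learning step itself is omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Doc:
    doc_id: str
    stage1_features: List[float]  # small set of textual features (hypothetical values)
    stage2_features: List[float]  # larger query- and document-dependent set (hypothetical)


def linear_score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Dot product of a weight vector and a feature vector."""
    return sum(w * f for w, f in zip(weights, features))


def retrieve_top_k(collection: List[Doc], w1: Sequence[float], k: int) -> List[Doc]:
    """Stage 1: score every document in the collection and keep the top k."""
    ranked = sorted(
        collection,
        key=lambda d: linear_score(w1, d.stage1_features),
        reverse=True,
    )
    return ranked[:k]


def rerank(candidates: List[Doc], w2: Sequence[float]) -> List[Doc]:
    """Stage 2: reorder only the stage-1 candidates using the richer features."""
    return sorted(
        candidates,
        key=lambda d: linear_score(w2, d.stage2_features),
        reverse=True,
    )


# Toy collection; every number here is made up for illustration.
collection = [
    Doc("d1", [1.0, 0.0], [0.2, 0.1, 0.0]),
    Doc("d2", [0.8, 0.9], [0.9, 0.8, 0.7]),
    Doc("d3", [0.1, 0.1], [1.0, 1.0, 1.0]),
    Doc("d4", [0.5, 0.7], [0.4, 0.9, 0.1]),
]
w1 = [1.0, 0.5]       # assumed stage-1 weights
w2 = [0.5, 0.3, 0.2]  # assumed stage-2 weights

candidates = retrieve_top_k(collection, w1, k=3)
final_ranking = rerank(candidates, w2)
print([d.doc_id for d in candidates])     # ['d2', 'd1', 'd4']
print([d.doc_id for d in final_ranking])  # ['d2', 'd4', 'd1']
```

In this toy data, `d3` has the strongest stage-2 features but never reaches the second stage because stage 1 ranks it last. That is the failure mode the abstract attributes to a weak initial retrieval step, and it is why the paper proposes learning the first stage as well.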


Published In

ECIR'13: Proceedings of the 35th European conference on Advances in Information Retrieval
March 2013
890 pages
ISBN: 9783642369728
Editors: Pavel Serdyukov, Pavel Braslavski, Sergei O. Kuznetsov, Jaap Kamps, Stefan Rüger

Sponsors

  • Mail.Ru
  • Google Inc.
  • ABBYY
  • Russian Foundation for Basic Research (RFBR)
  • Yahoo! Labs

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

  • Article


Cited By

  • (2024) ReNeuIR at SIGIR 2024: The Third Workshop on Reaching Efficiency in Neural Information Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3051-3054. DOI: 10.1145/3626772.3657994. Online publication date: 10-Jul-2024
  • (2023) Report on the 1st Workshop on Reaching Efficiency in Neural Information Retrieval (ReNeuIR 2022) at SIGIR 2022. ACM SIGIR Forum 56(2), pp. 1-14. DOI: 10.1145/3582900.3582916. Online publication date: 31-Jan-2023
  • (2023) ReNeuIR at SIGIR 2023: The Second Workshop on Reaching Efficiency in Neural Information Retrieval. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 3456-3459. DOI: 10.1145/3539618.3591922. Online publication date: 19-Jul-2023
  • (2022) Stochastic Retrieval-Conditioned Reranking. Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval, pp. 81-91. DOI: 10.1145/3539813.3545141. Online publication date: 23-Aug-2022
  • (2022) Beyond Precision: A Study on Recall of Initial Retrieval with Neural Representations. Information Retrieval, pp. 76-89. DOI: 10.1007/978-3-031-24755-2_7. Online publication date: 16-Sep-2022
  • (2022) How Can Graph Neural Networks Help Document Retrieval: A Case Study on CORD19 with Concept Map Generation. Advances in Information Retrieval, pp. 75-83. DOI: 10.1007/978-3-030-99739-7_9. Online publication date: 10-Apr-2022
  • (2020) Parameter Tuning in Personal Search Systems. Proceedings of the 13th International Conference on Web Search and Data Mining, pp. 97-105. DOI: 10.1145/3336191.3371820. Online publication date: 20-Jan-2020
  • (2019) An Analysis of Approaches Taken in the ACM RecSys Challenge 2018 for Automatic Music Playlist Continuation. ACM Transactions on Intelligent Systems and Technology 10(5), pp. 1-21. DOI: 10.1145/3344257. Online publication date: 18-Sep-2019
  • (2018) Selective Gradient Boosting for Effective Learning to Rank. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 155-164. DOI: 10.1145/3209978.3210048. Online publication date: 27-Jun-2018
  • (2018) X-CLEaVER. ACM Transactions on Intelligent Systems and Technology 9(6), pp. 1-26. DOI: 10.1145/3205453. Online publication date: 29-Oct-2018
