DOI: 10.1145/1390334.1390380

Learning to rank with SoftRank and Gaussian processes

Published: 20 July 2008

Abstract

In this paper we address the issue of learning to rank for document retrieval using Thurstonian models based on sparse Gaussian processes. Thurstonian models represent each document for a given query as a probability distribution in a score space; these distributions over scores naturally give rise to distributions over document rankings. However, in general we do not have observed rankings with which to train the model; instead, each document in the training set is judged to have a particular relevance level: for example "Bad", "Fair", "Good", or "Excellent". The performance of the model is then evaluated using information retrieval (IR) metrics such as Normalised Discounted Cumulative Gain (NDCG). Recently, Taylor et al. [17] presented a method called SoftRank which allows the direct gradient optimisation of a smoothed version of NDCG using a Thurstonian model. In this approach, document scores are represented by the outputs of a neural network, and score distributions are created artificially by adding random noise to the scores. The SoftRank mechanism is a general one: it can be applied to different IR metrics and can make use of different underlying models. In this paper we extend the SoftRank framework to make use of the score uncertainties which are naturally provided by a Gaussian process (GP), a probabilistic non-linear regression model. We further develop the model by using sparse Gaussian process techniques, which give improved performance and efficiency, and show competitive results against baseline methods when tested on the publicly available LETOR OHSUMED data set. We also explore how the available uncertainty information can be used in prediction and how it affects model performance.
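The smoothing mechanism described above can be sketched in a few lines: each document's score is a Gaussian, pairwise "doc i outscores doc j" probabilities follow from the score distributions, and adding the competing documents one at a time yields a distribution over each document's rank, from which an expected (smoothed) NDCG is computed. This is a minimal illustration of the idea under standard assumptions (independent Gaussian scores, gain 2^l - 1, discount 1/log2(2 + r)), not the paper's implementation; the function names are illustrative.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def rank_distribution(j, mu, var):
    """Distribution over the rank of document j (0 = top).

    Competing documents are folded in one at a time: each document i
    that outscores j pushes j's rank down by one, with probability
    pi_ij = Pr(score_i > score_j) under independent Gaussian scores.
    """
    p = [1.0]  # j alone: rank 0 with certainty
    for i in range(len(mu)):
        if i == j:
            continue
        pi_ij = phi((mu[i] - mu[j]) / math.sqrt(var[i] + var[j]))
        new_p = [0.0] * (len(p) + 1)
        for r, pr in enumerate(p):
            new_p[r] += (1.0 - pi_ij) * pr   # i loses: rank unchanged
            new_p[r + 1] += pi_ij * pr       # i wins: rank pushed down
        p = new_p
    return p

def soft_ndcg(mu, var, labels):
    """Expected NDCG under the rank distributions (a smooth function
    of the score means, unlike hard NDCG)."""
    gain = [2 ** l - 1 for l in labels]
    disc = [1.0 / math.log2(2 + r) for r in range(len(labels))]
    ideal = sum(g * d for g, d in zip(sorted(gain, reverse=True), disc))
    dcg = 0.0
    for j in range(len(labels)):
        p = rank_distribution(j, mu, var)
        dcg += gain[j] * sum(p[r] * disc[r] for r in range(len(p)))
    return dcg / ideal if ideal > 0 else 0.0
```

As the score variances shrink, the rank distributions concentrate on the deterministic ranking by mean score and the smoothed value approaches hard NDCG; larger variances blur the ranking, which is exactly what makes the metric differentiable in the means. The GP extension in this paper supplies those variances from the model itself rather than from artificially injected noise.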

References

[1] C. Burges, R. Ragno, and Q. V. Le. Learning to rank with nonsmooth cost functions. In NIPS, 2006.
[2] C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, and G. Hullender. Learning to rank using gradient descent. In ICML, 2005.
[3] W. Chu and Z. Ghahramani. Gaussian processes for ordinal regression. JMLR, 6:1019--1041, 2005.
[4] K. Crammer and Y. Singer. Pranking with ranking. In NIPS 14, 2002.
[5] R. Herbrich, T. Graepel, and K. Obermayer. Large margin rank boundaries for ordinal regression. In Advances in Large Margin Classifiers, pages 115--132. MIT Press, 2000.
[6] K. Järvelin and J. Kekäläinen. IR evaluation methods for retrieving highly relevant documents. In SIGIR, 2000.
[7] T. Joachims. Optimizing search engines using clickthrough data. In KDD, 2002.
[8] N. D. Lawrence. Learning for larger datasets with the Gaussian process latent variable model. In M. Meila and X. Shen, editors, AISTATS 11. Omnipress, 2007.
[9] T.-Y. Liu. LETOR: Benchmark datasets for learning to rank, 2007. Microsoft Research Asia. http://research.microsoft.com/users/LETOR/.
[10] R. M. Neal. Bayesian Learning for Neural Networks. Lecture Notes in Statistics 118. Springer, 1996.
[11] J. Nocedal and S. Wright. Numerical Optimization, second edition. Springer, 2006.
[12] J. Quiñonero Candela and C. E. Rasmussen. A unifying view of sparse approximate Gaussian process regression. JMLR, 6:1939--1959, Dec 2005.
[13] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
[14] S. Robertson and H. Zaragoza. On rank-based effectiveness measures and optimization. Information Retrieval, 10(3):321--339, 2007.
[15] S. Robertson, H. Zaragoza, and M. Taylor. Simple BM25 extension to multiple weighted fields. In CIKM, pages 42--49, 2004.
[16] E. Snelson and Z. Ghahramani. Sparse Gaussian processes using pseudo-inputs. In Y. Weiss, B. Schölkopf, and J. Platt, editors, NIPS 18, pages 1257--1264. MIT Press, Cambridge, MA, 2006.
[17] M. Taylor, J. Guiver, S. Robertson, and T. Minka. SoftRank: optimizing non-smooth rank metrics. In WSDM '08, pages 77--86. ACM, 2008.
[18] L. L. Thurstone. A law of comparative judgment. Psychological Review, 34:273--286, 1927.



Published In

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
July 2008, 934 pages
ISBN: 9781605581644
DOI: 10.1145/1390334

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Gaussian process
  2. information retrieval
  3. learning
  4. ranking

Qualifiers

  • Research-article

Conference

SIGIR '08

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

  • Downloads (last 12 months): 27
  • Downloads (last 6 weeks): 1

Reflects downloads up to 09 Jan 2025

Cited By

  • (2024) Stability and multigroup fairness in ranking with uncertain predictions. Proceedings of the 41st International Conference on Machine Learning, 10.5555/3692070.3692494, pages 10661-10686. Online publication date: 21-Jul-2024.
  • (2024) An In-Depth Comparison of Neural and Probabilistic Tree Models for Learning-to-rank. Advances in Information Retrieval, 10.1007/978-3-031-56063-7_39, pages 468-476. Online publication date: 23-Mar-2024.
  • (2024) A Framework for Defining Algorithmic Fairness in the Context of Information Access. Proceedings of the Association for Information Science and Technology, 10.1002/pra2.1077, 61:1, pages 667-672. Online publication date: 15-Oct-2024.
  • (2023) Remote Sensing Object Counting Through Regression Ensembles and Learning to Rank. IEEE Transactions on Geoscience and Remote Sensing, 10.1109/TGRS.2023.3266884, 61, pages 1-17. Online publication date: 2023.
  • (2023) An in-depth study on adversarial learning-to-rank. Information Retrieval Journal, 10.1007/s10791-023-09419-0, 26:1. Online publication date: 28-Feb-2023.
  • (2023) Recommendation Uncertainty in Implicit Feedback Recommender Systems. Artificial Intelligence and Cognitive Science, 10.1007/978-3-031-26438-2_22, pages 279-291. Online publication date: 23-Feb-2023.
  • (2021) Diagnostic Evaluation of Policy-Gradient-Based Ranking. Electronics, 10.3390/electronics11010037, 11:1 (37). Online publication date: 23-Dec-2021.
  • (2021) BanditRank: Learning to Rank Using Contextual Bandits. Advances in Knowledge Discovery and Data Mining, 10.1007/978-3-030-75768-7_21, pages 259-271. Online publication date: 8-May-2021.
  • (2020) Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval. Computer Vision – ECCV 2020, 10.1007/978-3-030-58545-7_39, pages 677-694. Online publication date: 5-Nov-2020.
  • (2019) Why train-and-select when you can use them all? Proceedings of the Genetic and Evolutionary Computation Conference, 10.1145/3321707.3321873, pages 1408-1416. Online publication date: 13-Jul-2019.
