DOI: 10.1145/1390334.1390379
Research article

Learning to rank with partially-labeled data

Published: 20 July 2008

Abstract

Ranking algorithms, whose goal is to appropriately order a set of objects/documents, are an important component of information retrieval systems. Previous work on ranking algorithms has focused on cases where only labeled data is available for training (i.e., supervised learning). In this paper, we consider the question of whether unlabeled (test) data can be exploited to improve ranking performance. We present a framework for transductive learning of ranking functions and show that the answer is affirmative. Our framework generates better features from the test data (via Kernel PCA) and incorporates them via boosting, thus learning a different ranking function adapted to each individual test query. We evaluate this method on the LETOR dataset (TREC, OHSUMED) and demonstrate significant improvements.
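The recipe described above (extract Kernel PCA features from the unlabeled documents of each test query, append them to the original features, and train a boosted ranker on the augmented labeled data) can be sketched roughly as follows. This is an illustrative approximation, not the authors' exact algorithm: scikit-learn's KernelPCA stands in for the Kernel PCA step, a pointwise GradientBoostingRegressor stands in for RankBoost-style boosting, and `transductive_rank` and all parameter values are hypothetical.

```python
# Illustrative sketch of the transductive recipe in the abstract.
# Assumptions (not from the paper): scikit-learn's KernelPCA stands in
# for the Kernel PCA step, and a pointwise GradientBoostingRegressor
# stands in for the boosting-based ranker.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import GradientBoostingRegressor

def transductive_rank(train_X, train_y, test_X, n_components=2):
    """Rank the documents of one test query using its own unlabeled features."""
    # 1. Learn a nonlinear feature map from the *unlabeled* test documents.
    kpca = KernelPCA(n_components=n_components, kernel="rbf").fit(test_X)
    # 2. Project both training and test documents into that space and
    #    append the new components to the original feature vectors.
    aug_train = np.hstack([train_X, kpca.transform(train_X)])
    aug_test = np.hstack([test_X, kpca.transform(test_X)])
    # 3. Fit a boosted model on the augmented labeled data and score the
    #    test documents; because step 1 is redone per query, each test
    #    query effectively gets its own adapted ranking function.
    model = GradientBoostingRegressor(n_estimators=50, random_state=0)
    model.fit(aug_train, train_y)
    scores = model.predict(aug_test)
    return np.argsort(-scores)  # document indices, best first
```

Repeating the whole procedure per test query is what makes the method transductive: the feature map depends on the very documents being ranked.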





Published In

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
July 2008, 934 pages
ISBN: 9781605581644
DOI: 10.1145/1390334
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. boosting
      2. information retrieval
      3. kernel principal components analysis
      4. learning to rank
      5. transductive learning

      Qualifiers

      • Research-article

      Conference

      SIGIR '08

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%


Cited By

• (2024) Mitigating the Impact of Inaccurate Feedback in Dynamic Learning-to-Rank: A Study of Overlooked Interesting Items. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3653983. 26 Mar 2024.
• (2022) A graph-based feature selection method for learning to rank using spectral clustering for redundancy minimization and biased PageRank for relevance analysis. Computer Science and Information Systems, 19(1), 141-164. DOI: 10.2298/CSIS201220042Y.
• (2021) Combining semi-supervised and active learning to rank algorithms: application to Document Retrieval. Information Retrieval Journal. DOI: 10.1007/s10791-021-09396-2. 4 Oct 2021.
• (2020) Exploring Evolutionary Fitness in Biological Systems Using Machine Learning Methods. Entropy, 23(1), 35. DOI: 10.3390/e23010035. 29 Dec 2020.
• (2020) A Ranking Learning Training Method Based on Singular Value Decomposition. Communications, Signal Processing, and Systems, 1218-1221. DOI: 10.1007/978-981-13-9409-6_144. 4 Apr 2020.
• (2019) Clustering-Based Transductive Semi-Supervised Learning for Learning-to-Rank. International Journal of Pattern Recognition and Artificial Intelligence, 33(12), 1951007. DOI: 10.1142/S0218001419510078. 26 Nov 2019.
• (2019) Semi-supervised Learning to Rank with Uncertain Data. Web Information Systems and Applications, 28-39. DOI: 10.1007/978-3-030-30952-7_4. 16 Sep 2019.
• (2018) Efficient Reformulation of 1-Norm Ranking SVM. IEICE Transactions on Information and Systems, E101.D(3), 719-729. DOI: 10.1587/transinf.2017EDP7233.
• (2018) Learning to Rank with Deep Autoencoder Features. 2018 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN.2018.8489646.
• (2018) Flexible ranking extreme learning machine based on matrix-centering transformation. 2018 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN.2018.8489418.
