DOI: 10.1145/1390334.1390379
Research article

Learning to rank with partially-labeled data

Published: 20 July 2008

Abstract

Ranking algorithms, whose goal is to appropriately order a set of objects/documents, are an important component of information retrieval systems. Previous work on ranking algorithms has focused on cases where only labeled data is available for training (i.e., supervised learning). In this paper, we consider the question of whether unlabeled (test) data can be exploited to improve ranking performance. We present a framework for transductive learning of ranking functions and show that the answer is affirmative. Our framework generates better features from the test data (via Kernel PCA) and incorporates them via boosting, thus learning a different ranking function adapted to each individual test query. We evaluate this method on the LETOR dataset (TREC, OHSUMED) and demonstrate significant improvements.
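The recipe described above (extract Kernel PCA features from the unlabeled documents of each test query, append them to the original features, and train a boosted ranker on the augmented labeled data) can be sketched roughly as follows. This is an illustrative approximation, not the authors' exact algorithm: scikit-learn's KernelPCA stands in for the Kernel PCA step, a pointwise GradientBoostingRegressor stands in for RankBoost-style boosting, and `transductive_rank` and all parameter values are hypothetical.

```python
# Illustrative sketch of the transductive recipe in the abstract.
# Assumptions (not from the paper): scikit-learn's KernelPCA stands in
# for the Kernel PCA step, and a pointwise GradientBoostingRegressor
# stands in for the boosting-based ranker.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.ensemble import GradientBoostingRegressor

def transductive_rank(train_X, train_y, test_X, n_components=2):
    """Rank the documents of one test query using its own unlabeled features."""
    # 1. Learn a nonlinear feature map from the *unlabeled* test documents.
    kpca = KernelPCA(n_components=n_components, kernel="rbf").fit(test_X)
    # 2. Project both training and test documents into that space and
    #    append the new components to the original feature vectors.
    aug_train = np.hstack([train_X, kpca.transform(train_X)])
    aug_test = np.hstack([test_X, kpca.transform(test_X)])
    # 3. Fit a boosted model on the augmented labeled data and score the
    #    test documents; because step 1 is redone per query, each test
    #    query effectively gets its own adapted ranking function.
    model = GradientBoostingRegressor(n_estimators=50, random_state=0)
    model.fit(aug_train, train_y)
    scores = model.predict(aug_test)
    return np.argsort(-scores)  # document indices, best first
```

Repeating the whole procedure per test query is what makes the method transductive: the feature map depends on the very documents being ranked.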





Published In

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
July 2008, 934 pages
ISBN: 9781605581644
DOI: 10.1145/1390334
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. boosting
      2. information retrieval
      3. kernel principal components analysis
      4. learning to rank
      5. transductive learning

      Qualifiers

      • Research-article

      Conference

      SIGIR '08

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%


Cited By

• (2024) Mitigating the Impact of Inaccurate Feedback in Dynamic Learning-to-Rank: A Study of Overlooked Interesting Items. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3653983. 26 Mar 2024.
• (2022) A graph-based feature selection method for learning to rank using spectral clustering for redundancy minimization and biased PageRank for relevance analysis. Computer Science and Information Systems, 19(1), 141-164. DOI: 10.2298/CSIS201220042Y.
• (2021) Combining semi-supervised and active learning to rank algorithms: application to Document Retrieval. Information Retrieval Journal. DOI: 10.1007/s10791-021-09396-2. 4 Oct 2021.
• (2020) Exploring Evolutionary Fitness in Biological Systems Using Machine Learning Methods. Entropy, 23(1), 35. DOI: 10.3390/e23010035. 29 Dec 2020.
• (2020) A Ranking Learning Training Method Based on Singular Value Decomposition. Communications, Signal Processing, and Systems, 1218-1221. DOI: 10.1007/978-981-13-9409-6_144. 4 Apr 2020.
• (2019) Clustering-Based Transductive Semi-Supervised Learning for Learning-to-Rank. International Journal of Pattern Recognition and Artificial Intelligence, 33(12), 1951007. DOI: 10.1142/S0218001419510078. 26 Nov 2019.
• (2019) Semi-supervised Learning to Rank with Uncertain Data. Web Information Systems and Applications, 28-39. DOI: 10.1007/978-3-030-30952-7_4. 16 Sep 2019.
• (2018) Efficient Reformulation of 1-Norm Ranking SVM. IEICE Transactions on Information and Systems, E101.D(3), 719-729. DOI: 10.1587/transinf.2017EDP7233.
• (2018) Learning to Rank with Deep Autoencoder Features. 2018 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN.2018.8489646.
• (2018) Flexible ranking extreme learning machine based on matrix-centering transformation. 2018 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN.2018.8489418.
