
Leveraging Auxiliary Data for Learning to Rank

Published: 01 February 2012

Abstract

In learning to rank, both the quality and the quantity of the training data have a significant impact on the performance of the learned ranking functions. In many applications, however, there is not enough labeled training data to construct an accurate ranking model. It is therefore desirable to leverage existing training data from other tasks when learning the ranking function for a particular task, a problem we tackle in this article using a boosting framework with transfer learning. In particular, we propose to adaptively learn transferable representations, called super-features, from the training data of both the target task and the auxiliary task. These super-features and the coefficients for combining them are learned in an iterative, stage-wise fashion. Unlike in previous transfer learning methods, the super-features can be adaptively learned by weak learners from the data, so the proposed framework is flexible enough to deal with complicated common structures among different learning tasks. We evaluate the proposed transfer learning method on two datasets from the LETOR collection and one dataset collected from a commercial search engine, and we compare it with several existing transfer learning methods. Our results demonstrate that the proposed method can enhance the ranking functions of the target tasks by utilizing the training data from the auxiliary tasks.
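The stage-wise procedure sketched in the abstract can be illustrated with a toy boosting loop. The code below is a deliberately simplified, pointwise stand-in (squared-error boosting with one-feature decision stumps), not the paper's actual algorithm or loss: a weak learner fitted on the pooled residuals of both the target and the auxiliary task plays the role of a "super-feature", and its combining coefficient is then chosen by a line search on the target task alone. All function names, parameters, and data here are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-feature threshold stump minimizing squared error on residuals."""
    best = None  # (error, feature, threshold, left_value, right_value)
    for j in range(len(xs[0])):
        for t in sorted({x[j] for x in xs}):
            left = [r for x, r in zip(xs, residuals) if x[j] <= t]
            right = [r for x, r in zip(xs, residuals) if x[j] > t]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - (lv if x[j] <= t else rv)) ** 2
                      for x, r in zip(xs, residuals))
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    _, j, t, lv, rv = best
    return lambda x: lv if x[j] <= t else rv

def boost_with_auxiliary(x_tgt, y_tgt, x_aux, y_aux, n_rounds=10, shrink=0.5):
    """Stage-wise boosting: stumps fitted on pooled target + auxiliary data,
    combining coefficients fitted on the target task only."""
    ensemble = []  # (coefficient, stump) pairs
    pred_tgt = [0.0] * len(x_tgt)
    pred_aux = [0.0] * len(x_aux)
    for _ in range(n_rounds):
        # "Super-feature": a weak learner fitted on residuals from both tasks.
        pooled_x = x_tgt + x_aux
        pooled_r = ([y - p for y, p in zip(y_tgt, pred_tgt)]
                    + [y - p for y, p in zip(y_aux, pred_aux)])
        stump = fit_stump(pooled_x, pooled_r)
        # Coefficient chosen by a least-squares line search on the target task.
        h_tgt = [stump(x) for x in x_tgt]
        num = sum(h * (y - p) for h, y, p in zip(h_tgt, y_tgt, pred_tgt))
        den = sum(h * h for h in h_tgt) or 1.0
        coef = shrink * num / den
        ensemble.append((coef, stump))
        pred_tgt = [p + coef * stump(x) for p, x in zip(pred_tgt, x_tgt)]
        pred_aux = [p + coef * stump(x) for p, x in zip(pred_aux, x_aux)]
    return lambda x: sum(c * s(x) for c, s in ensemble)

# Toy data: score target-task documents with the learned ranking function.
x_tgt = [[0.1], [0.4], [0.9]]; y_tgt = [0.0, 1.0, 2.0]
x_aux = [[0.2], [0.5], [0.8]]; y_aux = [0.0, 1.0, 2.0]
score = boost_with_auxiliary(x_tgt, y_tgt, x_aux, y_aux)
scores = [score(x) for x in x_tgt]
```

Fitting the weak learner on the pooled data while calibrating its weight on the target task alone is one simple way to let scarce target-task labels steer an ensemble built with auxiliary help; the article's actual method learns the super-features and coefficients jointly within its boosting framework.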


Cited By

  • (2017) Robust Learning to Rank Based on Portfolio Theory and AMOSA Algorithm. IEEE Transactions on Systems, Man, and Cybernetics: Systems 47(6), 1007-1018. DOI: 10.1109/TSMC.2016.2584786. Online publication date: June 2017.



Published In

ACM Transactions on Intelligent Systems and Technology, Volume 3, Issue 2
February 2012
455 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/2089094
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 February 2012
Accepted: 01 September 2011
Revised: 01 July 2011
Received: 01 November 2010
Published in TIST Volume 3, Issue 2


Author Tags

  1. relevance
  2. convergence analysis
  3. experimental evaluation
  4. learning to rank
  5. search engine
  6. super-features
  7. transfer learning

Qualifiers

  • Research-article
  • Research
  • Refereed
