[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3269206.3274274acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
tutorial

Unbiased Learning to Rank: Theory and Practice

Published: 17 October 2018 Publication History

Abstract

Implicit feedback (e.g., user clicks) is an important source of data for modern search engines. While heavily biased [8, 9, 11, 27], it is cheap to collect and particularly useful for user-centric retrieval applications such as search ranking. To develop an unbiased learning-to-rank system with biased feedback, previous studies have focused on constructing probabilistic graphical models (e.g., click models) with user behavior hypothesis to extract and train ranking systems with unbiased relevance signals. Recently, a novel counterfactual learning framework that estimates and adopts examination propensity for unbiased learning to rank has attracted much attention. Despite its popularity, there is no systematic comparison of the unbiased learning-to-rank frameworks based on counterfactual learning and graphical models. In this tutorial, we aim to provide an overview of the fundamental mechanism for unbiased learning to rank. We will describe the theory behind existing frameworks, and give detailed instructions on how to conduct unbiased learning to rank in practice.

References

[1]
Qingyao Ai, Liu Yang, Jiafeng Guo, and W. Bruce Croft. 2016. Analysis of the paragraph vector model for information retrieval. In Proceedings of the 2rd ACM ICTIR. ACM, 133--142.
[2]
Olivier Chapelle, Thorsten Joachims, Filip Radlinski, and Yisong Yue. 2012. Large-scale validation and analysis of interleaved search evaluation. ACM Transactions on Information Systems, Vol. 30, 1 (2012), 6.
[3]
Olivier Chapelle and Ya Zhang. 2009. A dynamic bayesian network click model for web search ranking. In Proceedings of the 18th WWW. ACM, 1--10.
[4]
Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, and W. Bruce Croft. 2017. Neural Ranking Models with Weak Supervision. In Proceedings of the 40th ACM SIGIR (SIGIR '17). ACM, 65--74.
[5]
Anhai Doan, Raghu Ramakrishnan, and Alon Y. Halevy. 2011. Crowdsourcing systems on the world-wide web. Commun. ACM, Vol. 54, 4 (2011), 86--96.
[6]
Georges E. Dupret and Benjamin Piwowarski. 2008. A user browsing model to predict search engine click data from past observations. In Proceedings of the 31st ACM SIGIR. ACM, 331--338.
[7]
Jiafeng Guo, Yixing Fan, Qingyao Ai, and W. Bruce Croft. 2016. A deep relevance matching model for ad-hoc retrieval. In Proceedings of the 25th ACM CIKM. ACM, 55--64.
[8]
Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, and Geri Gay. 2005. Accurately interpreting clickthrough data as implicit feedback. In Proceedings of the 28th annual ACM SIGIR. Acm, 154--161.
[9]
Thorsten Joachims, Laura Granka, Bing Pan, Helene Hembrooke, Filip Radlinski, and Geri Gay. 2007. Evaluating the accuracy of implicit feedback from clicks and query reformulations in web search. ACM Transactions on Information Systems (TOIS), Vol. 25, 2 (2007), 7.
[10]
Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. 2017. Unbiased learning-to-rank with biased feedback. In Proceedings of the 10th ACM WSDM. ACM, 781--789.
[11]
Mark T. Keane and Maeve O'Brien. 2006. Modeling Result-List Searching in the World Wide Web: The Role of Relevance Topologies and Trust Bias. In Proceedings of the Cognitive Science Society, Vol. 28.
[12]
Aniket Kittur, Ed H. Chi, and Bongwon Suh. 2008. Crowdsourcing user studies with Mechanical Turk. In Proceedings of the SIGCHI. ACM, 453--456.
[13]
Tie-Yan Liu. 2009. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, Vol. 3, 3 (2009), 225--331.
[14]
Cheng Luo, Yukun Zheng, Jiaxin Mao, Yiqun Liu, Min Zhang, and Shaoping Ma. 2017. Training deep ranking model with weak relevance labels. In Australasian Database Conference. Springer, 205--216.
[15]
Bhaskar Mitra, Fernando Diaz, and Nick Craswell. 2017. Learning to Match Using Local and Distributed Representations of Text for Web Search. In Proceedings of the 26th WWW (WWW '17). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1291--1299.
[16]
Karthik Raman and Thorsten Joachims. 2013. Learning socially optimal information systems from egoistic users. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 128--144.
[17]
Paul R. Rosenbaum and Donald B. Rubin. 1983. The central role of the propensity score in observational studies for causal effects. Biometrika, Vol. 70, 1 (1983), 41--55.
[18]
Anne Schuth, Harrie Oosterhuis, Shimon Whiteson, and Maarten de Rijke. 2016. Multileave gradient descent for fast online learning to rank. In Proceedings of the 9th ACM WSDM. ACM, 457--466.
[19]
Adith Swaminathan and Thorsten Joachims. 2015. Batch learning from logged bandit feedback through counterfactual risk minimization. Journal of Machine Learning Research, Vol. 16 (2015), 1731--1755.
[20]
Adith Swaminathan and Thorsten Joachims. 2015. Counterfactual risk minimization: Learning from logged bandit feedback. In ICML. 814--823.
[21]
Chao Wang, Yiqun Liu, Meng Wang, Ke Zhou, Jian-yun Nie, and Shaoping Ma. 2015. Incorporating non-sequential behavior into click models. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 283--292.
[22]
Hongning Wang, ChengXiang Zhai, Anlei Dong, and Yi Chang. 2013. Content-aware click modeling. In Proceedings of the 22nd international conference on World Wide Web. ACM, 1365--1376.
[23]
Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. 2016. Learning to rank with selection bias in personal search. In Proceedings of the 39th ACM SIGIR. ACM, 115--124.
[24]
Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. 2018. Position Bias Estimation for Unbiased Learning to Rank in Personal Search. In Proceedings of the 11th ACM WSDM (WSDM '18). ACM, New York, NY, USA, 610--618.
[25]
Wanhong Xu, Eren Manavoglu, and Erick Cantu-Paz. 2010. Temporal click model for sponsored search. In Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. ACM, 106--113.
[26]
Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In Proceedings of the 26th ICML. ACM, 1201--1208.
[27]
Yisong Yue, Rajan Patel, and Hein Roehrig. 2010. Beyond position bias: Examining result attractiveness as a source of presentation bias in clickthrough data. In Proceedings of the 19th WWW. ACM, 1011--1018.

Cited By

View all
  • (2024)CWRCzech: 100M Query-Document Czech Click Dataset and Its Application to Web Relevance RankingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657851(1221-1231)Online publication date: 10-Jul-2024
  • (2024)Whole Page Unbiased Learning to RankProceedings of the ACM Web Conference 202410.1145/3589334.3645474(1431-1440)Online publication date: 13-May-2024
  • (2024) LT 2 R: Learning to Online Learning to Rank for Web Search 2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00360(4733-4746)Online publication date: 13-May-2024
  • Show More Cited By

Index Terms

  1. Unbiased Learning to Rank: Theory and Practice

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
    October 2018
    2362 pages
    ISBN:9781450360142
    DOI:10.1145/3269206
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 October 2018

    Check for updates

    Author Tags

    1. click model
    2. counterfactual learning
    3. unbiased learning to rank
    4. user bias

    Qualifiers

    • Tutorial

    Conference

    CIKM '18
    Sponsor:

    Acceptance Rates

    CIKM '18 Paper Acceptance Rate 147 of 826 submissions, 18%;
    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Upcoming Conference

    CIKM '25

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)34
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 11 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)CWRCzech: 100M Query-Document Czech Click Dataset and Its Application to Web Relevance RankingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657851(1221-1231)Online publication date: 10-Jul-2024
    • (2024)Whole Page Unbiased Learning to RankProceedings of the ACM Web Conference 202410.1145/3589334.3645474(1431-1440)Online publication date: 13-May-2024
    • (2024) LT 2 R: Learning to Online Learning to Rank for Web Search 2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00360(4733-4746)Online publication date: 13-May-2024
    • (2023)Learning To Rank Resources with GNNProceedings of the ACM Web Conference 202310.1145/3543507.3583360(3247-3256)Online publication date: 30-Apr-2023
    • (2022)LBDProceedings of the 36th International Conference on Neural Information Processing Systems10.5555/3600270.3602690(33400-33413)Online publication date: 28-Nov-2022
    • (2022)Scalar is Not EnoughProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3534678.3539468(136-145)Online publication date: 14-Aug-2022
    • (2021)ULTRA: An Unbiased Learning To Rank Algorithm ToolboxProceedings of the 30th ACM International Conference on Information & Knowledge Management10.1145/3459637.3482020(4613-4622)Online publication date: 26-Oct-2021
    • (2021)Adapting Interactional Observation Embedding for Counterfactual Learning to RankProceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3404835.3462901(285-294)Online publication date: 11-Jul-2021
    • (2021)Beyond Probability Ranking Principle: Modeling the Dependencies among DocumentsProceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3404835.3462808(2647-2650)Online publication date: 11-Jul-2021
    • (2019)To Model or to InterveneProceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3331184.3331269(15-24)Online publication date: 18-Jul-2019
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media