Unbiased Top-$k$ Learning to Rank with Causal Likelihood Decomposition

Authors:

Ji-Rong Wen

SIGIR-AP '23: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region

Pages 129 - 138

https://doi.org/10.1145/3624918.3625340

Published: 26 November 2023

Abstract

Unbiased learning to rank methods have been proposed to address biases in search ranking. Two of these biases, position bias and sample selection bias, often occur simultaneously in real applications. Existing approaches either tackle them separately or treat them as identical, so neither bias is fully eliminated. This paper employs a causal graph approach to investigate the mechanisms of position bias and sample selection bias and the interplay between them. The analysis reveals that position bias is a common confounder bias, while sample selection bias falls under the category of collider bias. Together, these biases introduce a cascading process that leads to biased clicks. Based on our analysis, we propose Causal Likelihood Decomposition (CLD), a unified method that effectively mitigates both biases in top-$k$ learning to rank. CLD removes position bias by leveraging propensity scores and then decomposes the likelihood of selection-biased data into a sample selection bias term and a relevance term. By maximizing the overall log-likelihood function, we obtain an unbiased ranking model from the relevance term. We also extend CLD to pairwise neural ranking. Extensive experiments demonstrate that CLD and its pairwise neural extension outperform baseline methods by effectively mitigating both position bias and sample selection bias. The robustness of CLD is further validated through empirical studies considering variations in bias severity and click noise.
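
The two biases named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's CLD method; it is a minimal toy example under assumptions that do not come from the source: a position-based click model in which the click probability is the examination propensity times relevance, a propensity of the form (1/rank)^eta, a hard top-k cutoff, and hypothetical item names and relevance values. It shows that dividing the observed click rate by the propensity (standard inverse propensity weighting) roughly recovers relevance for displayed items, while items below the cutoff yield no click data at all, which is the sample selection problem that propensity weighting alone cannot address.

```python
import random


def examination_propensity(rank, eta=1.0):
    """Assumed position-based examination probability: (1 / rank) ** eta."""
    return (1.0 / rank) ** eta


def simulate_clicks(relevance, rank, k, n_sessions, rng, eta=1.0):
    """Simulate click indicators for one item shown at a fixed rank.

    Returns None when the item falls below the top-k cutoff: it is never
    displayed, so no click data exists for it (sample selection).
    """
    if rank > k:
        return None
    p_click = examination_propensity(rank, eta) * relevance
    return [1 if rng.random() < p_click else 0 for _ in range(n_sessions)]


def naive_estimate(clicks):
    """Raw click-through rate, which is distorted by position bias."""
    return sum(clicks) / len(clicks)


def ipw_estimate(clicks, rank, eta=1.0):
    """Inverse-propensity-weighted click rate, an estimate of relevance."""
    return naive_estimate(clicks) / examination_propensity(rank, eta)


rng = random.Random(0)
K = 5  # hypothetical top-k cutoff
# (name, true relevance, displayed rank): illustrative values only
items = [("a", 0.9, 1), ("b", 0.8, 3), ("c", 0.5, 5), ("d", 0.7, 8)]

results = {}
for name, rel, rank in items:
    clicks = simulate_clicks(rel, rank, K, 20000, rng)
    if clicks is None:
        results[name] = None  # below the cutoff: nothing to reweight
    else:
        results[name] = (naive_estimate(clicks), ipw_estimate(clicks, rank))

for name, res in results.items():
    print(name, res)
```

In this toy setting, item "d" has relevance 0.7 but never appears in the click log, so no reweighting of observed clicks can estimate it. The abstract describes CLD as handling this case through a separate sample selection bias term in the likelihood.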

Index Terms

Unbiased Top-$k$ Learning to Rank with Causal Likelihood Decomposition
1. Computing methodologies
  1. Machine learning
    1. Learning settings
      1. Learning from implicit feedback
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Learning to rank

Information & Contributors

Information

Published In

SIGIR-AP '23: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region

November 2023

324 pages

ISBN:9798400704086

DOI:10.1145/3624918

Editors:
Qingyao Ai, Tsinghua University, China
Yiqin Liu, Tsinghua University, China
Alistair Moffat, The University of Melbourne, Australia
Xuanjing Huang, Fudan University, China
Tetsuya Sakai, Waseda University, Japan
Justin Zobel, The University of Melbourne, Australia

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Best Paper

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Beijing Outstanding Young Scientist Program

Conference

SIGIR-AP '23

Sponsor:

SIGIR

SIGIR-AP '23: Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region

November 26 - 28, 2023

Beijing, China

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 0
Total Downloads: 121
Downloads (Last 12 months): 76
Downloads (Last 6 weeks): 2

Reflects downloads up to 14 Jan 2025
