DOI: 10.1145/3460231.3478849
extended-abstract

Estimating and Penalizing Preference Shift in Recommender Systems

Published: 13 September 2021

Abstract

Recommender systems trained via long-horizon optimization (e.g., reinforcement learning) will have incentives to actively manipulate user preferences through the recommended content. While some work has argued for making systems myopic to avoid this issue, even such systems can induce systematic undesirable preference shifts. Thus, rather than artificially stifling the capabilities of the system, in this work we explore how we can make capable systems that explicitly avoid undesirable shifts. We advocate for (1) estimating the preference shifts that would be induced by recommender system policies, and (2) explicitly characterizing what unwanted shifts are and assessing before deployment whether such policies will produce them, ideally even actively optimizing to avoid them. These steps involve two challenging ingredients: (1) requires the ability to anticipate how hypothetical policies would influence user preferences if deployed; (2), in turn, requires metrics to assess whether such influences are manipulative or otherwise unwanted. We study how to do (1) from historical user interaction data by building a predictive user model that implicitly contains the users' preference dynamics; to address (2), we introduce the notion of a “safe policy”, which defines a trust region within which behavior is believed to be safe. We show that recommender systems that optimize for staying in the trust region avoid manipulative behaviors (e.g., changing preferences in ways that make users more predictable), while still generating engagement.
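
The two ingredients described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the user model, the safe policy, or the distance metric, so everything below is a hypothetical stand-in. The sketch assumes a one-dimensional preference in [0, 1], hand-written toy dynamics in which preferences drift toward the shown item (the paper instead learns a predictive user model from historical data), a toy engagement function in which users with higher preference values engage more, a "safe" baseline that simply shows the user what they currently prefer, and an absolute-difference penalty weighted by a hypothetical coefficient `lam`.

```python
# Toy sketch: estimate the preference shift a policy induces, then penalize
# departures from the trajectory induced by a "safe" baseline policy.
# All functions, constants, and dynamics here are illustrative assumptions.

def step_preference(pref, item, rate=0.5):
    """Assumed dynamics: the preference drifts toward the recommended item."""
    return pref + rate * (item - pref)

def engagement(pref, item):
    """Assumed engagement: a good match helps, and high-pref users engage more."""
    return (1.0 - abs(pref - item)) * (0.5 + 0.5 * pref)

def rollout(policy, pref0, horizon):
    """Simulate a policy; return the preference trajectory and total engagement."""
    pref, prefs, total = pref0, [], 0.0
    for _ in range(horizon):
        item = policy(pref)
        total += engagement(pref, item)
        pref = step_preference(pref, item)
        prefs.append(pref)
    return prefs, total

def safe_policy(pref):
    """Hypothetical safe baseline: show exactly what the user currently prefers."""
    return pref

def nudging_policy(pref):
    """Hypothetical policy that nudges the user toward higher preference values."""
    return min(1.0, pref + 0.1)

def penalized_return(policy, safe, pref0, horizon, lam):
    """Total engagement minus lam times the summed distance to the safe trajectory."""
    prefs, total = rollout(policy, pref0, horizon)
    safe_prefs, _ = rollout(safe, pref0, horizon)
    shift = sum(abs(p - s) for p, s in zip(prefs, safe_prefs))
    return total - lam * shift

for lam in (0.0, 1.0):
    for name, policy in (("safe", safe_policy), ("nudging", nudging_policy)):
        score = penalized_return(policy, safe_policy, 0.2, 20, lam)
        print(f"lam={lam} {name}: {score:.3f}")
```

In this toy setting the nudging policy collects more raw engagement than the baseline when `lam` is 0, but scores lower once the shift penalty is applied. That is the qualitative behavior the abstract describes for systems optimized to stay within the trust region.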

Supplementary Material

MP4 File (Recsys21 Video.mp4)
User preferences over the content they want to watch (or read, or purchase) are non-stationary. Further, the actions that a recommender system (RS) takes, that is, the content it exposes users to, play a role in changing these preferences. Therefore, when an RS designer chooses which system or policy to deploy, they are implicitly choosing how to shift or influence user preferences. Moreover, if the RS is trained via long-horizon optimization (e.g., reinforcement learning), it will have incentives to manipulate user preferences: to shift them so they are easier to satisfy, and thus conducive to higher reward. While some work has argued for making systems myopic to avoid this issue, the reality is that such systems will still influence preferences, sometimes in an undesired way. In this work, we argue that we need to enable system designers to (1) estimate the shifts an RS would induce, (2) evaluate, before deployment, whether the shifts are undesirable, and even (3) actively optimize to avoid such shifts.


Information & Contributors

Information

Published In

RecSys '21: Proceedings of the 15th ACM Conference on Recommender Systems
September 2021
883 pages
ISBN: 9781450384582
DOI: 10.1145/3460231
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Changing Preferences
  2. Preference Manipulation
  3. Recommender Systems

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Funding Sources

  • ONR YIP

Conference

RecSys '21: Fifteenth ACM Conference on Recommender Systems
September 27 - October 1, 2021
Amsterdam, Netherlands

Acceptance Rates

Overall Acceptance Rate 254 of 1,295 submissions, 20%

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 84
  • Downloads (last 6 weeks): 6
Reflects downloads up to 21 Dec 2024

Cited By

  • (2024) Automated Influence and Value Collapse. American Philosophical Quarterly 61:4 (369-386). DOI: 10.5406/21521123.61.4.06. Online publication date: 1-Oct-2024
  • (2024) AdapTUI: Adaptation of Geometric-Feature-Based Tangible User Interfaces in Augmented Reality. Proceedings of the ACM on Human-Computer Interaction 8:ISS (44-69). DOI: 10.1145/3698127. Online publication date: 24-Oct-2024
  • (2024) Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems. ACM Transactions on Information Systems 42:4 (1-32). DOI: 10.1145/3637869. Online publication date: 9-Feb-2024
  • (2024) Harm Mitigation in Recommender Systems under User Preference Dynamics. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (255-265). DOI: 10.1145/3637528.3671925. Online publication date: 25-Aug-2024
  • (2024) Building Human Values into Recommender Systems: An Interdisciplinary Synthesis. ACM Transactions on Recommender Systems 2:3 (1-57). DOI: 10.1145/3632297. Online publication date: 5-Jun-2024
  • (2024) Rewriting Bias: Mitigating Media Bias in News Recommender Systems through Automated Rewriting. Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization (67-77). DOI: 10.1145/3627043.3659541. Online publication date: 22-Jun-2024
  • (2024) Beyond Preferences in AI Alignment. Philosophical Studies. DOI: 10.1007/s11098-024-02249-w. Online publication date: 9-Nov-2024
  • (2023) The Algorithmic Management of Polarization and Violence on Social Media. SSRN Electronic Journal. DOI: 10.2139/ssrn.4429558. Online publication date: 2023
  • (2023) Reward Reports for Reinforcement Learning. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (84-130). DOI: 10.1145/3600211.3604698. Online publication date: 8-Aug-2023
  • (2023) User Tampering in Reinforcement Learning Recommender Systems. Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (58-69). DOI: 10.1145/3600211.3604669. Online publication date: 8-Aug-2023
