DOI: 10.1145/3523227.3547370

Tutorial

Hands-on Reinforcement Learning for Recommender Systems - From Bandits to SlateQ to Offline RL with Ray RLlib

Published: 13 September 2022

Abstract

Reinforcement learning (RL) is gaining traction as a complementary approach to supervised learning for recommender systems (RecSys), thanks to its ability to optimize sequential decision-making under delayed rewards. Recent advances in offline reinforcement learning, off-policy evaluation, and more scalable, performant system designs that parallelize code execution have made RL more tractable for real-time RecSys use cases. This tutorial introduces RLlib [9], a comprehensive open-source Python RL framework built for production workloads. RLlib is built on top of open-source Ray [8], an easy-to-use distributed computing framework for Python that can handle complex, heterogeneous applications. Ray and RLlib run on compute clusters on any cloud without vendor lock-in. Working through Colab notebooks, you will leave this tutorial with a complete, working example of parallelized Python RL code using RLlib for RecSys, hosted in a GitHub repository.
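
As a preview of the hands-on material, the sketch below shows RLlib's basic configure-build-train loop, written against the Ray 2.x API (config method names have shifted between RLlib releases, so treat this as illustrative rather than definitive). CartPole-v1 is used as a stand-in environment; the tutorial itself works with RecSim-style recommender environments [4] and RecSys-oriented algorithms such as bandits and SlateQ.

import ray
from ray.rllib.algorithms.dqn import DQNConfig

ray.init()  # start Ray locally; on a cluster, ray.init(address="auto") attaches instead

# Configure the algorithm. CartPole-v1 is a placeholder here; the tutorial
# swaps in RecSim-based recommender environments and algorithms like SlateQ.
config = (
    DQNConfig()
    .environment("CartPole-v1")
    .rollouts(num_rollout_workers=2)  # sample episodes in parallel Ray workers
    .framework("torch")
)

algo = config.build()
for i in range(5):
    result = algo.train()  # one iteration of parallel sampling plus learning
    print(f"iter {i}: episode_reward_mean = {result['episode_reward_mean']:.1f}")

algo.stop()
ray.shutdown()

The same pattern scales from a laptop to a multi-node cluster: scaling out sampling is largely a matter of raising num_rollout_workers and pointing ray.init at a cluster, which is what makes this style of RL practical for large RecSys workloads.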

References

[1]
Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, and Ed H. Chi. 2019. Top-K Off-Policy Correction for a REINFORCE Recommender System. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (Melbourne VIC, Australia) (WSDM ’19). Association for Computing Machinery, New York, NY, USA, 456–464. https://doi.org/10.1145/3289600.3290999
[2]
Kourosh Hakhamaneshi, Ruihan Zhao, Albert Zhan, Pieter Abbeel, and Michael Laskin. 2021. Hierarchical Few-Shot Imitation with Skill Transition Models. arXiv preprint arXiv:2107.08981 (2021).
[3]
Xu He, Bo An, Yanghua Li, Haikai Chen, Rundong Wang, Xinrun Wang, Runsheng Yu, Xin Li, and Zhirong Wang. 2020. Learning to Collaborate in Multi-Module Recommendation via Multi-Agent Reinforcement Learning without Communication. In Fourteenth ACM Conference on Recommender Systems. Association for Computing Machinery, New York, NY, USA, 210–219.
[4]
Eugene Ie, Chih-wei Hsu, Martin Mladenov, Vihan Jain, Sanmit Narvekar, Jing Wang, Rui Wu, and Craig Boutilier. 2019. RecSim: A Configurable Simulation Platform for Recommender Systems. arXiv preprint arXiv:1909.04847 (2019).
[5]
Sergey Levine, Aviral Kumar, George Tucker, and Justin Fu. 2020. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems. arXiv preprint arXiv:2005.01643 (2020).
[6]
Eric Liang, Zhanghao Wu, Michael Luo, Sven Mika, and Ion Stoica. 2020. Distributed Reinforcement Learning is a Dataflow Problem. arXiv preprint arXiv:2011.12719 (2020).
[7]
Francois Mairesse, Zhonghao Luo, and Tao Ye. 2021. Learning a Voice-based Conversational Recommender using Offline Policy Optimization. In Fifteenth ACM Conference on Recommender Systems. Association for Computing Machinery, New York, NY, USA, 562–564.
[8]
Ray. 2022. Ray provides a simple, universal API for building distributed applications. ray.io. Retrieved July 12, 2022 from https://github.com/ray-project/ray
[9]
RLlib. 2022. RLlib: Industry-Grade Reinforcement Learning. ray.io. Retrieved July 12, 2022 from https://github.com/ray-project/ray/tree/master/rllib
[10]
Michael Schaarschmidt, Sven Mika, Kai Fricke, and Eiko Yoneki. 2019. RLgraph: Modular Computation Graphs for Deep Reinforcement Learning. Proceedings of Machine Learning and Systems 1 (2019), 65–80.
[11]
Wildlife Studios. 2021. Using Reinforcement Learning to Optimize IAP Offer Recommendations in Mobile Games. wildlifestudios.com. Retrieved July 12, 2022 from https://www.youtube.com/watch?v=cGQk8rIoc1Y
[12]
Qing Wang, Jiechao Xiong, Lei Han, Han Liu, Tong Zhang, et al. 2018. Exponentially Weighted Imitation Learning for Batched Historical Data. Advances in Neural Information Processing Systems 31 (2018), 6288–6297.
[13]
Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott E. Reed, Bobak Shahriari, Noah Y. Siegel, Josh Merel, Çağlar Gülçehre, Nicolas Heess, and Nando de Freitas. 2020. Critic Regularized Regression. arXiv preprint arXiv:2006.15134 (2020). https://arxiv.org/abs/2006.15134



Published In

RecSys '22: Proceedings of the 16th ACM Conference on Recommender Systems
September 2022
743 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. RLlib
  2. Recommender systems
  3. Reinforcement learning

Qualifiers

  • Tutorial
  • Research
  • Refereed limited

Acceptance Rates

Overall acceptance rate: 254 of 1,295 submissions (20%)
