
Reinforcement Learning based Recommender Systems: A Survey

Published: 15 December 2022

Abstract

Recommender systems (RSs) have become an inseparable part of our everyday lives. They help us find our favorite items to purchase, our friends on social networks, and our favorite movies to watch. Traditionally, the recommendation problem was considered a classification or prediction problem, but it is now widely agreed that formulating it as a sequential decision problem better reflects the user-system interaction. It can therefore be formulated as a Markov decision process (MDP) and solved with reinforcement learning (RL) algorithms. Unlike traditional recommendation methods such as collaborative filtering and content-based filtering, RL can handle the sequential, dynamic user-system interaction and take long-term user engagement into account. Although the idea of using RL for recommendation is not new and has been around for about two decades, it was long impractical, mainly because of the scalability problems of traditional RL algorithms. A new trend has emerged in the field, however, since the introduction of deep reinforcement learning (DRL), which made it possible to apply RL to the recommendation problem with large state and action spaces. In this paper, we present a survey of reinforcement learning based recommender systems (RLRSs). Our aim is to offer an outlook on the field and to provide the reader with a fairly complete knowledge of its key concepts. We first recognize and illustrate that RLRSs can be generally classified into RL- and DRL-based methods. Then, we propose an RLRS framework with four components, i.e., state representation, policy optimization, reward formulation, and environment building, and survey RLRS algorithms accordingly. We highlight emerging topics and depict important trends using various graphs and tables. Finally, we discuss important aspects and challenges that can be addressed in the future.
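To make the MDP framing concrete, the sketch below (our illustration, not code from the survey) casts a toy session-based recommender as an MDP and trains it with tabular Q-learning. The mapping to the four framework components is: the truncated interaction history is the state representation, the simulated click is the reward formulation, the toy user model stands in for environment building, and the Q-learning loop performs policy optimization. All names, parameters, and the click model are hypothetical assumptions chosen for illustration.

    import random

    class ToyRecEnv:
        """Hypothetical recommendation MDP: state = truncated interaction
        history, action = item id, reward = simulated click (1.0 or 0.0)."""

        def __init__(self, n_items=50, session_len=5, seed=0):
            self.n_items = n_items
            self.session_len = session_len
            self.rng = random.Random(seed)
            self.preferred = self.rng.randrange(n_items)  # hidden user taste
            self.history = ()
            self.t = 0

        def reset(self):
            self.history, self.t = (), 0
            return self.history

        def step(self, action):
            # Toy user model: click probability decays with the distance
            # between the recommended item and the hidden preference.
            p_click = max(0.0, 1.0 - 4.0 * abs(action - self.preferred) / self.n_items)
            reward = 1.0 if self.rng.random() < p_click else 0.0
            self.history = (self.history + (action,))[-3:]  # state representation
            self.t += 1
            return self.history, reward, self.t >= self.session_len

    # Policy optimization: tabular Q-learning. A DRL agent (e.g., DQN) would
    # replace the dict with a neural Q-function over learned state embeddings.
    env = ToyRecEnv()
    Q, alpha, gamma, eps = {}, 0.1, 0.9, 0.2
    for _ in range(3000):
        s, done = env.reset(), False
        while not done:
            if env.rng.random() < eps:                      # explore
                a = env.rng.randrange(env.n_items)
            else:                                           # exploit
                a = max(range(env.n_items), key=lambda i: Q.get((s, i), 0.0))
            s2, r, done = env.step(a)
            target = r + gamma * max(Q.get((s2, i), 0.0) for i in range(env.n_items))
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
            s = s2

The trained table induces a greedy recommendation policy, argmax_a Q(s, a). The same loop structure underlies the DRL-based methods surveyed here, with the tabular pieces replaced by neural networks and the hand-built simulator by logged interactions or a learned user model.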




Published In

ACM Computing Surveys, Volume 55, Issue 7 (July 2023), 813 pages.
ISSN: 0360-0300; EISSN: 1557-7341. DOI: 10.1145/3567472.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Received: 03 December 2020
Revised: 29 May 2022
Accepted: 03 June 2022
Online AM: 15 June 2022
Published: 15 December 2022


    Author Tags

1. Recommender systems
2. Reinforcement learning

