DOI: 10.1145/3511808.3557611
short-paper

Hybrid Transfer in Deep Reinforcement Learning for Ads Allocation

Published: 17 October 2022

Abstract

Ads allocation, which assigns ads and organic items to the limited slots of a feed in order to maximize platform revenue, has become a research hotspot. Platforms (e.g., e-commerce, video, and food delivery platforms) usually have multiple entrances for different categories, and some of these entrances receive few visits. Data from such entrances has low coverage, which makes it difficult for the agent to learn. To address this challenge, we propose Similarity-based Hybrid Transfer for Ads Allocation (SHTAA), which transfers both samples and knowledge from a data-rich entrance to a data-poor entrance. Specifically, we define an uncertainty-aware similarity between Markov decision processes (MDPs) to estimate how similar the MDPs of different entrances are. Based on this similarity, we design a hybrid transfer method, comprising instance transfer and strategy transfer, to efficiently transfer samples and knowledge from one entrance to another. Both offline and online experiments on the Meituan food delivery platform demonstrate that the proposed method achieves better performance on the data-poor entrance and increases platform revenue.
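The abstract does not spell out how the similarity is computed or how it drives the transfer, so the following is only a minimal sketch under stated assumptions, not the SHTAA method itself. It assumes, hypothetically, that two entrances are compared through histograms of their observed rewards, that the Jensen-Shannon divergence between those histograms is mapped to a similarity in [0, 1], and that this similarity caps the share of source-entrance samples mixed into each training batch for the data-poor entrance. It omits the uncertainty-aware part and the strategy-transfer part entirely. The names `histogram`, `similarity`, and `mix_batch` are illustrative.

```python
import math
import random


def histogram(samples, bins, lo, hi):
    """Normalized histogram with add-one smoothing (assumed binning scheme)."""
    counts = [1.0] * bins
    width = (hi - lo) / bins
    for x in samples:
        idx = min(bins - 1, max(0, int((x - lo) / width)))
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; zero-mass bins of p are skipped."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence, bounded by ln 2 when natural logs are used."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def similarity(p, q):
    """Hypothetical similarity: 1 for identical distributions, 0 for disjoint ones."""
    sim = 1.0 - js_divergence(p, q) / math.log(2)
    return max(0.0, min(1.0, sim))


def mix_batch(target, source, sim, batch_size, rng):
    """Instance-transfer sketch: at most half the batch comes from the source
    entrance, scaled down by the similarity (an assumed mixing rule)."""
    n_source = min(len(source), int(sim * batch_size / 2))
    n_target = batch_size - n_source
    batch = [rng.choice(target) for _ in range(n_target)]
    batch += rng.sample(source, n_source)
    return batch, n_source


if __name__ == "__main__":
    rng = random.Random(0)
    rich = [rng.random() for _ in range(1000)]        # data-rich entrance rewards
    poor = [rng.random() ** 2 for _ in range(50)]     # data-poor entrance rewards
    sim = similarity(histogram(rich, 10, 0.0, 1.0), histogram(poor, 10, 0.0, 1.0))
    batch, n_source = mix_batch(poor, rich, sim, 32, rng)
    print(f"similarity={sim:.3f}, source samples in batch={n_source}")
```

The cap of half a batch is an arbitrary choice for illustration; the point is only that a lower similarity admits fewer transferred samples, which limits the risk of negative transfer from a dissimilar entrance.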

Supplementary Material

MP4 File (CIKM22-sp0599.mp4)
Presentation video

Cited By

  • (2024) Ads Supply Personalization via Doubly Robust Learning. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 4874-4881. DOI: 10.1145/3627673.3680035. Online publication date: 21-Oct-2024
  • (2024) Deep Automated Mechanism Design for Integrating Ad Auction and Allocation in Feed. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1211-1220. DOI: 10.1145/3626772.3657774. Online publication date: 10-Jul-2024
  • (2023) MDDL: A Framework for Reinforcement Learning-based Position Allocation in Multi-Channel Feed. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2159-2163. DOI: 10.1145/3539618.3592018. Online publication date: 19-Jul-2023

Published In

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
October 2022
5274 pages
ISBN:9781450392365
DOI:10.1145/3511808
General Chairs: Mohammad Al Hasan, Li Xiong

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ads allocation
  2. reinforcement learning
  3. transfer learning

Qualifiers

  • Short-paper

Conference

CIKM '22

Acceptance Rates

CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (Last 12 months): 21
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 01 Jan 2025
