short-paper

Hybrid Transfer in Deep Reinforcement Learning for Ads Allocation

Authors:

Dong Wang

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management

Pages 4560 - 4564

https://doi.org/10.1145/3511808.3557611

Published: 17 October 2022

Abstract

Ads allocation, which assigns ads and organic items to a limited number of slots in a feed in order to maximize platform revenue, has become a research hotspot. Platforms (e.g., e-commerce, video, and food delivery platforms) usually have multiple entrances for different categories, and some entrances receive few visits. Data from these entrances has low coverage, which makes it difficult for the agent to learn. To address this challenge, we propose Similarity-based Hybrid Transfer for Ads Allocation (SHTAA), which transfers both samples and knowledge from a data-rich entrance to a data-poor entrance. Specifically, we define an uncertainty-aware similarity between Markov decision processes (MDPs) to estimate how similar the MDPs of different entrances are. Based on this similarity, we design a hybrid transfer method, comprising instance transfer and strategy transfer, to efficiently transfer samples and knowledge from one entrance to another. Both offline and online experiments on the Meituan food delivery platform demonstrate that the proposed method achieves better performance for the data-poor entrance and increases revenue for the platform.
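
The abstract does not give the formula for the similarity or the transfer rule, so the following Python sketch is only a loose illustration of similarity-weighted instance transfer, not the SHTAA method. It assumes, purely for illustration, that each entrance is summarized by a discrete distribution, that similarity is one minus the Jensen-Shannon divergence, and that the similarity sets how many source samples enter each training batch. The names `js_divergence`, `similarity`, `mix_batch`, and `max_source_share` are hypothetical, and the sketch is not uncertainty-aware.

```python
import math
import random


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def similarity(p, q):
    """Assumed mapping: divergence in [0, 1] becomes similarity in [0, 1]."""
    return 1.0 - js_divergence(p, q)


def mix_batch(target, source, sim, batch_size, rng, max_source_share=0.5):
    """Build one batch for the data-poor entrance.

    The number of samples borrowed from the data-rich entrance grows with
    the similarity, capped at max_source_share of the batch (an assumption).
    """
    n_source = int(batch_size * sim * max_source_share)
    borrowed = rng.choices(source, k=n_source)
    own = rng.choices(target, k=batch_size - n_source)
    return borrowed + own


if __name__ == "__main__":
    rng = random.Random(0)
    rich = [("rich", i) for i in range(1000)]  # data-rich entrance samples
    poor = [("poor", i) for i in range(20)]    # data-poor entrance samples
    sim = similarity([0.6, 0.3, 0.1], [0.5, 0.3, 0.2])
    batch = mix_batch(poor, rich, sim, batch_size=32, rng=rng)
    print(round(sim, 3), sum(1 for tag, _ in batch if tag == "rich"))
```

With a similarity of zero the batch contains only samples from the data-poor entrance, so dissimilar entrances contribute nothing under this assumed rule.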

Supplementary Material

MP4 File (CIKM22-sp0599.mp4)

Presentation video


Cited By

Shi W, Fu C, Xu Q, Chen S, Zhang J, Zhu Q, Hua Z, Yang S, Serra E, Spezzano F (2024). Ads Supply Personalization via Doubly Robust Learning. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pp. 4874-4881. Online publication date: 21-Oct-2024.
https://dl.acm.org/doi/10.1145/3627673.3680035

Li X, Wang Z, Zhu B, He F, Wang Y, Wang X, Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y (2024). Deep Automated Mechanism Design for Integrating Ad Auction and Allocation in Feed. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1211-1220. Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3626772.3657774

Shi X, Wang Z, Cai Y, Wu X, Yang F, Liao G, Wang Y, Wang X, Wang D, Chen H, Duh W, Huang H, Kato M, Mothe J, Poblete B (2023). MDDL: A Framework for Reinforcement Learning-based Position Allocation in Multi-Channel Feed. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2159-2163. Online publication date: 19-Jul-2023.
https://dl.acm.org/doi/10.1145/3539618.3592018

Index Terms

Hybrid Transfer in Deep Reinforcement Learning for Ads Allocation
1. Information systems
  1. Information systems applications
    1. Computational advertising
  2. World Wide Web
    1. Online advertising
    2. Web applications
      1. Electronic commerce

Information & Contributors

Information

Published In

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management

October 2022

5274 pages

ISBN:9781450392365

DOI:10.1145/3511808

General Chairs:
Mohammad Al Hasan, Indiana University Purdue University, Indianapolis, USA
Li Xiong, Emory University, Atlanta, USA

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Conference

CIKM '22: The 31st ACM International Conference on Information and Knowledge Management

October 17 - 21, 2022

Atlanta, GA, USA

Acceptance Rates

CIKM '22 Paper Acceptance Rate: 621 of 2,257 submissions, 28%

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

Bibliometrics

Total Citations: 3
Total Downloads: 93
Downloads (Last 12 months): 22
Downloads (Last 6 weeks): 1

Reflects downloads up to 03 Mar 2025
