DOI: 10.1145/3528535.3565249

Research article

Optimizing communication in deep reinforcement learning with XingTian

Published: 08 November 2022

Abstract

Deep Reinforcement Learning (DRL) has achieved great success in various domains. In today's DRL algorithms, communication takes non-negligible time compared to computation. However, prior DRL frameworks usually focus on computation management while paying little attention to communication optimization, and they fail to exploit communication-computation overlap, which hides communication off the critical path of DRL algorithms. Consequently, communication can take more time than computation in prior DRL frameworks. In this paper, we present XingTian, a novel DRL framework that co-designs the management of communication and computation in DRL algorithms. XingTian organizes the computation of DRL algorithms in a decentralized way and provides an asynchronous communication channel, so communication executes asynchronously and aggressively, taking advantage of the overlap opportunities that DRL algorithms expose. Experimental results show that XingTian improves data-transmission efficiency and transmits at least twice as much data per second as the state-of-the-art DRL framework RLlib. DRL algorithms built on XingTian achieve up to 70.71% higher throughput than RLlib-based ones with better or similar convergence behavior. XingTian maintains high communication efficiency across deployment scales: when deployed on four machines, the XingTian-based DRL algorithm achieves 91.12% higher throughput than the RLlib-based one. XingTian is open source and publicly available at https://github.com/huawei-noah/xingtian.
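As a concrete illustration of the communication-computation overlap described above, the sketch below shows the general pattern in Python: an explorer pushes finished trajectory segments into an asynchronous channel and immediately resumes interacting with the environment, so transmission to the learner stays off the critical path. This is a minimal sketch of the idea only; the queue-based channel, the thread layout, and all names are illustrative assumptions, not XingTian's actual API.

```python
# Minimal sketch of communication-computation overlap (illustrative only;
# NOT XingTian's actual API). An explorer hands trajectory segments to an
# asynchronous channel and keeps acting while the learner consumes them.
import queue
import threading
import time

channel = queue.Queue(maxsize=8)  # stand-in for an asynchronous channel

def explorer(n_segments=5):
    for step in range(n_segments):
        time.sleep(0.01)            # pretend: env.step() + policy inference
        segment = {"id": step, "obs": [0.0] * 4}
        channel.put(segment)        # hand off; exploration resumes at once
    channel.put(None)               # sentinel: no more segments

def learner():
    while True:
        segment = channel.get()     # segments stream in as they are produced
        if segment is None:
            break
        time.sleep(0.02)            # pretend: one training update
        print(f"trained on segment {segment['id']}")

t = threading.Thread(target=explorer)
t.start()
learner()                           # learning overlaps with exploration
t.join()
```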


Cited By

(2023) High-throughput Sampling, Communicating and Training for Reinforcement Learning Systems. 2023 IEEE/ACM 31st International Symposium on Quality of Service (IWQoS), 1-10. DOI: 10.1109/IWQoS57198.2023.10188703. Online publication date: 19 June 2023.

Published In

Middleware '22: Proceedings of the 23rd ACM/IFIP International Middleware Conference
November 2022
110 pages
ISBN: 978-1-4503-9340-9
DOI: 10.1145/3528535

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. asynchronous communication
  2. communication-computation overlap
  3. decentralized computation
  4. deep reinforcement learning

Qualifiers

  • Research-article

Conference

Middleware '22: 23rd International Middleware Conference
November 7-11, 2022
Quebec City, Quebec, Canada

Acceptance Rates

Middleware '22 paper acceptance rate: 8 of 21 submissions (38%).
Overall acceptance rate: 203 of 948 submissions (21%).

