DOI: 10.1145/3649153.3649193
Research Article
Open access

PEARL: Enabling Portable, Productive, and High-Performance Deep Reinforcement Learning using Heterogeneous Platforms

Published: 02 July 2024

Abstract

Deep Reinforcement Learning (DRL) is vital in various AI applications. DRL algorithms comprise diverse compute kernels that may not all be optimized well on a single homogeneous architecture. However, even when heterogeneous architectures are available, optimizing DRL performance remains a challenge due to the complexity of the hardware and programming models employed in modern data centers. To address this, we introduce PEARL, a toolkit for composing parallel DRL systems on heterogeneous platforms consisting of general-purpose processors (CPUs) and accelerators (GPUs, FPGAs). Our innovations include: 1. A general training protocol agnostic to the underlying hardware, enabling portable implementations across various platforms. 2. DRL-specific optimizations in runtime scheduling and resource allocation, facilitating parallelized training and enhancing overall system performance. 3. Automatic optimization of DRL task-to-device assignments through throughput estimation. 4. A high-level API for productive development using the toolkit. We showcase the toolkit through experiments with two widely used DRL algorithms, DQN and DDPG, on two diverse heterogeneous platforms. The generated implementations achieve up to 2.2× higher throughput than state-of-the-art libraries on CPU-GPU platforms, and up to 2.4× higher performance portability across platforms.
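As a concrete illustration of innovation 3, the sketch below shows one way throughput-estimation-based task-to-device assignment could work. This is a minimal, hypothetical sketch: PEARL's actual interface and assignment algorithm are defined in the paper and the accompanying HeteroRL archive, and every name here (DeviceEstimate, assign_tasks, the task and device labels, the throughput numbers) is an illustrative assumption, not the real API. The greedy per-task argmax is a deliberate simplification of whatever optimization the toolkit actually performs.

```python
# Hypothetical sketch only: NOT the actual PEARL/HeteroRL API.
# Illustrates assigning each DRL task (actor, learner, replay) to the
# device with the highest estimated throughput for that task.
from dataclasses import dataclass

@dataclass
class DeviceEstimate:
    """Estimated throughput (samples/s) of one DRL task on one device."""
    task: str        # e.g. "actor", "learner", "replay"
    device: str      # e.g. "cpu", "gpu0", "fpga0"
    throughput: float

def assign_tasks(estimates: list[DeviceEstimate]) -> dict[str, str]:
    """Greedy assignment: pick, for each task, the device with the
    highest estimated throughput (a simplification of PEARL's search)."""
    best: dict[str, DeviceEstimate] = {}
    for e in estimates:
        if e.task not in best or e.throughput > best[e.task].throughput:
            best[e.task] = e
    return {task: est.device for task, est in best.items()}

# Made-up numbers: environment interaction favors the CPU, gradient
# updates favor the GPU, replay management favors the FPGA.
estimates = [
    DeviceEstimate("actor",   "cpu",   12_000.0),
    DeviceEstimate("actor",   "gpu0",   9_000.0),
    DeviceEstimate("learner", "cpu",      800.0),
    DeviceEstimate("learner", "gpu0",   6_500.0),
    DeviceEstimate("replay",  "cpu",    4_000.0),
    DeviceEstimate("replay",  "fpga0", 10_000.0),
]
print(assign_tasks(estimates))  # {'actor': 'cpu', 'learner': 'gpu0', 'replay': 'fpga0'}
```

Under such a scheme, the compute-heavy learner lands on the GPU, environment interaction stays on the CPU, and replay management can be offloaded to an FPGA, matching the CPU-GPU-FPGA platforms the toolkit targets.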

Supplemental Material

External - HeteroRL: v0.0.1-alpha
In this archive, we present the code used for our publication "PEARL: Enabling Portable, Productive, and High-Performance Deep Reinforcement Learning using Heterogeneous Platforms", which appeared at the 21st ACM International Conference on Computing Frontiers (CF '24).
Creative Commons Zero v1.0 Universal


Cited By

  • Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey. ACM Computing Surveys 57, 4 (2024), 1-35. https://doi.org/10.1145/3703453. Online publication date: 10-Dec-2024.


      Published In

      CF '24: Proceedings of the 21st ACM International Conference on Computing Frontiers
      May 2024
      345 pages
ISBN: 9798400705977
DOI: 10.1145/3649153
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. Deep Reinforcement Learning
      2. Heterogeneous Computing



      Acceptance Rates

CF '24 paper acceptance rate: 33 of 105 submissions (31%)
Overall acceptance rate: 273 of 785 submissions (35%)


Article Metrics

• Downloads (last 12 months): 148
• Downloads (last 6 weeks): 39

Reflects downloads up to 14 Jan 2025.
