DOI: 10.1145/3299874.3319493

On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning

Published: 13 May 2019

Abstract

In autonomous embedded systems, it is often vital to reduce both the number of actions taken in the real world and the energy required to learn a policy. Training reinforcement learning agents directly from high-dimensional image observations can be very expensive and time consuming. Autoencoders are deep neural networks used to compress high-dimensional data, such as pixelated images, into small latent representations. This compression model is vital for learning policies efficiently, especially when learning on embedded systems. We implemented this model on the NVIDIA Jetson TX2 embedded GPU and evaluated the power consumption, throughput, and energy consumption of the autoencoders for various CPU/GPU core combinations, frequencies, and model parameters. We also examined the reconstructions generated by the autoencoder to assess the quality of the compressed representation, as well as the performance of the reinforcement learning agent. Finally, we present an assessment of the viability of training these models on embedded systems and their usefulness in developing autonomous policies. Using autoencoders, we achieved 4-5X improved performance compared to a baseline RL agent with a convolutional feature extractor, while using less than 2 W of power.
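The core idea in the abstract, compressing a high-dimensional observation into a small latent vector that an RL policy can consume cheaply, can be sketched with a minimal NumPy autoencoder. Everything below (the 64-to-8 dimensions, single-layer encoder/decoder, synthetic low-rank data, and learning rate) is an illustrative assumption, not the paper's actual Jetson TX2 model.

```python
import numpy as np

# Minimal autoencoder sketch: compress 64-D observations into an 8-D latent.
rng = np.random.default_rng(0)
IN_DIM, LATENT_DIM = 64, 8      # e.g. a flattened 8x8 grayscale patch

# One linear layer each way; tanh nonlinearity on the encoder.
W_enc = rng.normal(0, 0.1, (IN_DIM, LATENT_DIM))
W_dec = rng.normal(0, 0.1, (LATENT_DIM, IN_DIM))

def encode(x):
    return np.tanh(x @ W_enc)    # compressed latent state fed to the RL agent

def decode(z):
    return z @ W_dec             # reconstruction used only for training

# Synthetic "observations" with low-rank structure, so compression is possible.
basis = rng.normal(size=(4, IN_DIM))
X = rng.normal(size=(256, 4)) @ basis

# Train by gradient descent on mean-squared reconstruction error.
lr, initial_loss = 0.01, None
for step in range(500):
    Z = encode(X)
    err = decode(Z) - X
    loss = np.mean(err ** 2)
    if initial_loss is None:
        initial_loss = loss
    dW_dec = Z.T @ err * (2 / X.size)            # gradient w.r.t. decoder
    dZ = err @ W_dec.T * (1 - Z ** 2)            # backprop through tanh
    dW_enc = X.T @ dZ * (2 / X.size)             # gradient w.r.t. encoder
    W_dec -= lr * dW_dec
    W_enc -= lr * dW_enc

print(encode(X).shape)           # latent batch is 8-D instead of 64-D
```

A downstream policy network would then take the 8-D latent rather than raw pixels, which is where the paper's throughput and energy savings on the embedded GPU come from.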





    Published In

    GLSVLSI '19: Proceedings of the 2019 Great Lakes Symposium on VLSI
    May 2019
    562 pages
    ISBN:9781450362528
    DOI:10.1145/3299874
    © 2019 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. autoencoders
    2. deep learning
    3. embedded devices
    4. neural networks
    5. reinforcement learning

    Qualifiers

    • Research-article

    Funding Sources

    • U.S. Army Research Laboratory

    Conference

GLSVLSI '19: Great Lakes Symposium on VLSI 2019
May 9-11, 2019
Tysons Corner, VA, USA

    Acceptance Rates

    Overall Acceptance Rate 312 of 1,156 submissions, 27%


    Cited By

• (2024) Hierarchical VAE Based Semantic Communications for POMDP Tasks. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5540-5544. DOI: 10.1109/ICASSP48485.2024.10445833. Online publication date: 14-Apr-2024.
• (2024) Deep reinforcement learning for the rapid on-demand design of mechanical metamaterials with targeted nonlinear deformation responses. Engineering Applications of Artificial Intelligence, 126:PC. DOI: 10.1016/j.engappai.2023.106998. Online publication date: 1-Feb-2024.
• (2024) Integrated Discounted Future Prediction as Auxiliary Task for A3C. Biologically Inspired Cognitive Architectures 2023, pp. 62-69. DOI: 10.1007/978-3-031-50381-8_9. Online publication date: 14-Feb-2024.
• (2023) EXPLORA: AI/ML EXPLainability for the Open RAN. Proceedings of the ACM on Networking, 1(CoNEXT3), pp. 1-26. DOI: 10.1145/3629141. Online publication date: 28-Nov-2023.
• (2023) Coarse-to-fine fusion for language grounding in 3D navigation. Knowledge-Based Systems, 277:C. DOI: 10.1016/j.knosys.2023.110785. Online publication date: 9-Oct-2023.
• (2021) An Energy Efficient EdgeAI Autoencoder Accelerator for Reinforcement Learning. IEEE Open Journal of Circuits and Systems, 2, pp. 182-195. DOI: 10.1109/OJCAS.2020.3043737. Online publication date: 2021.
• (2021) An Energy-Efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning. 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), pp. 1-4. DOI: 10.1109/AICAS51828.2021.9458548. Online publication date: 6-Jun-2021.
• (2020) Energy-Efficient Hardware for Language Guided Reinforcement Learning. Proceedings of the 2020 Great Lakes Symposium on VLSI, pp. 131-136. DOI: 10.1145/3386263.3407652. Online publication date: 7-Sep-2020.
• (2020) A Low-Power LSTM Processor for Multi-Channel Brain EEG Artifact Detection. 2020 21st International Symposium on Quality Electronic Design (ISQED), pp. 105-110. DOI: 10.1109/ISQED48828.2020.9137056. Online publication date: Mar-2020.
• (2020) CSCMAC - Cyclic Sparsely Connected Neural Network Manycore Accelerator. 2020 21st International Symposium on Quality Electronic Design (ISQED), pp. 311-316. DOI: 10.1109/ISQED48828.2020.9137013. Online publication date: Mar-2020.
