DOI: 10.1145/3299874.3319493

On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning

Published: 13 May 2019

Abstract

In autonomous embedded systems, it is often vital to reduce both the number of actions taken in the real world and the energy required to learn a policy. Training reinforcement learning agents directly from high-dimensional image observations can be very expensive and time consuming. Autoencoders are deep neural networks used to compress high-dimensional data, such as pixelated images, into small latent representations. This compression model is vital for learning policies efficiently, especially when learning on embedded systems. We implemented this model on the NVIDIA Jetson TX2 embedded GPU and evaluated the power consumption, throughput, and energy consumption of the autoencoders for various CPU/GPU core combinations, frequencies, and model parameters. We also examined the reconstructions generated by the autoencoder to assess the quality of the compressed representation, as well as the performance of the reinforcement learning agent. Finally, we present an assessment of the viability of training these models on embedded systems and their usefulness in developing autonomous policies. Using autoencoders, we achieved 4-5X improved performance compared to a baseline RL agent with a convolutional feature extractor, while using less than 2 W of power.
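The core idea in the abstract, compressing a high-dimensional observation into a small latent vector that an RL policy can consume cheaply, can be sketched with a minimal NumPy autoencoder. Everything below (the 64-to-8 dimensions, single-layer encoder/decoder, synthetic low-rank data, and learning rate) is an illustrative assumption, not the paper's actual Jetson TX2 model.

```python
import numpy as np

# Minimal autoencoder sketch: compress 64-D observations into an 8-D latent.
rng = np.random.default_rng(0)
IN_DIM, LATENT_DIM = 64, 8      # e.g. a flattened 8x8 grayscale patch

# One linear layer each way; tanh nonlinearity on the encoder.
W_enc = rng.normal(0, 0.1, (IN_DIM, LATENT_DIM))
W_dec = rng.normal(0, 0.1, (LATENT_DIM, IN_DIM))

def encode(x):
    return np.tanh(x @ W_enc)    # compressed latent state fed to the RL agent

def decode(z):
    return z @ W_dec             # reconstruction used only for training

# Synthetic "observations" with low-rank structure, so compression is possible.
basis = rng.normal(size=(4, IN_DIM))
X = rng.normal(size=(256, 4)) @ basis

# Train by gradient descent on mean-squared reconstruction error.
lr, initial_loss = 0.01, None
for step in range(500):
    Z = encode(X)
    err = decode(Z) - X
    loss = np.mean(err ** 2)
    if initial_loss is None:
        initial_loss = loss
    dW_dec = Z.T @ err * (2 / X.size)            # gradient w.r.t. decoder
    dZ = err @ W_dec.T * (1 - Z ** 2)            # backprop through tanh
    dW_enc = X.T @ dZ * (2 / X.size)             # gradient w.r.t. encoder
    W_dec -= lr * dW_dec
    W_enc -= lr * dW_enc

print(encode(X).shape)           # latent batch is 8-D instead of 64-D
```

A downstream policy network would then take the 8-D latent rather than raw pixels, which is where the paper's throughput and energy savings on the embedded GPU come from.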





    Published In

    GLSVLSI '19: Proceedings of the 2019 Great Lakes Symposium on VLSI
    May 2019
    562 pages
    ISBN:9781450362528
    DOI:10.1145/3299874
    © 2019 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. autoencoders
    2. deep learning
    3. embedded devices
    4. neural networks
    5. reinforcement learning

    Qualifiers

    • Research-article

    Funding Sources

    • U.S. Army Research Laboratory

    Conference

GLSVLSI '19: Great Lakes Symposium on VLSI 2019
May 9-11, 2019
Tysons Corner, VA, USA

    Acceptance Rates

    Overall Acceptance Rate 312 of 1,156 submissions, 27%


    Cited By

• (2024) Hierarchical VAE Based Semantic Communications for POMDP Tasks. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5540-5544. DOI: 10.1109/ICASSP48485.2024.10445833. Online publication date: 14-Apr-2024.
• (2024) Deep reinforcement learning for the rapid on-demand design of mechanical metamaterials with targeted nonlinear deformation responses. Engineering Applications of Artificial Intelligence, 126:PC. DOI: 10.1016/j.engappai.2023.106998. Online publication date: 1-Feb-2024.
• (2024) Integrated Discounted Future Prediction as Auxiliary Task for A3C. Biologically Inspired Cognitive Architectures 2023, pp. 62-69. DOI: 10.1007/978-3-031-50381-8_9. Online publication date: 14-Feb-2024.
• (2023) EXPLORA: AI/ML EXPLainability for the Open RAN. Proceedings of the ACM on Networking, 1(CoNEXT3), pp. 1-26. DOI: 10.1145/3629141. Online publication date: 28-Nov-2023.
• (2023) Coarse-to-fine fusion for language grounding in 3D navigation. Knowledge-Based Systems, 277:C. DOI: 10.1016/j.knosys.2023.110785. Online publication date: 9-Oct-2023.
• (2021) An Energy Efficient EdgeAI Autoencoder Accelerator for Reinforcement Learning. IEEE Open Journal of Circuits and Systems, 2, pp. 182-195. DOI: 10.1109/OJCAS.2020.3043737. Online publication date: 2021.
• (2021) An Energy-Efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning. 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), pp. 1-4. DOI: 10.1109/AICAS51828.2021.9458548. Online publication date: 6-Jun-2021.
• (2020) Energy-Efficient Hardware for Language Guided Reinforcement Learning. Proceedings of the 2020 Great Lakes Symposium on VLSI, pp. 131-136. DOI: 10.1145/3386263.3407652. Online publication date: 7-Sep-2020.
• (2020) A Low-Power LSTM Processor for Multi-Channel Brain EEG Artifact Detection. 2020 21st International Symposium on Quality Electronic Design (ISQED), pp. 105-110. DOI: 10.1109/ISQED48828.2020.9137056. Online publication date: Mar-2020.
• (2020) CSCMAC - Cyclic Sparsely Connected Neural Network Manycore Accelerator. 2020 21st International Symposium on Quality Electronic Design (ISQED), pp. 311-316. DOI: 10.1109/ISQED48828.2020.9137013. Online publication date: Mar-2020.
