DOI: 10.1145/3278721.3278776
Research Article · Public Access

Transparency and Explanation in Deep Reinforcement Learning Neural Networks

Published: 27 December 2018

Abstract

Autonomous AI systems will soon enter human society to provide services and work alongside humans. For those systems to be accepted and trusted, users should be able to understand the system's reasoning process, i.e., the system should be transparent. Transparency enables humans to form coherent explanations of the system's decisions and actions, and it matters not only for user trust but also for software debugging and certification. In recent years, deep neural networks have made great advances in multiple application areas; however, they remain opaque. In this paper, we report on work on transparency in Deep Reinforcement Learning Networks (DRLNs). Such networks have been extremely successful in learning accurate action control in image-input domains such as Atari games. We propose a novel and general method that (a) incorporates explicit object recognition processing into deep reinforcement learning models, (b) forms the basis for "object saliency maps", which visualize the internal states of DRLNs and thus enable the formation of explanations, and (c) can be incorporated into any existing deep reinforcement learning framework. We present computational results and human experiments to evaluate our approach.
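The perturbation idea behind object saliency maps can be illustrated with a minimal sketch: mask out one detected object at a time and score it by how much the network's Q-values change. All names below, including the stub Q-function, are hypothetical placeholders, not the paper's implementation; a real system would use a trained DRLN and the object masks produced by its recognition stage.

```python
import numpy as np

def q_values(frame):
    # Stand-in for a trained Q-network forward pass; a real model
    # would map an input frame to one Q-value per action.
    return np.array([frame.mean(), frame.std(), frame.max()])

def object_saliency(frame, object_masks, background=0.0):
    """Score each detected object by how much removing it (replacing
    its pixels with background) changes the Q-values. A larger change
    means the object matters more to the agent's decision."""
    base = q_values(frame)
    scores = []
    for mask in object_masks:
        perturbed = frame.copy()
        perturbed[mask] = background  # mask out this object
        scores.append(np.abs(q_values(perturbed) - base).sum())
    return scores

# Toy 10x10 frame with one bright "object" near the top-left corner.
frame = np.zeros((10, 10))
frame[1:3, 1:3] = 1.0
obj_mask = np.zeros_like(frame, dtype=bool)
obj_mask[1:3, 1:3] = True
bg_mask = np.zeros_like(frame, dtype=bool)
bg_mask[7:9, 7:9] = True  # an empty background patch, for comparison

scores = object_saliency(frame, [obj_mask, bg_mask])
print(scores)  # the real object scores higher than the empty patch
```

Rendering each object's score back onto its pixel locations yields a saliency map over objects rather than raw pixels, which is what makes the resulting visualization interpretable in object terms.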




Published In

AIES '18: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society
December 2018
406 pages
ISBN:9781450360128
DOI:10.1145/3278721
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. deep reinforcement learning
  2. explainable ai
  3. human factors
  4. human-ai interaction
  5. system transparency

Qualifiers

  • Research-article


Conference

AIES '18: AAAI/ACM Conference on AI, Ethics, and Society
February 2-3, 2018
New Orleans, LA, USA

Acceptance Rates

AIES '18 paper acceptance rate: 61 of 162 submissions (38%)
Overall acceptance rate: 61 of 162 submissions (38%)


Article Metrics

  • Downloads (last 12 months): 549
  • Downloads (last 6 weeks): 45
Reflects downloads up to 11 Jan 2025


Cited By

  • (2024) Explaining Sequences of Actions in Multi-agent Deep Reinforcement Learning Models. In Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2537-2539. DOI: 10.5555/3635637.3663219
  • (2024) eXplainable Artificial Intelligence in Process Engineering: Promises, Facts, and Current Limitations. Applied System Innovation, 7(6), 121. DOI: 10.3390/asi7060121
  • (2024) Toward Fairness, Accountability, Transparency, and Ethics in AI for Social Media and Health Care: Scoping Review. JMIR Medical Informatics, 12, e50048. DOI: 10.2196/50048
  • (2024) A Novel Tree-Based Method for Interpretable Reinforcement Learning. ACM Transactions on Knowledge Discovery from Data, 18(9), 1-22. DOI: 10.1145/3695464
  • (2024) Competence Awareness for Humans and Machines: A Survey and Future Research Directions from Psychology. ACM Computing Surveys, 57(1), 1-26. DOI: 10.1145/3689626
  • (2024) XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 6073-6082. DOI: 10.1145/3637528.3671595
  • (2024) Explainable Reinforcement Learning: A Survey and Comparative Review. ACM Computing Surveys, 56(7), 1-36. DOI: 10.1145/3616864
  • (2024) Explainable Artificial Intelligence (XAI) Approach for Reinforcement Learning Systems. In Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, 971-978. DOI: 10.1145/3605098.3635992
  • (2024) Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, 35(2), 1696-1707. DOI: 10.1109/TNNLS.2022.3184956
  • (2024) A Multiobjective Genetic Algorithm to Evolving Local Interpretable Model-Agnostic Explanations for Deep Neural Networks in Image Classification. IEEE Transactions on Evolutionary Computation, 28(4), 903-917. DOI: 10.1109/TEVC.2022.3225591
