DOI: 10.5555/3306127.3331818
Research article · Public Access

Online Inverse Reinforcement Learning Under Occlusion

Published: 08 May 2019

Abstract

Inverse reinforcement learning (IRL) is the problem of learning the preferences of an agent by observing its behavior on a task. While this problem has received sustained attention, the related problem of online IRL, in which observations accrue incrementally yet the real-time demands of the application often prohibit a full rerun of an IRL method, has received much less attention. We introduce a formal framework for online IRL, called incremental IRL (I2RL), and a new method that advances maximum entropy IRL with hidden variables to this setting. Our formal analysis shows that the new method's performance improves monotonically with more demonstration data and that its error is probabilistically bounded, both under full and partial observability. Experiments in a simulated robotic application involving learning under occlusion show that online IRL significantly outperforms both batch IRL and an online imitation learning method.
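The core idea the abstract describes, accruing demonstrations incrementally instead of rerunning batch IRL from scratch, can be sketched as follows. This is an illustrative toy in Python with made-up names, not the paper's I2RL algorithm (which additionally handles occlusion via hidden variables): it maintains a running empirical feature expectation over all trajectories seen so far and takes one maximum-entropy gradient step per incoming batch.

```python
import numpy as np

def update_feature_expectation(mu_prev, n_prev, batch_feature_sums):
    """Fold a new batch of per-trajectory feature counts into the running
    mean of empirical feature expectations (illustrative, not the paper's API)."""
    batch = np.asarray(batch_feature_sums, dtype=float)
    n_new = batch.shape[0]
    n_total = n_prev + n_new
    # Weighted combination of the old running mean and the new batch sum.
    mu = (n_prev * np.asarray(mu_prev, dtype=float) + batch.sum(axis=0)) / n_total
    return mu, n_total

def maxent_gradient_step(theta, mu_empirical, mu_model, lr=0.1):
    """One ascent step on the MaxEnt IRL log-likelihood; its gradient is the
    difference between empirical and model-predicted feature expectations."""
    return theta + lr * (mu_empirical - mu_model)
```

The point of the incremental update is that each new batch costs O(batch size) work, rather than reprocessing the full demonstration history as a batch IRL method would.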


Published In

AAMAS '19: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems
May 2019
2518 pages
ISBN: 9781450363099

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Author Tags

  1. inverse reinforcement learning
  2. online learning
  3. reinforcement learning
  4. robot learning
  5. robotics

Qualifiers

  • Research-article


Acceptance Rates

AAMAS '19 paper acceptance rate: 193 of 793 submissions (24%)
Overall acceptance rate: 1,155 of 5,036 submissions (23%)
