Offline Causal Imitation Learning with Latent Confounders

Siyang Huang, Yan Zeng, Ruichu Cai, Zhifeng Hao & Fuchun Sun

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1732)

Included in the following conference series:

International Conference on Cognitive Computation and Systems

Abstract

Learning an imitation policy offline from expert demonstrations is a significant yet challenging problem. Despite their great success, most existing methods assume that the data are free of latent confounders. However, such unobserved confounders can appear in many real-world applications and lead to sub-optimal policies. In this paper, we therefore propose an integrated two-stage algorithm for offline causal imitation learning that allows for the existence of latent confounders. In Stage 1, we determine whether latent confounders are present, using a causal discovery method based on conditional independence tests. In Stage 2, we adopt either behavioral cloning or a variant of instrumental variable regression, according to whether the data are unconfounded or confounded, to eliminate possible confounding influences. Experiments on a robotic arm control task verify the efficacy of the method in both confounded and unconfounded situations.
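The abstract does not give the details of either stage, so the following is only a toy sketch of the two-stage idea and not the authors' algorithm. It assumes a linear-Gaussian setting with a scalar state s, expert action a, and an observed instrument z (for example an earlier state) that affects a only through s. Under these assumptions, z and a are conditionally independent given s when there is no latent confounder, while a latent variable u that drives both s and a makes them dependent given s. Stage 1 is illustrated with a partial-correlation Fisher z test, and Stage 2 with ordinary least squares (as behavioral cloning) or plain two-stage least squares (as the instrumental variable regression). All function names and the data-generating model are hypothetical.

```python
import math
import numpy as np

def simulate(n, beta, c, rng):
    """Toy demonstrations: instrument z, state s, expert action a.
    A latent u enters both s and a with strength c (c = 0: unconfounded)."""
    z = rng.normal(size=n)
    u = rng.normal(size=n)  # latent confounder, never shown to the learner
    s = z + c * u + rng.normal(size=n)
    a = beta * s + c * u + rng.normal(size=n)
    return z, s, a

def fit_line(x, y):
    """Least-squares fit of y on x with intercept; returns (intercept, slope)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

def residual(x, y):
    """Part of y that is not linearly explained by x."""
    b0, b1 = fit_line(x, y)
    return y - (b0 + b1 * x)

def ci_pvalue(z, s, a):
    """Stage 1 (assumed test): Fisher z test of the partial correlation
    between z and a given s. A small p-value suggests a latent confounder."""
    r = np.corrcoef(residual(s, z), residual(s, a))[0, 1]
    stat = math.sqrt(len(z) - 1 - 3) * math.atanh(r)  # one conditioning variable
    return math.erfc(abs(stat) / math.sqrt(2))  # two-sided normal p-value

def behavioral_cloning(s, a):
    """Stage 2, unconfounded case: regress the expert action on the state."""
    return fit_line(s, a)[1]

def two_stage_least_squares(z, s, a):
    """Stage 2, confounded case: project s onto z, then regress a on the projection."""
    b0, b1 = fit_line(z, s)
    s_hat = b0 + b1 * z
    return fit_line(s_hat, a)[1]

def learn_policy(z, s, a, alpha=0.01):
    """Run the test, then pick the estimator; returns (confounded, policy gain)."""
    confounded = ci_pvalue(z, s, a) < alpha
    if confounded:
        gain = two_stage_least_squares(z, s, a)
    else:
        gain = behavioral_cloning(s, a)
    return confounded, gain

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for c in (0.0, 1.0):
        z, s, a = simulate(5000, beta=0.5, c=c, rng=rng)
        print(c, learn_policy(z, s, a), behavioral_cloning(s, a))
```

In this toy model the behavioral cloning slope is biased away from the expert's true gain `beta` when `c` is nonzero, whereas the instrumented estimate is consistent for it. The sketch is meant only to make the division of labor between the two stages concrete.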

S. Huang and Y. Zeng—Equal contribution.


Author information

Authors and Affiliations

Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
Siyang Huang & Ruichu Cai
Tsinghua University, Beijing, 100084, China
Yan Zeng & Fuchun Sun
Shantou University, Shantou, 515063, Guangdong, China
Zhifeng Hao

Corresponding author

Correspondence to Ruichu Cai.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Fuchun Sun
Tsinghua University, Beijing, China
Jianmin Li
Tsinghua University, Beijing, China
Huaping Liu
Beihang University, Beijing, China
Zhongyi Chu

About this paper

Cite this paper

Huang, S., Zeng, Y., Cai, R., Hao, Z., Sun, F. (2023). Offline Causal Imitation Learning with Latent Confounders. In: Sun, F., Li, J., Liu, H., Chu, Z. (eds) Cognitive Computation and Systems. ICCCS 2022. Communications in Computer and Information Science, vol 1732. Springer, Singapore. https://doi.org/10.1007/978-981-99-2789-0_19

DOI: https://doi.org/10.1007/978-981-99-2789-0_19
Published: 24 May 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-2788-3
Online ISBN: 978-981-99-2789-0
eBook Packages: Computer Science, Computer Science (R0)
