Abstract
Active inference has emerged as an alternative approach to control problems given its intuitive (probabilistic) formalism. However, despite its theoretical utility, computational implementations have largely been restricted to low-dimensional, deterministic settings. This paper highlights that this is a consequence of the inability to adequately model stochastic transition dynamics, particularly when an extensive policy (i.e., action trajectory) space must be evaluated during planning. Fortunately, recent advancements propose a modified planning algorithm for finite temporal horizons. We build upon this work to assess the utility of active inference for a stochastic control setting. For this, we simulate the classic windy grid-world task with additional complexities, namely: 1) environment stochasticity; 2) learning of transition dynamics; and 3) partial observability. Our results demonstrate the advantage of using active inference, compared to reinforcement learning, in both deterministic and stochastic settings.
Notes
- 1.
- 2. The first term in Eq. 5 does not contribute to solving the problem addressed in this paper, because C only encodes a preference for the goal state. For a more informed C, i.e., one that also encodes preferences for immediate reward maximisation, this term would influence action selection (see the sketch following these notes).
- 3.
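For context, a plausible reconstruction of Eq. 5 is given below, assuming it takes the standard expected-free-energy form used in discrete-state active inference; this is an assumption, as the equation itself is not reproduced on this page:

\[
G(\pi, \tau) = \underbrace{-\,\mathbb{E}_{Q(o_\tau \mid \pi)}\big[\ln P(o_\tau \mid C)\big]}_{\text{pragmatic (first) term}} \; - \; \underbrace{\mathbb{E}_{Q(o_\tau \mid \pi)}\Big[D_{\mathrm{KL}}\big[Q(s_\tau \mid o_\tau, \pi) \,\|\, Q(s_\tau \mid \pi)\big]\Big]}_{\text{epistemic term}}
\]

On this reading, when C places probability mass only on the goal state, the pragmatic term does not discriminate between intermediate actions, which is consistent with the note above.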
Acknowledgments
AP acknowledges research sponsorship from IITB-Monash Research Academy, Mumbai and Department of Biotechnology, Government of India. AR is funded by the Australian Research Council (Refs: DE170100128 & DP200100757) and Australian National Health and Medical Research Council Investigator Grant (Ref: 1194910). AR is a CIFAR Azrieli Global Scholar in the Brain, Mind & Consciousness Program. AR and NS are affiliated with The Wellcome Centre for Human Neuroimaging supported by core funding from Wellcome [203147/Z/16/Z].
Appendices
A Results Level-1 and Level-3 (Non-stochastic Settings)
B Outcome Modalities for POMDPs
In the partially observable setting, we considered two outcome modalities, both functions of the ‘side’ and ‘down’ coordinates defined for every state in Fig. 1. Examples of the coordinates and modalities are given below. The first outcome modality is the sum of the coordinates and the second is their product.
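As an illustration, a minimal MATLAB sketch of how such likelihood mappings could be constructed is given below. The grid dimensions and the coordinate assignment are assumptions for illustration (they are not taken from Fig. 1); only the sum/product construction follows the text.

% Minimal sketch (not the authors' code): deterministic likelihood mappings
% for the two outcome modalities, built from per-state 'side' and 'down'
% coordinates. Grid size and coordinate layout below are assumptions.
nSide = 10; nDown = 7;                        % hypothetical grid dimensions
nStates = nSide * nDown;
side = mod((1:nStates) - 1, nSide) + 1;       % assumed 'side' coordinate per state
down = floor(((1:nStates) - 1) / nSide) + 1;  % assumed 'down' coordinate per state
o1 = side + down;                             % modality 1: sum of coordinates
o2 = side .* down;                            % modality 2: product of coordinates
A{1} = full(sparse(o1, 1:nStates, 1, max(o1), nStates));  % P(o1 | s), one-hot
A{2} = full(sparse(o2, 1:nStates, 1, max(o2), nStates));  % P(o2 | s), one-hot
% Under this layout, states 2 and 11 share (sum, product) = (3, 2),
% so they are indistinguishable from their observations alone.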
Many states share the same outcome modalities (e.g., states 2 and 11; see Table 3). The results demonstrate the ability of the active inference agent to perform optimal inference and planning in the face of this ambiguity. One of the outputs of ‘SPM_MDP_VB_XX.m’ is ‘MDP.P’, which returns the action probabilities the agent assigns at each time step for a given POMDP. This distribution was used to run multiple trials and evaluate the success rate of the active inference agent, as sketched below.
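A minimal sketch of such an evaluation loop follows. It assumes an SPM-style POMDP structure ‘mdp’ with transition tensor mdp.B{1}(s',s,a) and true initial state mdp.s, and a hypothetical goal-state index; the trial logic is illustrative, not the authors' script.

% Sketch of a success-rate evaluation using MDP.P (assumptions: 'mdp' is a
% fully specified SPM-style POMDP; 'goal' is a hypothetical goal-state index).
nTrials = 100; goal = 37; success = 0;
for k = 1:nTrials
    out = SPM_MDP_VB_XX(mdp);             % modified scheme used in the paper
    s   = mdp.s(1);                       % true initial state
    for tau = 1:size(out.P, 2)
        p = out.P(:, tau);                % action probabilities at step tau
        a = find(rand < cumsum(p), 1);    % sample an action from MDP.P
        q = mdp.B{1}(:, s, a);            % stochastic transition dynamics
        s = find(rand < cumsum(q), 1);    % sample the next (true) state
    end
    success = success + (s == goal);      % did the agent reach the goal?
end
rate = success / nTrials;                 % empirical success rate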
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Paul, A., Sajid, N., Gopalkrishnan, M., Razi, A. (2021). Active Inference for Stochastic Control. In: Kamp, M., et al. Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1524. Springer, Cham. https://doi.org/10.1007/978-3-030-93736-2_47