Abstract
Without a model, the application of reinforcement learning to the control of a dynamic system can be hampered by several shortcomings. In robotic applications where data is gathered in real time, the trials needed to learn a good policy can be costly and time-consuming. In this paper we describe a variable resolution model-based reinforcement learning approach that distributes sample points in the state space in proportion to the effect of actions. In this way the base learner economises on storage while still approximating an effective model. Our approach is conducive to including background knowledge, and we show how different types of background knowledge can be used to speed up learning in this setting. In particular, we show good performance for a weak type of background knowledge by initially overgeneralising local experience.
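As a rough, hypothetical illustration of the idea summarised above (not the implementation described in the paper), the Python sketch below maintains a one-dimensional partition whose cell widths are refined to track the locally observed effect of actions, counts transitions between cells to build a discrete model, and runs value iteration over that model. The class name, the split rule, and the toy dynamics in the usage example are all assumptions made purely for illustration.

import bisect
import random
from collections import defaultdict


class VariableResolutionLearner:
    """Sketch of a 1-D variable-resolution model-based learner."""

    def __init__(self, low, high, actions, gamma=0.95):
        self.bounds = [low, high]          # cell boundaries, refined online
        self.actions = actions
        self.gamma = gamma
        self.n = defaultdict(int)          # (cell, action) -> visit count
        self.trans = defaultdict(lambda: defaultdict(int))  # (cell, action) -> {next cell: count}
        self.reward_sum = defaultdict(float)

    def _index(self, s):
        # Index of the cell containing state s, clamped to the partition.
        return max(0, min(bisect.bisect_right(self.bounds, s) - 1,
                          len(self.bounds) - 2))

    def cell(self, s):
        # A cell is identified by its lower boundary.
        return self.bounds[self._index(s)]

    def observe(self, s, a, r, s2):
        # Refine the partition so that cell width tracks the locally observed
        # effect of the action (a fuller implementation would also
        # redistribute the statistics of a cell when it is split).
        effect = abs(s2 - s)
        i = self._index(s)
        lo, hi = self.bounds[i], self.bounds[i + 1]
        if effect > 0 and hi - lo > 2 * effect:
            bisect.insort(self.bounds, (lo + hi) / 2.0)
        key = (self.cell(s), a)
        self.n[key] += 1
        self.trans[key][self.cell(s2)] += 1
        self.reward_sum[key] += r

    def plan(self, sweeps=100):
        # Value iteration over the discrete model implied by the counts.
        V = defaultdict(float)
        for _ in range(sweeps):
            new_V = defaultdict(float)
            for c in {c for (c, _) in self.n}:
                q = []
                for a in self.actions:
                    key = (c, a)
                    if key not in self.n:
                        continue
                    total = self.n[key]
                    expected_next = sum(cnt / total * V[c2]
                                        for c2, cnt in self.trans[key].items())
                    q.append(self.reward_sum[key] / total
                             + self.gamma * expected_next)
                new_V[c] = max(q)
            V = new_V
        return V


if __name__ == "__main__":
    # Toy dynamics, purely illustrative: the state drifts by the chosen
    # action plus noise, and reward is highest near the origin.
    learner = VariableResolutionLearner(low=-1.0, high=1.0, actions=(-0.1, 0.1))
    s = 0.5
    for _ in range(2000):
        a = random.choice(learner.actions)
        s2 = min(1.0, max(-1.0, s + a + random.gauss(0.0, 0.02)))
        learner.observe(s, a, -abs(s2), s2)
        s = s2
    V = learner.plan()
    print(len(learner.bounds) - 1, "cells learned")

In such a sketch, background knowledge could enter by seeding the transition counts or the initial partition before any real experience is gathered; that is one plausible reading of the abstract's suggestion of initially overgeneralising local experience, not a description of the paper's mechanism.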
Cite this paper
Hengst, B. (2012). On-Line Model-Based Continuous State Reinforcement Learning Using Background Knowledge. In: Thielscher, M., Zhang, D. (eds.) AI 2012: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 7691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35101-3_72