Abstract
Complex control tasks can often be solved by decomposing them into hierarchies of manageable subtasks. Such decompositions require designers to decide how much human knowledge should be used to help learn the resulting components. On one hand, encoding human knowledge requires manual effort and may incorrectly constrain the learner’s hypothesis space or guide it away from the best solutions. On the other hand, it may make learning easier and enable the learner to tackle more complex tasks. This article examines the impact of this trade-off in tasks of varying difficulty. A space laid out by two dimensions is explored: (1) how much human assistance is given and (2) how difficult the task is. In particular, the neuroevolution learning algorithm is enhanced with three different methods for learning the components that result from a task decomposition. The first method, coevolution, is mostly unassisted by human knowledge. The second method, layered learning, is highly assisted. The third method, concurrent layered learning, is a novel combination of the first two that attempts to exploit human knowledge while retaining some of coevolution’s flexibility. Detailed empirical results are presented comparing and contrasting these three approaches on two versions of a complex task, namely robot soccer keepaway, that differ in difficulty of learning. These results confirm that, given a suitable task decomposition, neuroevolution can master difficult tasks. Furthermore, they demonstrate that the appropriate level of human assistance depends critically on the difficulty of the problem.
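To make the contrast between the two assisted approaches concrete, the sketch below is a minimal illustration only, not the authors' code: the `evolve` callback and all names are hypothetical. Under that assumption, it shows how the training loops might differ: standard layered learning freezes each learned component before the next one is trained, whereas concurrent layered learning keeps previously introduced components evolving as new ones are added.

```python
# Illustrative sketch only (hypothetical names, not the authors' implementation).
# `evolve` is assumed to be a caller-supplied neuroevolution step that adapts the
# `trainable` components while holding the `frozen` ones fixed.

def layered_learning(components, evolve, generations_per_layer):
    """Train components one at a time, freezing each before the next is learned."""
    frozen = []
    for component in components:
        for _ in range(generations_per_layer):
            evolve(trainable=[component], frozen=frozen)  # only the new component adapts
        frozen.append(component)                          # fix it for later layers
    return frozen

def concurrent_layered_learning(components, evolve, generations_per_layer):
    """Introduce components on the same schedule, but keep all of them evolving."""
    active = []
    for component in components:
        active.append(component)
        for _ in range(generations_per_layer):
            evolve(trainable=active, frozen=[])           # every introduced component adapts
    return active
```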
Editor: Robert Holte
Cite this article
Whiteson, S., Kohl, N., Miikkulainen, R. et al. Evolving Soccer Keepaway Players Through Task Decomposition. Mach Learn 59, 5–30 (2005). https://doi.org/10.1007/s10994-005-0460-9