Learning to Play Soccer From Scratch: Sample-Efficient Emergent Coordination Through Curriculum-Learning and Competition
2021 IEEE/RSJ International Conference on Intelligent Robots and …, 2021•ieeexplore.ieee.org
This work proposes a scheme for learning complex multi-agent behaviors in a sample-efficient manner, applied to 2v2 soccer. The problem is formulated as a Markov game and solved using deep reinforcement learning. We propose a basic multi-agent extension of TD3 for learning the policy of each player in a decentralized manner. To ease learning, the task of 2v2 soccer is divided into three stages: 1v0, 1v1 and 2v2. Learning in the multi-agent stages (1v1 and 2v2) uses agents trained in a previous stage as fixed opponents. In addition, we propose two techniques that raise performance significantly: experience sharing, in which experience collected by a fixed opponent trained in a previous stage is also used to train the agent currently learning, and a form of frame-skipping. Our results show that high-quality soccer play can be obtained with our approach in just under 40M interactions. A summarized video of the resulting game play can be found at https://youtu.be/pScrKNqfELE.
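The two auxiliary techniques named in the abstract, experience sharing and frame-skipping, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `SharedReplayBuffer`, `store_step` and `repeat_action` are hypothetical, frame-skipping is assumed here to mean repeating one action for several environment steps and summing the rewards, and the opponent's transitions are assumed to be expressed in the same observation frame as the learner's. The abstract does not specify any of these details.

```python
import random
from collections import deque
from typing import Callable, Deque, List, Tuple

# (obs, action, reward, next_obs, done); the layout is an assumption for illustration.
Transition = Tuple[tuple, tuple, float, tuple, bool]

# Curriculum order stated in the abstract.
STAGES = ["1v0", "1v1", "2v2"]


class SharedReplayBuffer:
    """FIFO replay buffer (hypothetical) holding learner and opponent transitions."""

    def __init__(self, capacity: int) -> None:
        self._data: Deque[Transition] = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._data.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement, capped at the buffer size.
        return random.sample(list(self._data), min(batch_size, len(self._data)))

    def __len__(self) -> int:
        return len(self._data)


def store_step(
    buffer: SharedReplayBuffer,
    learner_transition: Transition,
    opponent_transition: Transition,
    share_experience: bool,
) -> None:
    """Store the learner's transition and, if sharing is on, the fixed opponent's too."""
    buffer.add(learner_transition)
    if share_experience:
        # Assumption: the opponent's observation is already in the learner's frame.
        buffer.add(opponent_transition)


def repeat_action(
    step_fn: Callable[[tuple], Tuple[tuple, float, bool]],
    action: tuple,
    skip: int,
) -> Tuple[tuple, float, bool]:
    """Frame-skipping as action repetition: apply `action` for up to `skip` (>= 1) steps."""
    total_reward = 0.0
    obs: tuple = ()
    done = False
    for _ in range(skip):
        obs, reward, done = step_fn(action)
        total_reward += reward
        if done:  # stop early when the episode ends
            break
    return obs, total_reward, done
```

In a 1v1 or 2v2 stage, a training loop would call `store_step` once per decision step with `share_experience=True`, then draw mini-batches from the same buffer for the TD3 update of the learning agent. Whether shared transitions are weighted or mixed differently from the learner's own is left open here.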