Transfer via inter-task mappings in policy search reinforcement learning

Published: 14 May 2007

Abstract

The ambitious goal of transfer learning is to accelerate learning on a target task after training on a different, but related, source task. While many past transfer methods have focused on transferring value functions, this paper presents a method for transferring policies across tasks with different state and action spaces. In particular, this paper utilizes transfer via inter-task mappings for policy search methods (TVITM-PS) to construct a transfer functional that translates a population of neural network policies trained via policy search from a source task to a target task. Empirical results in robot soccer Keepaway and Server Job Scheduling show that TVITM-PS can markedly reduce learning time when full inter-task mappings are available. The results also demonstrate that TVITM-PS still succeeds when given only incomplete inter-task mappings. Furthermore, we present a novel method for learning such mappings when they are not available, and give results showing that the learned mappings perform comparably to hand-coded mappings.
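
To make the idea concrete, here is a minimal sketch of how an inter-task mapping can seed a target-task policy from a source-task policy. It assumes fully connected, single-hidden-layer policy networks and hand-supplied mappings chi_X (target state variables to source state variables) and chi_A (target actions to source actions); the function names, array shapes, and example mappings below are illustrative assumptions, not the authors' implementation or the actual Keepaway mappings.

    import numpy as np

    def transfer_policy(src_in_w, src_out_w, chi_X, chi_A):
        """Seed a target-task policy network from a source-task policy using
        inter-task mappings (illustrative sketch, not the authors' code).

        src_in_w  : (n_src_state, n_hidden) input-to-hidden weights of the source policy
        src_out_w : (n_hidden, n_src_actions) hidden-to-output weights
        chi_X     : dict {target state variable -> similar source state variable, or None}
        chi_A     : dict {target action -> similar source action, or None}
        """
        n_hidden = src_in_w.shape[1]
        tgt_in_w = np.zeros((len(chi_X), n_hidden))
        tgt_out_w = np.zeros((n_hidden, len(chi_A)))

        # Each mapped target state variable inherits the weights of its source analogue;
        # unmapped (novel) inputs get small random weights so they are initially ignored.
        for t, s in chi_X.items():
            tgt_in_w[t] = src_in_w[s] if s is not None else 0.01 * np.random.randn(n_hidden)

        # Likewise, each mapped target action inherits the outgoing weights of its
        # corresponding source action.
        for t, s in chi_A.items():
            tgt_out_w[:, t] = src_out_w[:, s] if s is not None else 0.01 * np.random.randn(n_hidden)

        return tgt_in_w, tgt_out_w

    # Hypothetical example: a 13-input, 3-action source policy seeded into a
    # 19-input, 4-action target task; the mappings here are placeholders chosen
    # only to show the mechanics, not the paper's Keepaway mappings.
    rng = np.random.default_rng(0)
    src_in_w = rng.standard_normal((13, 8))
    src_out_w = rng.standard_normal((8, 3))
    chi_X = {t: (t if t < 13 else t - 6) for t in range(19)}
    chi_A = {0: 0, 1: 1, 2: 2, 3: 2}
    tgt_in_w, tgt_out_w = transfer_policy(src_in_w, src_out_w, chi_X, chi_A)
    print(tgt_in_w.shape, tgt_out_w.shape)  # (19, 8) (8, 4)

In the paper's setting, networks constructed this way are not used as-is: the transferred population serves as the starting point for continued policy search in the target task, which is where the reported reductions in learning time come from.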

Information

Published In

AAMAS '07: Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems
May 2007
1585 pages
ISBN: 9788190426275
DOI: 10.1145/1329125
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • IFAAMAS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

AAMAS '07
Sponsor: IFAAMAS

Acceptance Rates

Overall Acceptance Rate: 1,155 of 5,036 submissions, 23%
