Computer Science > Machine Learning

arXiv:2410.24108 (cs)

[Submitted on 31 Oct 2024]

Title: Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers

Authors: Kai Yan, Alexander G. Schwing, Yu-Xiong Wang

Abstract: Decision Transformers have recently emerged as a new and compelling paradigm for offline Reinforcement Learning (RL), completing a trajectory in an autoregressive way. While improvements have been made to overcome initial shortcomings, online finetuning of decision transformers has been surprisingly under-explored. The widely adopted state-of-the-art Online Decision Transformer (ODT) still struggles when pretrained with low-reward offline data. In this paper, we theoretically analyze the online finetuning of the decision transformer, showing that the commonly used Return-To-Go (RTG), when far from the expected return, hampers the online finetuning process. This problem, however, is well-addressed by the value function and advantage of standard RL algorithms. As suggested by our analysis, we find in our experiments that simply adding TD3 gradients to the finetuning process of ODT effectively improves its online finetuning performance, especially if ODT is pretrained with low-reward offline data. These findings provide new directions to further improve decision transformers.
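
For readers unfamiliar with the term, the Return-To-Go at a timestep is conventionally the sum of rewards from that timestep to the end of the trajectory. The following is a minimal sketch of that quantity only, assuming undiscounted rewards; the function name `returns_to_go` is illustrative and is not taken from the paper or its code.

```python
from typing import List


def returns_to_go(rewards: List[float]) -> List[float]:
    """Undiscounted return-to-go for each timestep of one trajectory.

    rtg[t] = rewards[t] + rewards[t + 1] + ... + rewards[-1]
    (Illustrative helper; an assumption, not the authors' implementation.)
    """
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the sum accumulated so far.
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        rtg[t] = running
    return rtg


if __name__ == "__main__":
    print(returns_to_go([1.0, 2.0, 3.0]))
```

For the rewards `[1.0, 2.0, 3.0]` this yields `[6.0, 5.0, 3.0]`. A decision transformer is conditioned on such a target value, which is why a target far from the return the policy can actually achieve is a concern in the analysis summarized above.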

Comments:	Accepted as NeurIPS 2024 spotlight. 33 pages, 26 figures
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2410.24108 [cs.LG]
	(or arXiv:2410.24108v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.24108

Submission history

From: Kai Yan
[v1] Thu, 31 Oct 2024 16:38:51 UTC (27,114 KB)
