Computer Science > Machine Learning

arXiv:2409.06985 (cs)

[Submitted on 11 Sep 2024]

Title:Enhancing Cross-domain Pre-Trained Decision Transformers with Adaptive Attention

Authors:Wenhao Zhao, Qiushui Xu, Linjie Xu, Lei Song, Jinyu Wang, Chunlai Zhou, Jiang Bian

Abstract:Recently, pre-training decision transformers (DT) on a different domain, such as natural language text, has attracted significant attention in offline reinforcement learning (Offline RL). Although this cross-domain pre-training approach achieves superior performance compared to training from scratch in environments requiring short-term planning ability, the mechanisms by which pre-training benefits the fine-tuning phase remain unclear. Furthermore, we point out that the cross-domain pre-training approach hinders the extraction of distant information in environments like PointMaze that require long-term planning ability, leading to performance that is much worse than training DT from scratch. This work first analyzes these issues and finds that the Markov Matrix, a component that exists in pre-trained attention heads, is the key to explaining the significant performance disparity of pre-trained models across different planning abilities. Inspired by our analysis, we propose a general method, GPT-DTMA, which equips a pre-trained DT with Mixture of Attention (MoA) to enable adaptive learning and accommodate diverse attention requirements during fine-tuning. Extensive experiments demonstrate the effectiveness of GPT-DTMA: it achieves superior performance in short-term environments compared to baselines, and in long-term environments, it mitigates the negative impact caused by the Markov Matrix, achieving results comparable to those of DT trained from scratch.
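The abstract does not spell out how Mixture of Attention is wired into the model, so the following is only a minimal NumPy sketch of the general idea: several causal attention modules whose outputs are blended by a per-token softmax gate. The function names (`causal_attention`, `mixture_of_attention`), the gating design, and all shapes are assumptions made for illustration, not the authors' GPT-DTMA implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; masked (-inf) entries become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a (T, d) sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Mask out future positions so token t only attends to tokens <= t.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v

def mixture_of_attention(x, experts, w_gate):
    """Blend several attention modules with a per-token softmax gate.

    Hypothetical design: the gate is a linear map of each token's input.
    """
    gate = softmax(x @ w_gate)                       # (T, n_experts)
    outs = np.stack(
        [causal_attention(x, *params) for params in experts], axis=1
    )                                                # (T, n_experts, d)
    return (gate[..., None] * outs).sum(axis=1), gate

rng = np.random.default_rng(0)
T, d, n_experts = 6, 8, 3
x = rng.normal(size=(T, d))
experts = [
    tuple(rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
    for _ in range(n_experts)
]
w_gate = rng.normal(scale=0.1, size=(d, n_experts))
mixed, gate = mixture_of_attention(x, experts, w_gate)
```

In a sketch like this, the gate weights sum to one for every token, so each position can lean on whichever attention module suits it, which is one plausible reading of "adaptive learning" of attention during fine-tuning.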

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2409.06985 [cs.LG]
	(or arXiv:2409.06985v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2409.06985

Submission history

From: Qiushui Xu
[v1] Wed, 11 Sep 2024 03:18:34 UTC (2,154 KB)
