Computer Science > Computation and Language

arXiv:2404.18543 (cs)

[Submitted on 29 Apr 2024]

Title: Time Machine GPT

Authors: Felix Drinkall, Eghbal Rahimikia, Janet B. Pierrehumbert, Stefan Zohren

Abstract: Large language models (LLMs) are often trained on extensive, temporally indiscriminate text corpora, reflecting the lack of datasets with temporal metadata. This approach is not aligned with the evolving nature of language. Conventional methods for creating temporally adapted language models often depend on further pre-training static models on time-specific data. This paper presents a new approach: a series of point-in-time LLMs called Time Machine GPT (TiMaGPT), specifically designed to be nonprognosticative. This ensures they remain uninformed about future factual information and linguistic changes. This strategy is beneficial for understanding language evolution and is of critical importance when applying models in dynamic contexts, such as time-series forecasting, where foresight of future information can prove problematic. We provide access to both the models and training datasets.

Comments:	NAACL Findings 2024
Subjects:	Computation and Language (cs.CL); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG)
ACM classes:	I.2.1, I.2.7
Cite as:	arXiv:2404.18543 [cs.CL]
	(or arXiv:2404.18543v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2404.18543

Submission history

From: Felix Drinkall
[v1] Mon, 29 Apr 2024 09:34:25 UTC (7,842 KB)
