Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs

📖 Abstract

Time-R1 introduces slow-thinking reasoning to time series forecasting. We propose a two-stage reinforcement fine-tuning framework that combines a supervised warm-up stage with policy optimization via GRIP, a group-based sampling strategy for multi-step reasoning. Our model significantly improves forecasting accuracy across diverse datasets, demonstrating the effectiveness of training LLMs for structured temporal reasoning.

This repository contains the official code for our paper:

Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs
Yucong Luo, Yitong Zhou, Mingyue Cheng, Jiahao Wang, Daoyu Wang, Jintao Zhang, Tingyue Pan

Updates/News:

🚩 News (Jun. 2025): The final version of the paper was submitted to arXiv.

🚩 News (May 2025): The Time-R1 repository was released and fully open-sourced on GitHub.

🌟 Overview

Figure: Overview of the Time-R1 framework.

Large Language Models (LLMs) demonstrate impressive general capabilities but often lack the temporal reasoning needed for forecasting tasks. Time-R1 addresses this by introducing a novel two-stage reinforcement fine-tuning (RFT) curriculum, guided by a custom-designed multi-objective reward framework that explicitly shapes temporal reasoning. Our approach progressively develops:

  1. (Stage 1: SFT for Warm-up Adaptation) Foundational skills through supervised fine-tuning, where LLMs learn temporal analysis from synthetic CoT data, ensuring proper structure and formatting (an illustrative prompt/response layout follows this list).
  2. (Stage 2: RL for Exploring Effective Reasoning Patterns) Advanced forecasting via RL, with rewards based on ground-truth alignment, multi-horizon accuracy, and domain principles. GRIP (Group-based Relative Importance for Policy Optimization) enhances reasoning-path exploration through non-uniform sampling and adaptive weighting (a toy sketch follows the figure caption below).
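
To make the slow-thinking setup concrete, below is a minimal, hypothetical prompt/response layout for Stage 1. The tag names (<think>, <answer>), wording, and values are illustrative assumptions, not the repository's actual templates.

# Hypothetical slow-thinking forecasting exchange (illustrative only).
history = [112.0, 118.5, 121.2, 119.8, 125.3]

prompt = (
    "You are a time series forecasting assistant. Think step by step.\n"
    f"History (last {len(history)} observations): {history}\n"
    "Analyze trend, seasonality, and anomalies inside <think>...</think>, "
    "then output the next 3 values inside <answer>...</answer>."
)

# A completion with the expected structure would look like:
completion = (
    "<think>The series rises by roughly 3 units per step; the dip at step 4 "
    "looks like noise rather than a regime change.</think>\n"
    "<answer>[127.9, 130.6, 133.2]</answer>"
)
print(prompt, completion, sep="\n\n")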

Figure: The GRIP optimization framework.
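
As a rough illustration of the group-based idea, the sketch below computes advantages for a group of sampled reasoning paths using a non-uniform (softmax) weighting in place of the uniform group mean. The weighting scheme, function name, and shapes are assumptions chosen for illustration, not the paper's exact GRIP formulation.

import torch

def grip_style_advantages(rewards: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    # `rewards`: one scalar reward per sampled reasoning path in a group,
    # shape [group_size]. Vanilla group-relative methods subtract the plain
    # group mean; here the baseline is a reward-weighted (non-uniform) mean,
    # so stronger paths dominate the reference point (illustrative choice).
    weights = torch.softmax(rewards / temperature, dim=0)
    baseline = (weights * rewards).sum()
    # Center on the weighted baseline and normalize for stable updates.
    return (rewards - baseline) / (rewards.std() + 1e-6)

# Example: four sampled reasoning paths for one prompt.
print(grip_style_advantages(torch.tensor([0.2, 0.8, 0.5, 0.1])))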

Experiments show that Time-R1 significantly improves forecasting accuracy and generalization across multiple real-world datasets.


⚙️ Key Features

  • Slow-Thinking Time Series Reasoning: Trains LLMs to perform deliberate, step-by-step temporal analysis for forecasting tasks.
  • Two-Stage RFT Framework: Combines warm-up supervised fine-tuning (SFT) with reinforcement learning (RL) for progressive capability building.
  • GRIP (Group-based Relative Importance for Policy Optimization): Introduces non-uniform sampling and adaptive weighting to enhance reasoning-path exploration and model robustness.
  • Fine-Grained Multi-Objective Rewards: Designed to improve temporal coherence, multi-horizon accuracy, and alignment with domain-specific forecasting principles (a toy combination is sketched after this list).
  • Strong Forecasting Performance: Extensive experiments on real-world datasets demonstrate significant improvements over baseline methods through the slow-thinking paradigm.
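
The toy function below combines illustrative versions of these reward terms: per-horizon ground-truth alignment, a trend-coherence term, and a format check. All weights and term definitions are assumptions; the paper specifies its own fine-grained reward design.

import numpy as np

def toy_multi_objective_reward(pred, target, well_formatted,
                               w_acc=0.7, w_trend=0.2, w_fmt=0.1):
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Ground-truth alignment: per-horizon squared error mapped into (0, 1].
    acc = float(np.mean(1.0 / (1.0 + (pred - target) ** 2)))
    # Temporal coherence: fraction of steps whose predicted change matches
    # the direction of the true change.
    trend = float(np.mean(np.sign(np.diff(pred)) == np.sign(np.diff(target))))
    # Structural term: did the model follow the required output format?
    fmt = 1.0 if well_formatted else 0.0
    return w_acc * acc + w_trend * trend + w_fmt * fmt

print(toy_multi_objective_reward([1.0, 1.2, 1.1], [1.0, 1.3, 1.0], True))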

🚀 Quick Start

Installation

We recommend Python 3.10+ in a clean conda environment. System prerequisites before training or inference:

  • CUDA ≥ 12.4
  • cuDNN ≥ 9.8.0
conda create -n time-r1 python==3.10
conda activate time-r1

git clone https://github.com/lqzxt/Time-R1.git

# Install verl framework
cd Time-R1
pip install --no-deps -e .
pip install -r requirements.txt

🛠️ Training

Time-R1 RL Training

# Run training
bash scripts/time-r1.sh

📈 Evaluation

cd Time-R1/eval
python main.py

🙏 Acknowledgements

  • 🧠 Verl: A flexible and efficient RLHF framework used for reinforcement learning with GRIP.
  • 🦙 LLaMA Factory: Streamlined interface for supervised fine-tuning and RLHF on open LLMs.
  • 🔢 Qwen2.5 Models: Open-source LLMs that serve as our forecasting backbone.
  • 🔧 vLLM: High-throughput LLM inference engine used during RL rollout.
