- 🤖 Agentic RAG-R1: Enhance Agentic RAG Reasoning Capacity via Reinforcement Learning 🚀
Agentic RAG‑R1 is an open‑source initiative to build an Agentic Retrieval‑Augmented Generation (RAG) system by endowing a base language model with autonomous search & reasoning skills through reinforcement learning (currently using the GRPO algorithm).
Agentic RAG combines two powerful concepts:
- Retrieval‑Augmented Generation (RAG): Combines generative power with on‑the‑fly retrieval from external knowledge bases, ensuring factual and up‑to‑date answers.
- Agentic AI: Gives the model the ability to decide when to retrieve, what to retrieve, and how to weave the retrieved evidence into its reasoning.
Our architecture is inspired by TC‑RAG and features an agent memory stack that orchestrates the full deliberation loop, supporting the following actions:
- Plan (❌)
- Reasoning (✅)
- Backtrack (✅)
- Summary (✅)
- Tool Observation – wiki/document/knowledge‑graph search, etc. (✅)
- Conclusion (✅)
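As a rough illustration of how such a memory stack can drive the deliberation loop, the Python sketch below pushes each step onto a stack, pops on Backtrack, and stops at Conclusion. The class, function, and action names here are illustrative assumptions rather than the actual implementation under `src/models`.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStack:
    """Illustrative agent memory stack: each deliberation step is pushed,
    Backtrack pops the most recent step, Conclusion ends the loop."""
    steps: list[tuple[str, str]] = field(default_factory=list)

    def push(self, action: str, content: str) -> None:
        self.steps.append((action, content))

    def backtrack(self) -> None:
        if self.steps:
            self.steps.pop()  # discard the most recent (presumably flawed) step

def deliberate(llm_step, max_steps: int = 16) -> str:
    """Run the deliberation loop. `llm_step` is a hypothetical callable that
    maps the current stack contents to the next (action, content) pair."""
    stack = MemoryStack()
    for _ in range(max_steps):
        action, content = llm_step(stack.steps)
        if action == "Backtrack":
            stack.backtrack()
        elif action == "Conclusion":
            return content  # final answer terminates the loop
        else:  # Reasoning / Summary / Tool Observation
            stack.push(action, content)
    return "No conclusion reached within the step budget."
```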
Motivated by DeepSeek-R1, we apply GRPO (Group Relative Policy Optimization) to reinforce the agent's choice of reasoning steps and retrieval actions, effectively boosting both search depth and answer quality.
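To make the group-relative idea concrete, the sketch below standardizes the rewards of a group of rollouts sampled for the same question into per-rollout advantages, which is the core of GRPO and removes the need for a learned critic. This is a simplified view for intuition, not the project's training code.

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Compute GRPO-style advantages for one group of rollouts.

    `rewards` has shape (group_size,): one scalar reward per rollout sampled
    for the same question. Each rollout's advantage is its reward standardized
    against the group mean and standard deviation, so no value function is needed.
    """
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: 4 rollouts generated for the same question
advantages = group_relative_advantages(torch.tensor([1.0, 0.0, 0.5, 1.0]))
print(advantages)
```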
We use conda to manage the environment. Follow these steps to set up:
conda create -n AgenticRAG python=3.11 -y
conda activate AgenticRAG
pip install -r requirements.txt
We provide our search tool repository ArtSearch as the search engine, which supports retrieval of information from Wikipedia. You can follow the instructions in that repository to deploy a local instance of the search system.
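Assuming the locally deployed ArtSearch instance exposes an HTTP search endpoint, a query from Python might look like the sketch below; the URL, path, and payload keys are placeholders, so check the ArtSearch repository for the actual interface.

```python
import requests

def search_wikipedia(query: str, top_k: int = 5) -> list[str]:
    """Query a locally deployed ArtSearch instance.
    The endpoint path and payload keys are assumptions for illustration."""
    resp = requests.post(
        "http://localhost:8000/search",  # placeholder URL for the local deployment
        json={"query": query, "top_k": top_k},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])

print(search_wikipedia("Who proposed the transformer architecture?"))
```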
.
├── ArtSearch # Search tool integration
├── checkpoints # Model checkpoints
├── examples # Example use cases
├── experiments
│ ├── evaluation # Evaluation scripts and results
│ └── training # Training configurations
├── README.md
├── requirements.txt
├── script
│ ├── evaluation # Evaluation scripts
│ ├── run_server.sh # Server deployment script
│ └── training # Training scripts
├── service
│ ├── chat_client.py # Client for interacting with the model
│ └── chat_server.py # Server for hosting the model
├── src
│ ├── config # Configuration files
│ ├── data # Data processing utilities
│ ├── evaluation # Evaluation metrics and tools
│ ├── models # Model definitions
│ ├── train.py # Main training script
│ └── utils # Utility functions
Follow the steps below to get up and running with Agentic RAG‑R1.
Before you start, rename the file ".env_format" to ".env" and fill in the required environment variables.
- Zero‑2 Mode
./script/training/train_zero2.sh
- Zero‑3 Mode
./script/training/train_zero3.sh
- Example Mode
Coming soon~
- Server Mode
Launch the chat server:
./script/run_server.sh
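Once the server is running, you can interact with it via `service/chat_client.py`, or send a request directly. The sketch below assumes an OpenAI-compatible `/v1/chat/completions` endpoint on port 8000, which may not match your actual server configuration.

```python
import requests

# Hypothetical request against the locally hosted model; adjust the URL, port,
# and model name to match your run_server.sh configuration.
payload = {
    "model": "Agentic-RAG-R1",
    "messages": [
        {"role": "user", "content": "What are the first-line treatments for type 2 diabetes?"}
    ],
}
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload, timeout=120)
print(resp.json()["choices"][0]["message"]["content"])
```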
- LoRA Tuning Support 🔧: Fine-tune efficiently with Low-Rank Adaptation (LoRA)
- Model Quantization Support 💻: Quantize models to NF4 and more
- Custom Agent Tools 🛠️: Integrate your own tools and personal RAG datasets
- Distributed Training 🌐: Support for DeepSpeed ZeRO Stage 2 and Stage 3
- Efficient Resource Usage 💻: Support for models up to 32B parameters using only 2 A100 GPUs
- Tool Calling Reward 🎯: Enhanced reward model that includes:
  - Accuracy reward
  - Format reward
  - RAG accuracy reward using the RAGAS framework

  The total reward is calculated as follows (a minimal sketch of this combination appears after the list):
  $$r_{total} = r_{accuracy} + r_{format} + r_{rag}$$
- TC-RAG Integration 🔗: Use TC-RAG as the rollout generator
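As a minimal sketch of how the three reward terms could be combined per rollout (the individual scoring functions here are placeholders; the project's actual reward computation lives in its training code):

```python
def total_reward(answer_correct: bool, format_ok: bool, rag_score: float) -> float:
    """Combine the three terms as r_total = r_accuracy + r_format + r_rag.

    `rag_score` stands in for a retrieval-faithfulness score, e.g. one produced
    with the RAGAS framework, scaled to [0, 1]. The binary terms and the plain
    sum mirror the formula above; the real implementation may weight them differently.
    """
    r_accuracy = 1.0 if answer_correct else 0.0
    r_format = 1.0 if format_ok else 0.0
    r_rag = rag_score
    return r_accuracy + r_format + r_rag

print(total_reward(answer_correct=True, format_ok=True, rag_score=0.8))  # 2.8
```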
We have made our training logs publicly available at: SwanLab Training Log
Our Qwen 2.5-7B-Instruct model was evaluated on the MedQA test set using Qwen‑2.5‑72B as the judge:
| Configuration | Format Accuracy | Answer Accuracy |
|---|---|---|
| Before fine-tuning | 39% | 84% |
| Before fine-tuning + search | 56% | 79% |
| After fine-tuning (200 steps) + search | 92% | 87% |
- Support PPO and SFT
- Add more tools
- [Additional planned features]
The concept of Agentic-RAG-R1 is inspired by DeepSeek-R1 and TC-RAG. We sincerely appreciate these teams' contributions to open-source research and development. This work is concurrent with Search-R1 and ReSearch.
Supervisors: Junfeng Zhao, Xu Chu, Yasha Wang
Affiliation: Key Laboratory of High Confidence Software Technologies (Peking University), School of Computer Science, Peking University, China
If you use this work in your research, please cite:
@misc{Agentic_RAG_R1,
  title        = {Agentic RAG-R1: Enhance Agentic RAG Reasoning Capacity via Reinforcement Learning},
  author       = {Xinke Jiang and Jiaran Gao and Rihong Qiu and Wentao Zhang and Yue Fang and Hongxin Ding},
  year         = {2025},
  howpublished = {\url{https://github.com/jiangxinke/Agentic-RAG-R1}},
  note         = {GitHub repository},
}
This project is licensed under the Apache License. See the LICENSE file for details.