GraphRAG is a popular 🔥🔥🔥 and powerful 💪💪💪 RAG system! 🚀💡 Inspired by systems like Microsoft's, graph-based RAG is unlocking endless possibilities in AI.
Our project focuses on modularizing and decoupling these methods 🧩 to unveil the mystery 🕵️🔍✨ behind them and share fun and valuable insights! 🤩💫 Our project 🔨 is included in Awesome Graph-based RAG.
If you find our work helpful, please kindly cite our paper.
Download the datasets: GraphRAG-dataset
# Clone the repository from GitHub
git clone https://github.com/JayLZhou/GraphRAG.git
cd GraphRAG
You can run different GraphRAG methods by specifying the corresponding configuration file (.yaml).
python main.py -opt Option/Method/RAPTOR.yaml -dataset_name your_dataset
The following methods are available, and each can be run using the same command format:
python main.py -opt Option/Method/<METHOD>.yaml -dataset_name your_dataset
Replace <METHOD> with one of the following:
- Dalk
- GR
- LGraphRAG (Local search in GraphRAG)
- GGraphRAG (Global search in GraphRAG)
- HippoRAG
- KGP
- LightRAG
- RAPTOR
- ToG
For example, to run GraphRAG:
python main.py -opt Option/Method/GraphRAG.yaml -dataset_name your_dataset
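To run several methods back to back, the command format above can be scripted. The following is a minimal sketch and not part of the repository: the helpers `build_command` and `run_all` are hypothetical, and the sketch assumes every name in the method list has a matching file under `Option/Method/`. By default it only prints the commands.

```python
import subprocess
import sys

# Method names copied from the list above; each is assumed to have a
# matching Option/Method/<METHOD>.yaml file in the repository.
METHODS = ["Dalk", "GR", "LGraphRAG", "GGraphRAG", "HippoRAG",
           "KGP", "LightRAG", "RAPTOR", "ToG"]


def build_command(method: str, dataset_name: str) -> list[str]:
    """Build the argument list for one run (hypothetical helper)."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    return [sys.executable, "main.py",
            "-opt", f"Option/Method/{method}.yaml",
            "-dataset_name", dataset_name]


def run_all(dataset_name: str, dry_run: bool = True) -> list[list[str]]:
    """Print the command for every method and, unless dry_run, execute it."""
    commands = [build_command(method, dataset_name) for method in METHODS]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the loop at the first failing method.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_all("your_dataset")
```

Pass `dry_run=False` from the repository root to actually launch the runs one after another.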
Make sure the required dependencies are installed (the default environment name is digimon):
conda env create -f experiment.yml -n your_experiment_name
GraphRAG supports both cloud-based and local deployment of LLMs:
- Cloud-based models: OpenAI (e.g., gpt-4, gpt-3.5-turbo)
- Locally deployed models: Ollama and LlamaFactory
To use a local model, set api_type to open_llm in the configuration file.
llm:
  api_type: "openai/open_llm" # Set to "openai" or "open_llm" (use "open_llm" for Ollama and LlamaFactory)
  model: "YOUR_LOCAL_MODEL_NAME"
  base_url: "YOUR_LOCAL_URL" # Change this for local models
  api_key: "YOUR_API_KEY" # Not required for local models
For LlamaFactory or Ollama, ensure the model is correctly installed and running in your local environment.
You can refer to the README of LlamaFactory.
llm:
  api_type: "open_llm" # Options: "openai" or "open_llm" (For Ollama and LlamaFactory)
  model: "YOUR_LOCAL_MODEL_NAME"
  base_url: "YOUR_LOCAL_URL" # Change this for local models
  api_key: "ANY_THING_IS_OKAY" # Not required for local models
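
Before launching a run, it can help to sanity-check the `llm` section. The sketch below is an illustration only: `validate_llm_config` is a hypothetical helper that works on an already-parsed dictionary, and its rules (a base URL is needed for local models, an API key for OpenAI) are assumptions inferred from the comments in the configuration above, not checks performed by the project itself.

```python
ALLOWED_API_TYPES = {"openai", "open_llm"}


def validate_llm_config(llm: dict) -> list[str]:
    """Return a list of problems found in the `llm` section (empty if none)."""
    problems = []
    api_type = llm.get("api_type")
    if api_type not in ALLOWED_API_TYPES:
        problems.append(
            f"api_type must be one of {sorted(ALLOWED_API_TYPES)}, got {api_type!r}"
        )
    if not llm.get("model"):
        problems.append("model is missing")
    # Assumption: local models (Ollama, LlamaFactory) need an explicit base_url.
    if api_type == "open_llm" and not llm.get("base_url"):
        problems.append("base_url is required for local models")
    # Assumption: only the cloud API needs a real key.
    if api_type == "openai" and not llm.get("api_key"):
        problems.append("api_key is required for OpenAI models")
    return problems


if __name__ == "__main__":
    local_config = {
        "api_type": "open_llm",
        "model": "YOUR_LOCAL_MODEL_NAME",
        "base_url": "YOUR_LOCAL_URL",
        "api_key": "ANY_THING_IS_OKAY",
    }
    print(validate_llm_config(local_config))
```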
We select the following Graph RAG methods:
Based on the entity and relation information they store, we categorize graphs into the following types:
- Chunk Tree: A tree structure formed by document content and summary.
- Passage Graph: A relational network composed of passages, tables, and other elements within documents.
- KG: A knowledge graph (KG) is constructed by extracting entities and relationships from each chunk. It contains only entities and relations and is commonly represented as triples.
- TKG: A textual knowledge graph (TKG) is a specialized KG (following the same construction step as KG), which enriches entities with detailed descriptions and type information.
- RKG: A rich knowledge graph (RKG) further incorporates keywords associated with relations.
The criteria for the classification of graph types are as follows:
Graph Attributes | Chunk Tree | Passage Graph | KG | TKG | RKG |
---|---|---|---|---|---|
Original Content | ✅ | ✅ | ❌ | ❌ | ❌ |
Entity Name | ❌ | ❌ | ✅ | ✅ | ✅ |
Entity Type | ❌ | ❌ | ❌ | ✅ | ✅ |
Entity Description | ❌ | ❌ | ❌ | ✅ | ✅ |
Relation Name | ❌ | ❌ | ✅ | ❌ | ✅ |
Relation Keyword | ❌ | ❌ | ❌ | ❌ | ✅ |
Relation Description | ❌ | ❌ | ❌ | ✅ | ✅ |
Edge Weight | ❌ | ❌ | ✅ | ✅ | ✅ |
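
The classification table above can also be written down as data, which makes it easy to ask which attributes a graph type carries. This is a sketch for illustration: the names `GRAPH_TYPES`, `ATTRIBUTES`, `attributes_of`, and `graph_types_with` are hypothetical and do not come from the repository; the boolean values are copied row by row from the table.

```python
GRAPH_TYPES = ["Chunk Tree", "Passage Graph", "KG", "TKG", "RKG"]

# One row per attribute, one boolean per graph type, in GRAPH_TYPES order.
ATTRIBUTES = {
    "Original Content":     [True, True, False, False, False],
    "Entity Name":          [False, False, True, True, True],
    "Entity Type":          [False, False, False, True, True],
    "Entity Description":   [False, False, False, True, True],
    "Relation Name":        [False, False, True, False, True],
    "Relation Keyword":     [False, False, False, False, True],
    "Relation Description": [False, False, False, True, True],
    "Edge Weight":          [False, False, True, True, True],
}


def attributes_of(graph_type: str) -> list[str]:
    """List the attributes stored by the given graph type."""
    column = GRAPH_TYPES.index(graph_type)
    return [name for name, row in ATTRIBUTES.items() if row[column]]


def graph_types_with(attribute: str) -> list[str]:
    """List the graph types that store the given attribute."""
    row = ATTRIBUTES[attribute]
    return [graph_type for graph_type, present in zip(GRAPH_TYPES, row) if present]


if __name__ == "__main__":
    print(attributes_of("KG"))
    print(graph_types_with("Entity Description"))
```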
The retrieval stage plays the key role ‼️ in the entire GraphRAG process. ✨ The goal is to identify query-relevant content that supports the generation phase, enabling the LLM to provide more accurate responses.
💡💡💡 After thoroughly reviewing all implementations, we've distilled them into a set of 16 operators 🧩🧩. Each method then constructs its retrieval module by combining one or more of these operators 🧩.
We classify the operators into five categories, each offering a different way to retrieve and structure relevant information from graph-based data.
Retrieve entities (e.g., people, places, organizations) that are most relevant to the given query.
Name | Description | Example Methods |
---|---|---|
VDB | Select top-k nodes from the vector database | G-retriever, RAPTOR, KGP |
RelNode | Extract nodes from given relationships | LightRAG |
PPR | Run PPR on the graph, return top-k nodes with PPR scores | FastGraphRAG |
Agent | Utilizes LLM to find the useful entities | ToG |
Onehop | Selects the one-hop neighbor entities of the given entities | LightRAG |
Link | Return top-1 similar entity for each given entity | HippoRAG |
TF-IDF | Rank entities based on the TF-IDF matrix | KGP |
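
To make the PPR operator concrete, here is a minimal sketch of Personalized PageRank by power iteration, followed by a top-k selection. It is a generic textbook version, not the code used by any of the listed methods. The function names, the adjacency-dictionary input, and the default damping factor of 0.85 are assumptions made for this example; every neighbor is assumed to appear as a key in the dictionary, and the seed set must not be empty.

```python
def personalized_pagerank(adjacency, seeds, damping=0.85, iterations=50):
    """Score nodes by a random walk that restarts at the seed entities.

    adjacency: dict mapping each node to a list of its neighbors.
    seeds: non-empty set of query-relevant nodes used as restart targets.
    """
    nodes = list(adjacency)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)
    for _ in range(iterations):
        updated = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                # Dangling node: hand its score back to the seeds.
                for m in nodes:
                    updated[m] += damping * scores[n] * restart[m]
                continue
            share = damping * scores[n] / len(neighbors)
            for m in neighbors:
                updated[m] += share
        scores = updated
    return scores


def top_k(scores, k):
    """Return the k highest-scoring (node, score) pairs."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(top_k(personalized_pagerank(graph, {"a"}), 2))
```

Note that a seed is not guaranteed to receive the highest score: a well-connected neighbor of the seed can outrank it.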
Extract useful relationships for the given query.
Name | Description | Example Methods |
---|---|---|
VDB | Retrieve relationships by vector-database | LightRAG, G-retriever |
Onehop | Selects relationships linked by one-hop neighbors of the given selected entities | Local Search for MS GraphRAG |
Aggregator | Compute relationship scores from entity PPR matrix, return top-k | FastGraphRAG |
Agent | Utilizes LLM to find the useful relationships | ToG |
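
Two of the relationship operators are simple enough to sketch in a few lines. The code below is an illustration under stated assumptions, not the repository's implementation: relationships are modeled as `(head, relation, tail)` triples, the Onehop sketch keeps every triple that touches one of the given entities, and the Aggregator sketch scores a triple as the sum of its two endpoint entity scores, which is one plausible reading of the table rather than a confirmed formula. All function names are hypothetical.

```python
def one_hop_relationships(triples, entities):
    """Keep the triples that have at least one endpoint among the given entities."""
    selected = set(entities)
    return [t for t in triples if t[0] in selected or t[2] in selected]


def aggregate_relationship_scores(triples, entity_scores, k):
    """Score each triple from its endpoint entity scores and return the top k.

    Assumption: a triple's score is the sum of its head and tail scores;
    entities without a score count as 0.0.
    """
    scored = [
        (triple, entity_scores.get(triple[0], 0.0) + entity_scores.get(triple[2], 0.0))
        for triple in triples
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    triples = [
        ("a", "knows", "b"),
        ("b", "works_at", "c"),
        ("c", "located_in", "d"),
    ]
    print(one_hop_relationships(triples, ["a"]))
    print(aggregate_relationship_scores(triples, {"a": 0.5, "b": 0.3, "c": 0.1}, 2))
```

The entity scores passed to the aggregator would typically come from an entity operator such as PPR.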
Retrieve the most relevant text segments (chunks) related to the query.
Name | Description | Example Methods |
---|---|---|
Aggregator | Use the relationship scores and the relationship-chunk interactions to select the top-k chunks | HippoRAG |
FromRel | Return chunks containing given relationships | LightRAG |
Occurrence | Rank top-k chunks based on occurrence of both entities in relationships | Local Search for MS GraphRAG |
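
As an illustration of the Occurrence operator, the sketch below counts, for every chunk, how many of the given relationships have both of their entities mentioned in that chunk, then returns the top-k chunks. The data layout (a dictionary from chunk id to the set of entities it mentions, and relationships as entity pairs) and the function name are assumptions chosen for this example; ties are broken by chunk id so the result is deterministic.

```python
def rank_chunks_by_occurrence(chunk_entities, relationships, k):
    """Rank chunks by how many relationships have both entities in the chunk.

    chunk_entities: dict mapping chunk id to the set of entities it mentions.
    relationships: iterable of (head_entity, tail_entity) pairs.
    Chunks that match no relationship are dropped.
    """
    pairs = list(relationships)
    counts = {
        chunk_id: sum(1 for head, tail in pairs if head in entities and tail in entities)
        for chunk_id, entities in chunk_entities.items()
    }
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(chunk_id, count) for chunk_id, count in ranked if count > 0][:k]


if __name__ == "__main__":
    chunks = {"c1": {"a", "b"}, "c2": {"a", "b", "c"}, "c3": {"d"}}
    print(rank_chunks_by_occurrence(chunks, [("a", "b"), ("b", "c")], k=2))
```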
Extract a relevant subgraph for the given query.
Name | Description | Example Methods |
---|---|---|
KhopPath | Find k-hop paths with start and endpoints in the given entity set | DALK |
Steiner | Compute Steiner tree based on given entities and relationships | G-retriever |
AgentPath | Identify the most relevant k-hop paths to a given question, by using LLM to filter out the irrelevant paths | ToG |
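
The KhopPath idea can be sketched as a depth-first search that collects every simple path of at most k edges whose start and end both belong to the given entity set. This is an illustrative version written for this README, not DALK's code; the function name and the adjacency-dictionary input are assumptions. On an undirected graph each path is reported once per direction.

```python
def k_hop_paths(adjacency, entities, k):
    """Find simple paths of at most k edges that start and end in `entities`.

    adjacency: dict mapping each node to a list of its neighbors.
    """
    targets = set(entities)
    paths = []

    def extend(path):
        last = path[-1]
        if len(path) > 1 and last in targets:
            paths.append(list(path))
        if len(path) - 1 == k:
            return  # hop budget used up
        for neighbor in adjacency.get(last, []):
            if neighbor not in path:  # keep paths simple (no repeated nodes)
                path.append(neighbor)
                extend(path)
                path.pop()

    for start in sorted(targets):
        extend([start])
    return paths


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(k_hop_paths(graph, {"a", "c"}, k=2))
```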
Identify high-level information, which is only used for MS GraphRAG.
Name | Description | Example Methods |
---|---|---|
Entity | Detects communities containing specified entities | Local Search for MS GraphRAG |
Layer | Returns all communities below a required layer | Global Search for MS GraphRAG |
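
The two community operators amount to simple filters over a list of communities. The sketch below is hypothetical: the `Community` record, its fields, and both function names are invented for illustration, and treating "below a required layer" as an inclusive bound on a numeric layer index is an assumption about the intended semantics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Community:
    community_id: str
    layer: int            # position in the community hierarchy (assumed numeric)
    entities: frozenset   # names of the entities grouped in this community


def communities_with_entities(communities, entities):
    """Entity operator sketch: communities containing any of the given entities."""
    wanted = set(entities)
    return [c for c in communities if wanted & c.entities]


def communities_below_layer(communities, max_layer):
    """Layer operator sketch: communities whose layer is at most max_layer."""
    return [c for c in communities if c.layer <= max_layer]


if __name__ == "__main__":
    all_communities = [
        Community("c0", 0, frozenset({"a", "b", "c", "d"})),
        Community("c1", 1, frozenset({"a", "b"})),
        Community("c2", 1, frozenset({"c", "d"})),
    ]
    print([c.community_id for c in communities_with_entities(all_communities, ["a"])])
    print([c.community_id for c in communities_below_layer(all_communities, 0)])
```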
You can freely 🪽 combine those operators 🧩 to create more and more GraphRAG methods.
Below, we present some examples illustrating how existing algorithms leverage these operators.
Name | Operators |
---|---|
HippoRAG | Chunk (Aggregator) |
LightRAG | Chunk (FromRel) + Entity (RelNode) + Relationship (VDB) |
FastGraphRAG | Chunk (Aggregator) + Entity (PPR) + Relationship (Aggregator) |
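
The combinations in the table above can be captured as plain data, which is a convenient starting point for experimenting with new combinations. This is a sketch only: `METHOD_OPERATORS`, `describe`, and `methods_using` are hypothetical names, and the entries are copied from the table rather than read from the project's configuration files.

```python
# Each method is a list of (operator category, operator name) pairs,
# copied from the table above.
METHOD_OPERATORS = {
    "HippoRAG": [("Chunk", "Aggregator")],
    "LightRAG": [("Chunk", "FromRel"), ("Entity", "RelNode"), ("Relationship", "VDB")],
    "FastGraphRAG": [("Chunk", "Aggregator"), ("Entity", "PPR"),
                     ("Relationship", "Aggregator")],
}


def describe(method: str) -> str:
    """Render a method's operators in the same notation as the table."""
    return " + ".join(f"{category} ({name})" for category, name in METHOD_OPERATORS[method])


def methods_using(category: str, name: str) -> list[str]:
    """List the methods whose retrieval module includes the given operator."""
    return [method for method, operators in METHOD_OPERATORS.items()
            if (category, name) in operators]


if __name__ == "__main__":
    print(describe("LightRAG"))
    print(methods_using("Chunk", "Aggregator"))
```

Adding a new entry to the dictionary is enough to describe a new combination in this notation.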
- Detailed readme
- Support RoG, PathRAG, etc.
- Provide a docker image for easy deployment.
- Support more LLMs, such as Azure.
If you find this work useful, please consider citing our paper:
@article{zhou2025depth,
title={In-depth Analysis of Graph-based RAG in a Unified Framework},
author={Zhou, Yingli and Su, Yaodong and Sun, Youran and Wang, Shu and Wang, Taotao and He, Runyuan and Zhang, Yongwei and Liang, Sicong and Liu, Xilin and Ma, Yuchi and others},
journal={arXiv preprint arXiv:2503.04338},
year={2025}
}