This project implements a FastAPI server that integrates with Anthropic's Claude API to provide a chat interface enhanced with Retrieval-Augmented Generation (RAG) capabilities. It includes both a backend server and a simple frontend for interacting with Claude. Additionally, it features a powerful CLI tool for document processing and embedding generation using ChromaDB.
- Prerequisites
- Setup
- Configuration
- Running the Server
- Document Processing CLI
- Important Notes
- Development
- License
- Python 3.7+
- An Anthropic API key
- ChromaDB installed and configured
- SentenceTransformers for embedding generation
- Clone this repository:

  ```bash
  git clone https://github.com/moradology/impact-explorer.git
  cd impact-explorer
  ```
- Set up a Python virtual environment:

  The following will set up a virtual environment and install the development dependencies:

  ```bash
  python3 -m venv .venv
  source .venv/bin/activate
  make install-dev
  ```
- Install dependencies from `pyproject.toml`:

  ```bash
  make install
  ```

  If you're developing the project, install in editable mode:

  ```bash
  make install-dev
  ```
Before running the server or using the CLI, you need to set several environment variables.
- Anthropic API Key

  Set your Anthropic API key as an environment variable:

  - On Unix or macOS:

    ```bash
    export ANTHROPIC_API_KEY=your_api_key_here
    ```

  - On Windows (Command Prompt):

    ```cmd
    set ANTHROPIC_API_KEY=your_api_key_here
    ```

  - On Windows (PowerShell):

    ```powershell
    $env:ANTHROPIC_API_KEY = "your_api_key_here"
    ```
- ChromaDB Path

  Set the path where your ChromaDB database is stored:

  - On Unix or macOS:

    ```bash
    export CHROMA_PATH=/path/to/your/chromadb/
    ```

  - On Windows (Command Prompt):

    ```cmd
    set CHROMA_PATH=C:\path\to\your\chromadb
    ```

  - On Windows (PowerShell):

    ```powershell
    $env:CHROMA_PATH = "C:\path\to\your\chromadb"
    ```
- Embedding Model Name

  Set the name of the embedding model you are using:

  - On Unix or macOS:

    ```bash
    export EMBEDDING_MODEL=all-distilroberta-v1
    ```

  - On Windows (Command Prompt):

    ```cmd
    set EMBEDDING_MODEL=all-distilroberta-v1
    ```

  - On Windows (PowerShell):

    ```powershell
    $env:EMBEDDING_MODEL = "all-distilroberta-v1"
    ```

  Note: The embedding model specified here must match the one used during the document embedding process for ChromaDB.
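These variables are consumed by the server process at startup. As a rough illustration only (not the project's actual configuration code), they might be read like this:

```python
import os

# Illustrative sketch: the project's real configuration handling may differ.
ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]                 # required by the Anthropic client
CHROMA_PATH = os.environ.get("CHROMA_PATH", "./chromadb")           # where the persisted collection lives
EMBEDDING_MODEL = os.environ.get("EMBEDDING_MODEL", "all-distilroberta-v1")  # must match the CLI's model
```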
- Start the FastAPI server:

  ```bash
  make run
  ```

  The server will start running on `http://localhost:8000`.
- Navigate to the frontend:

  Open your browser and go to `http://localhost:8000`. The `templates/index.html` page will be hosted at the application root (`/`).
- Chat interface:

  Use the chat interface to interact with Claude. The server uses Retrieval-Augmented Generation (RAG) to enhance responses with relevant information from your documents.
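If you prefer to script against the server rather than use the browser UI, a request might look like the sketch below. The endpoint path and payload shape here are assumptions, not the documented API; check the FastAPI route definitions for the actual contract.

```python
import requests

# Hypothetical example: "/chat" and the payload fields are assumed, not confirmed.
resp = requests.post(
    "http://localhost:8000/chat",                                # assumed endpoint
    json={"message": "What does the white whale symbolize?"},    # assumed payload shape
    timeout=60,
)
print(resp.json())
```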
The project includes a Command Line Interface (CLI) tool for processing documents, generating embeddings, and storing them in a ChromaDB database. This is essential for preparing data for Retrieval-Augmented Generation (RAG) tasks.
- Support for various chunking strategies (e.g., sliding window)
- Flexible embedding model selection
- Integration with ChromaDB for efficient storage and retrieval
- Detailed output statistics for processed documents
To use the document processing CLI:

```bash
python src/impact_explorer/cli/document_processor.py \
    --input <input_file_or_directory> \
    --output <output_dir> \
    --chunking-strategy <strategy> [strategy-specific-args] \
    --embedding-model <model_name>
```
Example command:

```bash
python src/impact_explorer/cli/document_processor.py \
    --input documents/mobydick.txt \
    --output ./moby_db/ \
    --chunking-strategy sliding-window \
    --chunk_size 100 \
    --overlap 20 \
    --embedding-model all-distilroberta-v1
```
This command will:

- Process `documents/mobydick.txt`
- Use a sliding-window strategy with a window size of 100 and an overlap of 20
- Generate embeddings using the `all-distilroberta-v1` model
- Store the results in ChromaDB in the `./moby_db/` directory
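To make the sliding-window parameters concrete, the sketch below shows the general idea of this chunking strategy. It is illustrative only and assumes word-level chunks; the CLI's own implementation may split on tokens or characters instead.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int = 100, overlap: int = 20) -> List[str]:
    """Split text into overlapping chunks of `chunk_size` words, where each
    full-size chunk shares `overlap` words with the previous one.
    Illustrative sketch only, not the project's actual chunker."""
    words = text.split()
    step = chunk_size - overlap
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, max(len(words) - overlap, 1), step)
    ]


# With chunk_size=100 and overlap=20, chunks start at words 0, 80, 160, ...
chunks = sliding_window_chunks(open("documents/mobydick.txt").read())
print(len(chunks), "chunks")
```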
Important: Ensure that the `--embedding-model` parameter matches the `EMBEDDING_MODEL` environment variable used by the server.
After processing, the CLI will display statistics about the generated ChromaDB collection, including:
- Total number of documents
- Embedding dimensions
- Metadata fields
- Sample documents
- Document length statistics
The processed data is stored in ChromaDB and can be loaded for RAG queries on FastAPI startup.
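As a minimal sketch of what opening the persisted database looks like outside the server (the collection name below is an assumption; use `list_collections()` to see what the CLI actually created):

```python
import os
import chromadb

# Open the ChromaDB directory produced by the CLI (e.g. ./moby_db/).
client = chromadb.PersistentClient(path=os.environ.get("CHROMA_PATH", "./moby_db"))
print(client.list_collections())                     # inspect what collections exist

collection = client.get_collection(name="documents")  # hypothetical collection name
print("documents stored:", collection.count())
```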
- Embedding Model Consistency: The embedding model used in both the CLI and the server must be the same. This ensures that the embeddings generated for the user's queries are compatible with those stored in ChromaDB.

  Example: If you used `all-distilroberta-v1` for generating embeddings in the CLI, set:

  ```bash
  export EMBEDDING_MODEL=all-distilroberta-v1
  ```
- Retrieval-Augmented Generation (RAG): The server uses the user's query to retrieve the most relevant documents from ChromaDB and includes them in the prompt sent to the LLM. This enhances the response with specific information from your document collection.
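As a rough sketch of this retrieve-then-prompt flow (not the project's actual code; the collection name, prompt wording, and model id are assumptions):

```python
import anthropic
import chromadb
from sentence_transformers import SentenceTransformer

# Embed the query with the SAME model used by the CLI, retrieve the closest
# chunks from ChromaDB, and pass them to Claude as context.
embedder = SentenceTransformer("all-distilroberta-v1")        # must match EMBEDDING_MODEL
chroma = chromadb.PersistentClient(path="./moby_db")
collection = chroma.get_collection(name="documents")          # hypothetical collection name

query = "What does the white whale symbolize?"
query_embedding = embedder.encode(query).tolist()
results = collection.query(query_embeddings=[query_embedding], n_results=3)
context = "\n\n".join(results["documents"][0])                # top-3 retrieved chunks

prompt = f"Use the following excerpts to answer the question.\n\n{context}\n\nQuestion: {query}"
client = anthropic.Anthropic()                                # reads ANTHROPIC_API_KEY
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",                       # example model id
    max_tokens=512,
    messages=[{"role": "user", "content": prompt}],
)
print(response.content[0].text)
```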
To run the server in development mode with auto-reload:

```bash
make run
```