This project compares the performance of Anthropic Claude 3.5 with and without caching using Weights & Biases Weave. It analyzes long context scenarios using transcripts from the ThursdAI.news podcast.
To get started with this project, follow these steps:
-
Create a virtual environment (optional but recommended):
python3 -m venv venv source venv/bin/activate # On Windows use `venv\Scripts\activate`
-
Install the requirements:
pip install -q weave set-env-colab-kaggle-dotenv tqdm ipywidgets requests anthropic
-
Set up environment variables: Copy the
.env.example
file to a new file named.env
:cp .env.example .env
Then, open the
.env
file and add your API keys:- Add your Weights & Biases API key from https://wandb.ai/authorize to
WANDB_API_KEY
- Add your Anthropic API key from https://www.anthropic.com or https://console.anthropic.com to
ANTHROPIC_API_KEY
Your
.env
file should look something like this:WANDB_API_KEY=your_wandb_api_key_here ANTHROPIC_API_KEY=your_anthropic_api_key_here
Note: Keep your
.env
file private and never commit it to version control. - Add your Weights & Biases API key from https://wandb.ai/authorize to
-
Run the Jupyter Notebook: Open and run the
evaluate_claude_long_context_caching.ipynb
notebook to compare Claude 3.5 performance with and without caching.
evaluate_claude_long_context_caching.ipynb
: Main Jupyter notebook for running the comparisondata/*.md
: Transcript files used for analysis (not included in this repository).env
: File for storing your API keys (create this from.env.example
)README.md
: This file, containing project information and setup instructions
This project requires access to the Anthropic API and Weights & Biases. Make sure you have the necessary permissions and API keys before running the notebook.