An end-to-end guide for scaling and serving LLM applications in production.
This repo currently contains one such application: a retrieval-augmented generation (RAG) app for answering questions about supplied information. By default, the app uses the Ray documentation as its source of information. The app first indexes the documentation in a vector database and then uses an LLM to generate responses to questions, after augmenting them with relevant information retrieved from the index.
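At a high level, the query path looks roughly like the sketch below. This is an illustrative outline only, not this repo's actual API; the function and object names are hypothetical.

```python
# Illustrative RAG query flow (hypothetical names, not this repo's API).
def answer(query, index, embed, llm, num_chunks=5):
    query_embedding = embed(query)                        # 1) embed the query
    chunks = index.search(query_embedding, k=num_chunks)  # 2) retrieve relevant chunks
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Answer the question using the additional context provided.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return llm(prompt)                                    # 3) generate the response
```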
- Start a new Anyscale workspace on staging using a `g3.8xlarge` head node on an AWS cloud.
- Use the `default_cluster_env_2.6.2_py39` cluster environment.
First, clone this repository.
git clone https://github.com/ray-project/llm-applications.git .
Then set up the environment by specifying the values in your `.env` file and installing the dependencies:
cp ./envs/.env_template .env
source .env
pip install --user -r requirements.txt
pre-commit install
pre-commit autoupdate
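If you want to sanity-check that the environment is set before running anything, a quick Python check like the one below works. The variable names here are hypothetical examples; see `./envs/.env_template` for the actual keys.

```python
import os

# Hypothetical variable names -- replace with the keys from ./envs/.env_template.
required = ["OPENAI_API_KEY", "DB_CONNECTION_STRING"]
missing = [name for name in required if not os.environ.get(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
print("Environment looks good.")
```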
Our data is already available at /efs/shared_storage/pcmoritz/docs.ray.io/en/master/
(on Staging), but if you want to load it yourself, run this bash command:
bash scrape-docs.sh
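Conceptually, loading the data amounts to walking the downloaded HTML tree and extracting the text of each page. The sketch below shows the idea with BeautifulSoup; the parsing details are assumptions, not the script's actual logic.

```python
from pathlib import Path
from bs4 import BeautifulSoup  # assumes beautifulsoup4 is installed

# Point this at wherever the docs were scraped to.
DOCS_DIR = Path("/efs/shared_storage/pcmoritz/docs.ray.io/en/master/")

def load_pages(docs_dir=DOCS_DIR):
    """Yield (source, text) pairs for every HTML page under docs_dir."""
    for html_path in sorted(docs_dir.rglob("*.html")):
        soup = BeautifulSoup(html_path.read_text(errors="ignore"), "html.parser")
        yield str(html_path), soup.get_text(separator=" ", strip=True)
```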
Local installation with brew on macOS
brew install postgresql
brew install pgvector
psql -c "CREATE USER postgres WITH SUPERUSER;"
# pragma: allowlist nextline secret
psql -c "ALTER USER postgres with password 'postgres';"
psql -c "CREATE EXTENSION vector;"
psql -f migrations/initial.sql
python app/index.py create-index
Otherwise, on the workspace:
bash setup-pgvector.sh
sudo -u postgres psql -f migrations/initial.sql
python app/index.py create-index
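Under the hood, `create-index` boils down to chunking the pages, embedding each chunk, and writing the vectors into the pgvector table created by `migrations/initial.sql`. Here is a minimal sketch of that idea using `psycopg2` and the `pgvector` Python package; the table and column names are assumptions, not necessarily what the migration defines.

```python
import psycopg2
from pgvector.psycopg2 import register_vector
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("thenlper/gte-base")

def index_chunks(chunks, dsn="postgresql://postgres:postgres@localhost:5432/postgres"):
    """chunks: iterable of (text, source) pairs. Table/column names are hypothetical."""
    conn = psycopg2.connect(dsn)
    register_vector(conn)  # allows passing numpy arrays for vector columns
    with conn, conn.cursor() as cur:
        for text, source in chunks:
            embedding = embedder.encode(text)
            cur.execute(
                "INSERT INTO document (text, source, embedding) VALUES (%s, %s, %s)",
                (text, source, embedding),
            )
    conn.close()
```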
This is just a sample query; it uses the index that has already been created.
import json
from app.query import QueryAgent
query = "What is the default batch size for map_batches?"
system_content = "Your job is to answer a question using the additional context provided."
agent = QueryAgent(
    embedding_model="thenlper/gte-base",
    llm="meta-llama/Llama-2-7b-chat-hf",
    max_context_length=4096,
    system_content=system_content,
)
result = agent.get_response(query=query)
print(json.dumps(result, indent=2))
python app/main.py generate-responses \
--system-content "Answer the {query} using the additional {context} provided."
python app/main.py evaluate-responses \
--system-content """
Your job is to rate the quality of our generated answer {generated_answer}
given a query {query} and a reference answer {reference_answer}.
Your score has to be between 1 and 5.
You must return your response in a line with only the score.
Do not return answers in any other format.
On a separate line provide your reasoning for the score as well.
"""
export APP_PORT=8501
echo https://$APP_PORT-port-$ANYSCALE_SESSION_DOMAIN
streamlit run dashboard/Home.py
- notebook cleanup
- evaluator (e.g., GPT-4) response script
- DB dump & load
- experiments (in order, fixing choices along the way)
  - Evaluator
    - GPT-4 best experiment
    - Llama-70b consistency with GPT-4
  - OSS vs. Closed (gpt-3.5 vs. llama)
  - w/ and w/out context (value of RAG)
  - # of chunks to use in context
    - Does using more resources help/harm?
    - (1, 5, 10 will all fit in the smallest context length of 4K)
  - Chunking size/overlap
    - related to # of chunks + context length, but we'll treat it as an independent variable
  - Embedding (top 3 in leaderboard)
    - the global leaderboard may not be your leaderboard (empirically validate)
  - Later
    - Commercial Assistant evaluation
    - Human Assistant evaluation
    - Data sources
  - Much later
    - Prompt
      - Prompt-tuning on query
    - Embedding vs. LLM for retrieval
    - Evaluator
    - Ray Tune to tweak a subset of components (see the sketch below)
    - CI/CD workflows
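For the Ray Tune item above, the eventual setup could look something like this sketch. The parameter values are illustrative, and `run_rag_evaluation` is a hypothetical helper (not part of this repo yet) that would build the index and agent for a given configuration and return its evaluation score.

```python
from ray import tune

def run_rag_evaluation(chunk_size, num_chunks, embedding_model):
    """Hypothetical helper: build the index/agent with these settings and
    return the evaluator's mean score on the eval set."""
    return 0.0  # placeholder

def objective(config):
    score = run_rag_evaluation(
        chunk_size=config["chunk_size"],
        num_chunks=config["num_chunks"],
        embedding_model=config["embedding_model"],
    )
    return {"score": score}

tuner = tune.Tuner(
    objective,
    param_space={
        "chunk_size": tune.choice([300, 500, 700]),
        "num_chunks": tune.choice([1, 5, 10]),
        "embedding_model": tune.choice(["thenlper/gte-base", "BAAI/bge-base-en"]),
    },
)
results = tuner.fit()
print(results.get_best_result(metric="score", mode="max").config)
```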