Chatbot is a Python library designed to make it easy for Large Language Models (LLMs) to interact with your data. It is built on top of LangChain and LangGraph and provides agents and high-level assistants for natural language querying and data visualization.
Note
This library is still under active development. Expect breaking changes, incomplete features, and limited documentation.
Clone the repository and install it (you can also use poetry or uv instead of pip).
git clone https://github.com/basedosdados/chatbot.git
cd chatbot
pip install .
The SQLAssistant allows LLMs to interact with your database so you can ask questions about it. All it needs is a LangChain Chat Model, a Context Provider, and a Prompt Formatter. The context provider supplies context about your data to the SQL Agent, and the prompt formatter builds the system prompt for SQL generation.
We provide a default BigQueryContextProvider for retrieving metadata directly from Google BigQuery and a default SQLPromptFormatter. You can supply your own implementations of a context provider and a prompt formatter for custom behaviour (a rough sketch of a custom provider follows the basic example below).
from langchain.chat_models import init_chat_model
from chatbot.assistants import SQLAssistant
from chatbot.contexts import BigQueryContextProvider
from chatbot.formatters import SQLPromptFormatter
model = init_chat_model("gpt-4o", temperature=0)
# you must point the GOOGLE_APPLICATION_CREDENTIALS
# env variable to your service account JSON file.
context_provider = BigQueryContextProvider(
    billing_project="your billing project",
    query_project="your query project",
)
prompt_formatter = SQLPromptFormatter()
assistant = SQLAssistant(model, context_provider, prompt_formatter)
response = assistant.invoke("hello! what can you tell me about our database?")
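If the defaults don't fit your data, the sketch below gives a rough idea of what a custom context provider could look like. It is purely illustrative: the class and the get_context method name are assumptions, not the library's actual interface, so see how the defaults in chatbot.contexts and chatbot.formatters are implemented before writing your own.

# purely illustrative sketch; the method name below is an assumption and may
# not match the interface that SQLAssistant actually expects
class StaticContextProvider:
    """Serves hand-written schema documentation instead of live BigQuery metadata."""

    def __init__(self, schema_doc: str):
        self.schema_doc = schema_doc

    def get_context(self, question: str) -> str:
        # return the same schema description for every question
        return self.schema_doc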
You can optionally use a PostgresSaver checkpointer to add short-term memory to your assistant and a VectorStore for few-shot prompting during SQL query generation:
from langchain.chat_models import init_chat_model
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector
from langgraph.checkpoint.postgres import PostgresSaver
from chatbot.assistants import SQLAssistant
from chatbot.contexts import BigQueryContextProvider
from chatbot.formatters import SQLPromptFormatter
model = init_chat_model("gpt-4o", temperature=0)
# you must point the GOOGLE_APPLICATION_CREDENTIALS
# env variable to your service account JSON file.
context_provider = BigQueryContextProvider(
    billing_project="your billing project",
    query_project="your query project",
)

# it could be any combination of
# a langchain vector store and an embedding model
vector_store = PGVector(
    connection="your connection string",
    collection_name="your collection name",
    embedding=OpenAIEmbeddings(
        model="text-embedding-3-small",
    ),
)
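# (illustrative) the few-shot examples are looked up in the vector store when
# the SQL prompt is built; seeding the store is up to you. add_documents is
# the standard LangChain vector-store API, but the page_content layout that
# SQLPromptFormatter expects is an assumption here, so check the formatter
# before relying on this exact format.
from langchain_core.documents import Document

vector_store.add_documents([
    Document(page_content="Question: total population per state\nSQL: SELECT state, SUM(population) AS total FROM census GROUP BY state"),
])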
prompt_formatter = SQLPromptFormatter(vector_store)
DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres
with PostgresSaver.from_conn_strin(DB_URI) as checkpointer:
checkpointer.setup()
assistant = SQLAssistant(
model=model,
context_provider=context_provider,
prompt_formatter=prompt_formatter,
checkpointer=checkpointer,
)
response = assistant.invoke(
message="hello! what can you tell me about our database?",
thread_id="some uuid"
)
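    # because the checkpointer stores conversation state, a follow-up call with
    # the same thread_id lets the assistant remember the previous exchange
    # (the follow-up question below is just an illustration)
    followup = assistant.invoke(
        message="and which of those tables is updated most frequently?",
        thread_id="some uuid",
    )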
An async version is also available: AsyncSQLAssistant.
SQLVizAssistant extends SQLAssistant by not only retrieving data but also preparing it for visualization. It determines which variables should be plotted on each axis, suggests suitable chart types, and defines metadata such as titles, labels, and legends, without actually rendering the chart. It requires a LangChain Chat Model, a Context Provider, and two separate Prompt Formatters: one for SQL query generation and another for guiding data preprocessing for visualization.
We provide a default VizPromptFormatter, which is used internally by the Visualization Agent during data preprocessing.
from langchain.chat_models import init_chat_model
from chatbot.assistants import SQLVizAssistant
from chatbot.contexts import BigQueryContextProvider
from chatbot.formatters import SQLPromptFormatter, VizPromptFormatter
model = init_chat_model("gpt-4o", temperature=0)
# you must point the GOOGLE_APPLICATION_CREDENTIALS
# env variable to your service account JSON file.
context_provider = BigQueryContextProvider(
    billing_project="your billing project",
    query_project="your query project",
)
sql_prompt_formatter = SQLPromptFormatter()
viz_prompt_formatter = VizPromptFormatter()
assistant = SQLVizAssistant(
    model, context_provider, sql_prompt_formatter, viz_prompt_formatter
)
response = assistant.invoke("hello! what can you tell me about our database?")
You can also optionally use a PostgresSaver checkpointer to add short-term memory to your assistant, and provide LangChain vector stores for few-shot prompting during both SQL generation and data preprocessing for visualization:
from langchain.chat_models import init_chat_model
from langchain_openai import OpenAIEmbeddings
from langchain_postgres import PGVector
from langgraph.checkpoint.postgres import PostgresSaver
from chatbot.assistants import SQLVizAssistant
from chatbot.contexts import BigQueryContextProvider
from chatbot.formatters import SQLPromptFormatter, VizPromptFormatter
model = init_chat_model("gpt-4o", temperature=0)
# you must point the GOOGLE_APPLICATION_CREDENTIALS
# env variable to your service account JSON file.
context_provider = BigQueryContextProvider(
    billing_project="your billing project",
    query_project="your query project",
)

# it could be any combination of
# a langchain vector store and an embedding model
sql_vector_store = PGVector(
    connection="your connection string",
    collection_name="your sql collection name",
    embedding=OpenAIEmbeddings(
        model="text-embedding-3-small",
    ),
)

viz_vector_store = PGVector(
    connection="your connection string",
    collection_name="your viz collection name",
    embedding=OpenAIEmbeddings(
        model="text-embedding-3-small",
    ),
)
sql_prompt_formatter = SQLPromptFormatter(sql_vector_store)
viz_prompt_formatter = VizPromptFormatter(viz_vector_store)
DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres
with PostgresSaver.from_conn_strin(DB_URI) as checkpointer:
checkpointer.setup()
assistant = SQLAssistant(
model=model,
context_provider=context_provider,
sql_prompt_formatter=sql_prompt_formatter,
viz_prompt_formatter=viz_prompt_formatter,
checkpointer=checkpointer,
)
response = assistant.invoke(
message="hello! what can you tell me about our database?",
thread_id="some uuid"
)
An async version is also available: AsyncSQLVizAssistant.
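As a rough sketch of how the async variants might be used, the snippet below assumes they mirror the synchronous API, with the same constructor arguments and an awaitable invoke method; neither of those assumptions is confirmed by this document, so check the async classes for their actual entry points.

import asyncio

from langchain.chat_models import init_chat_model
from chatbot.assistants import AsyncSQLAssistant
from chatbot.contexts import BigQueryContextProvider
from chatbot.formatters import SQLPromptFormatter

async def main():
    model = init_chat_model("gpt-4o", temperature=0)
    context_provider = BigQueryContextProvider(
        billing_project="your billing project",
        query_project="your query project",
    )
    prompt_formatter = SQLPromptFormatter()

    # constructor assumed to mirror SQLAssistant
    assistant = AsyncSQLAssistant(model, context_provider, prompt_formatter)

    # the awaitable method name is an assumption; the async classes may expose
    # a different entry point
    response = await assistant.invoke("hello! what can you tell me about our database?")
    print(response)

asyncio.run(main())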
Under the hood, both assistants rely on composable agents:
SQLAgent – Handles database metadata retrieval, query generation, and execution.
VizAgent – Handles visualization reasoning.
RouterAgent – Orchestrates SQL querying and data visualization via a multi-agent workflow.
There is also an implementation of a simple ReActAgent with support for custom system prompts and short-term memory, to which you can add an arbitrary set of tools.
You can use these agents directly or combine them to build your own workflows.
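As one example of the kind of tool you might hand to the ReActAgent, the snippet below defines an ordinary LangChain tool. Only the LangChain part is real API: the tool itself is a dummy, and how tools and system prompts are actually passed to ReActAgent is not shown in this document, so that wiring is left as a comment rather than guessed at.

from langchain_core.tools import tool

@tool
def table_row_count(table_name: str) -> int:
    """Hypothetical tool: return the number of rows in a table."""
    # placeholder logic for illustration only
    return 42

# tools defined this way can then be passed to the ReActAgent (or used in your
# own workflow built on SQLAgent, VizAgent, and RouterAgent); the exact
# constructor arguments are not documented here, so check the agent's API.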