Pixeltable Logo

Declarative Data Infrastructure for Multimodal AI Apps


Installation | Quick Start | Documentation | API Reference | Examples | Discord Community


Pixeltable is the only Python framework that provides incremental storage, transformation, indexing, and orchestration of your multimodal data.

😩 Maintaining Production-Ready Multimodal AI Apps is Still Too Hard

Building robust AI applications, especially multimodal ones, requires stitching together numerous tools:

  • ETL pipelines for data loading and transformation.
  • Vector databases for semantic search.
  • Feature stores for ML models.
  • Orchestrators for scheduling.
  • Model serving infrastructure for inference.
  • Separate systems for parallelization, caching, versioning, and lineage tracking.

This complex "data plumbing" slows down development, increases costs, and makes applications brittle and hard to reproduce.

💾 Installation

pip install pixeltable

Pixeltable is a database. It stores metadata and computed results persistently, typically in a .pixeltable directory in your workspace. See configuration options for your setup.
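
If you need that data directory in a specific location, one common pattern is to set it via an environment variable before Pixeltable is first imported. A minimal sketch, assuming the PIXELTABLE_HOME variable described in the configuration options (the path below is hypothetical):

import os

# Hypothetical custom location for Pixeltable's metadata and computed results.
# Set it before the first import/use of Pixeltable.
os.environ['PIXELTABLE_HOME'] = '/data/pixeltable'

import pixeltable as pxt  # the directory is created on first use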

✨ What is Pixeltable?

With Pixeltable, you define your entire data processing and AI workflow declaratively using computed columns on tables. Pixeltable's engine then automatically handles:

  • Data Ingestion & Storage: References files (images, videos, audio, docs) in place, handles structured data.
  • Transformation & Processing: Applies any Python function (UDFs) or built-in operations (chunking, frame extraction) automatically.
  • AI Model Integration: Runs inference (embeddings, object detection, LLMs) as part of the data pipeline.
  • Indexing & Retrieval: Creates and manages vector indexes for fast semantic search alongside traditional filtering.
  • Incremental Computation: Only recomputes what's necessary when data or code changes, saving time and cost.
  • Versioning & Lineage: Automatically tracks data and schema changes for reproducibility.

Focus on your application logic, not the infrastructure.

🚀 Key Features

  • Unified Multimodal Interface: pxt.Image, pxt.Video, pxt.Audio, pxt.Document, etc. – manage diverse data consistently.

    t = pxt.create_table(
      'media', 
      {
          'img': pxt.Image, 
          'video': pxt.Video
      }
    )
  • Declarative Computed Columns: Define processing steps once; they run automatically on new/updated data.

    t.add_computed_column(
      classification=huggingface.vit_for_image_classification(
          t.img
      )
    )
  • Built-in Vector Search: Add embedding indexes and perform similarity searches directly on tables/views.

    t.add_embedding_index(
      'img', 
      embedding=clip.using(
          model_id='openai/clip-vit-base-patch32'
      )
    )
    
    sim = t.img.similarity("cat playing with yarn")
  • On-the-Fly Data Views: Create virtual tables using iterators for efficient processing without data duplication.

    frames = pxt.create_view(
      'frames', 
      videos, 
      iterator=FrameIterator.create(
          video=videos.video, 
          fps=1
      )
    )
  • Seamless AI Integration: Built-in functions for OpenAI, Anthropic, Hugging Face, CLIP, YOLOX, and more.

    t.add_computed_column(
      response=openai.chat_completions(
          model='gpt-4o-mini',
          messages=[{"role": "user", "content": t.prompt}]
      )
    )
  • Bring Your Own Code: Extend Pixeltable with simple Python User-Defined Functions.

    @pxt.udf
    def format_prompt(context: list, question: str) -> str:
        return f"Context: {context}\nQuestion: {question}"
  • Agentic Workflows / Tool Calling: Register @pxt.udf or @pxt.query functions as tools and orchestrate LLM-based tool use (including multimodal); a fuller end-to-end sketch follows this feature list.

    # Example tools: a UDF and a Query function for RAG
    tools = pxt.tools(get_weather_udf, search_context_query)
    
    # LLM decides which tool to call; Pixeltable executes it
    t.add_computed_column(
         tool_output=invoke_tools(tools, t.llm_tool_choice)
    )
  • Persistent & Versioned: All data, metadata, and computed results are automatically stored.

    t.revert()  # Revert to a previous version
    stored_table = pxt.get_table('my_existing_table')  # Retrieve persisted table
  • SQL-like Python Querying: Familiar syntax combined with powerful AI capabilities.

    results = (
      t.where(t.score > 0.8)
      .order_by(t.timestamp)
      .select(t.image, score=t.score)
      .limit(10)
      .collect()
    )
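
The tool-calling snippet above reads from an llm_tool_choice column without showing how it is produced. A minimal sketch of one way to wire it up, assuming the OpenAI integration's chat_completions accepts the tool specs built with pxt.tools and that openai.invoke_tools executes the selected tool (check the API reference for exact signatures; get_weather is a hypothetical UDF):

import pixeltable as pxt
from pixeltable.functions import openai

@pxt.udf
def get_weather(city: str) -> str:
    # Hypothetical tool: a real implementation would call a weather API.
    return f"Sunny, 22C in {city}"

t = pxt.create_table('tool_demo', {'prompt': pxt.String}, if_exists='replace')
tools = pxt.tools(get_weather)

# The LLM sees the tool specs and may return a tool call in its response.
t.add_computed_column(
    llm_tool_choice=openai.chat_completions(
        model='gpt-4o-mini',
        messages=[{'role': 'user', 'content': t.prompt}],
        tools=tools
    )
)

# Pixeltable executes whichever tool the model selected.
t.add_computed_column(tool_output=openai.invoke_tools(tools, t.llm_tool_choice))

t.insert([{'prompt': "What's the weather in Kampala?"}])
print(t.select(t.tool_output).collect())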

💡 Key Examples

(See the Full Quick Start or Notebook Gallery for more details)

1. Multimodal Data Store and Data Transformation (Computed Column):

pip install pixeltable
import pixeltable as pxt

# Create a table
t = pxt.create_table(
    'films', 
    {'name': pxt.String, 'revenue': pxt.Float, 'budget': pxt.Float}, 
    if_exists="replace"
)

t.insert([
  {'name': 'Inside Out', 'revenue': 800.5, 'budget': 200.0},
  {'name': 'Toy Story', 'revenue': 1073.4, 'budget': 200.0}
])

# Add a computed column for profit - runs automatically!
t.add_computed_column(profit=(t.revenue - t.budget), if_exists="replace")

# Query the results
print(t.select(t.name, t.profit).collect())
# Output includes the automatically computed 'profit' column
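
Because profit is a computed column, rows inserted later are processed automatically; there is no pipeline step to re-run:

# Insert another film: its 'profit' value is computed on ingest.
t.insert([{'name': 'Up', 'revenue': 735.1, 'budget': 175.0}])
print(t.select(t.name, t.profit).collect())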

2. Object Detection with YOLOX:

pip install pixeltable pixeltable-yolox
import PIL
import pixeltable as pxt
from yolox.models import Yolox
from yolox.data.datasets import COCO_CLASSES

t = pxt.create_table('image', {'image': pxt.Image}, if_exists='replace')

# Insert some images
prefix = 'https://upload.wikimedia.org/wikipedia/commons'
paths = [
    '/1/15/Cat_August_2010-4.jpg',
    '/e/e1/Example_of_a_Dog.jpg',
    '/thumb/b/bf/Bird_Diversity_2013.png/300px-Bird_Diversity_2013.png'
]
t.insert({'image': prefix + p} for p in paths)

@pxt.udf
def detect(image: PIL.Image.Image) -> list[str]:
    model = Yolox.from_pretrained("yolox_s")
    result = model([image])
    coco_labels = [COCO_CLASSES[label] for label in result[0]["labels"]]
    return coco_labels

t.add_computed_column(classification=detect(t.image))

print(t.select().collect())

3. Image Similarity Search (CLIP Embedding Index):

pip install pixeltable sentence-transformers
import pixeltable as pxt
from pixeltable.functions.huggingface import clip

# Create image table and add sample images
images = pxt.create_table('my_images', {'img': pxt.Image}, if_exists='replace')
images.insert([
    {'img': 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/68/Orange_tabby_cat_sitting_on_fallen_leaves-Hisashi-01A.jpg/1920px-Orange_tabby_cat_sitting_on_fallen_leaves-Hisashi-01A.jpg'},
    {'img': 'https://upload.wikimedia.org/wikipedia/commons/d/d5/Retriever_in_water.jpg'}
])

# Add CLIP embedding index for similarity search
images.add_embedding_index(
    'img',
    embedding=clip.using(model_id='openai/clip-vit-base-patch32')
)

# Text-based image search
query_text = "a dog playing fetch"
sim_text = images.img.similarity(query_text)
results_text = images.order_by(sim_text, asc=False).limit(3).select(
    image=images.img, similarity=sim_text
).collect()
print("--- Text Query Results ---")
print(results_text)

# Image-based image search
query_image_url = 'https://upload.wikimedia.org/wikipedia/commons/thumb/7/7a/Huskiesatrest.jpg/2880px-Huskiesatrest.jpg'
sim_image = images.img.similarity(query_image_url)
results_image = images.order_by(sim_image, asc=False).limit(3).select(
    image=images.img, similarity=sim_image
).collect()
print("--- Image URL Query Results ---")
print(results_image)

4. Multimodal/Incremental RAG Workflow (Document Chunking & LLM Call):

pip install pixeltable openai spacy sentence-transformers
python -m spacy download en_core_web_sm
import pixeltable as pxt
import pixeltable.functions as pxtf
from pixeltable.functions import openai, huggingface
from pixeltable.iterators import DocumentSplitter

# Manage your tables by directories
directory = "my_docs"
pxt.drop_dir(directory, if_not_exists="ignore", force=True)
pxt.create_dir("my_docs")

# Create a document table and add a PDF
docs = pxt.create_table(f'{directory}.docs', {'doc': pxt.Document})
docs.insert([{'doc': 'https://github.com/pixeltable/pixeltable/raw/release/docs/resources/rag-demo/Jefferson-Amazon.pdf'}])

# Create chunks view with sentence-based splitting
chunks = pxt.create_view(
    f'{directory}.doc_chunks',
    docs,
    iterator=DocumentSplitter.create(document=docs.doc, separators='sentence')
)

# Explicitly create the embedding function object
embed_model = huggingface.sentence_transformer.using(model_id='all-MiniLM-L6-v2')
# Add embedding index using the function object
chunks.add_embedding_index('text', string_embed=embed_model)

# Define query function for retrieval - Returns a DataFrame expression
@pxt.query
def get_relevant_context(query_text: str, limit: int = 3):
    sim = chunks.text.similarity(query_text)
    # Return a list of strings (text of relevant chunks)
    return chunks.order_by(sim, asc=False).limit(limit).select(chunks.text)

# Build a simple Q&A table
qa = pxt.create_table(f'{directory}.qa_system', {'prompt': pxt.String})

# 1. Add retrieved context (now a list of strings)
qa.add_computed_column(context=get_relevant_context(qa.prompt))

# 2. Format the prompt with context
qa.add_computed_column(
    final_prompt=pxtf.string.format(
        """
        PASSAGES: 
        {0}
        
        QUESTION: 
        {1}
        """, 
        qa.context, 
        qa.prompt
    )
)

# 3. Generate the answer using the formatted prompt column
qa.add_computed_column(
    answer=openai.chat_completions(
        model='gpt-4o-mini',
        messages=[{
            'role': 'user',
            'content': qa.final_prompt
        }]
    ).choices[0].message.content
)

# Ask a question and get the answer
qa.insert([{'prompt': 'What can you tell me about Amazon?'}])
print("--- Final Answer ---")
print(qa.select(qa.answer).collect())
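
The same incremental behavior applies here: inserting another document extends the doc_chunks view and its embedding index automatically, and later prompts retrieve from the enlarged context. A minimal follow-up (the document URL is a hypothetical placeholder):

# Adding a document updates 'doc_chunks' and its index incrementally.
docs.insert([{'doc': 'https://example.com/another-report.pdf'}])  # hypothetical URL

# New prompts retrieve context from all indexed documents.
qa.insert([{'prompt': 'What topics do the documents cover?'}])
print(qa.select(qa.prompt, qa.answer).collect())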

📚 Notebook Gallery

Explore Pixeltable's capabilities interactively:

| Topic | Notebook | Topic | Notebook |
| --- | --- | --- | --- |
| Fundamentals | | Integrations | |
| 10-Min Tour | Open In Colab | OpenAI | Open In Colab |
| Tables & Ops | Open In Colab | Anthropic | Open In Colab |
| UDFs | Open In Colab | Together AI | Open In Colab |
| Embedding Index | Open In Colab | Label Studio | Visit Docs |
| External Files | Open In Colab | Mistral | Open In Github |
| Use Cases | | Sample Apps | |
| RAG Demo | Open In Colab | Multimodal Agent | HF Space |
| Object Detection | Open In Colab | Image/Text Search | GitHub App |
| Audio Transcription | Open In Colab | Discord Bot | GitHub App |

🔮 Roadmap (2025)

Cloud Infrastructure and Deployment

We're working on a hosted Pixeltable service that will:

  • Enable Multimodal Data Sharing of Pixeltable Tables and Views
  • Provide a persistent cloud instance
  • Turn Pixeltable workflows (Tables, Queries, UDFs) into API endpoints/MCP Servers

๐Ÿค Contributing

We love contributions! Whether it's reporting bugs, suggesting features, improving documentation, or submitting code changes, please check out our Contributing Guide and join the Discussions or our Discord Server.

๐Ÿข License

Pixeltable is licensed under the Apache 2.0 License.
