FastFlight is a framework built on Apache Arrow Flight, designed to simplify high-performance data transfers while improving usability, integration, and developer experience.
It addresses common challenges with native Arrow Flight, such as opaque request formats, debugging difficulties, complex async management, and REST API incompatibility. FastFlight makes it easier to adopt Arrow Flight in existing systems.
- ✅ **Typed Param Classes**: all data requests are defined via structured, type-safe parameter classes that are easy to debug and validate.
- ✅ **Service Binding via `param_type`**: clean, explicit mapping from parameter class → data service, enabling dynamic routing and REST support.
- ✅ **Async & Streaming Ready**: `async for` support with non-blocking batch readers, ideal for high-throughput systems.
- ✅ **REST + Arrow Flight**: use FastAPI to expose Arrow Flight services as standard REST endpoints (e.g., `/stream`).
- ✅ **Plug-and-Play Data Sources**: includes a DuckDB demo to help you get started quickly; extending to other sources (SQL, CSV, etc.) is straightforward.
- ✅ **Built-in Registry & Validation**: automatic binding discovery and safety checks that fail early if a service is missing.
- ✅ **Pandas / PyArrow Friendly**: streamlined APIs for turning results into a pandas DataFrame or Arrow Table.
- ✅ **CLI-First**: a unified command line to launch, test, and inspect services.
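The param-class → service binding can be illustrated with a minimal, framework-agnostic sketch. All names below (`bind`, `dispatch`, the registry dict) are illustrative stand-ins and do not mirror FastFlight's actual internal API:

```python
from dataclasses import dataclass, field

# Hypothetical registry: maps a parameter class to the service that handles it.
# This is a sketch of the idea, not FastFlight's real implementation.
_REGISTRY: dict = {}

def bind(param_type):
    """Class decorator registering a service instance for a param class."""
    def wrap(service_cls):
        _REGISTRY[param_type] = service_cls()
        return service_cls
    return wrap

@dataclass
class DuckDBParams:  # a typed, validatable request description
    database_path: str
    query: str
    parameters: list = field(default_factory=list)

@bind(DuckDBParams)
class DuckDBService:
    def run(self, params: DuckDBParams) -> str:
        return f"executing {params.query!r} against {params.database_path}"

def dispatch(params) -> str:
    service = _REGISTRY.get(type(params))
    if service is None:  # fail early, as the registry feature describes
        raise LookupError(f"no service bound for {type(params).__name__}")
    return service.run(params)

print(dispatch(DuckDBParams(database_path=":memory:", query="SELECT 1")))
```

Because the request is a plain typed object, the same dispatch logic can back both a Flight ticket handler and a REST endpoint.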
FastFlight is ideal for high-throughput data systems, real-time querying, log analysis, and financial applications.
```bash
pip install "fastflight[all]"
```

Or with uv:

```bash
uv add "fastflight[all]"
```
```bash
# Start both FastFlight and REST API servers
fastflight start-all --flight-location grpc://0.0.0.0:8815 --rest-host 0.0.0.0 --rest-port 8000
```
This launches both gRPC and REST servers, allowing you to use REST APIs while streaming data via Arrow Flight.
```bash
# Example REST API call to the DuckDB demo service
curl -X POST "http://localhost:8000/fastflight/stream" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "fastflight.demo_services.duckdb_demo.DuckDBParams",
    "database_path": ":memory:",
    "query": "SELECT 1 as test_column",
    "parameters": []
  }'
```
FastFlight provides a command-line interface (CLI) for easy management of Arrow Flight and REST API servers.
```bash
# Start only the FastFlight server
fastflight start-flight-server --flight-location grpc://0.0.0.0:8815

# Start only the REST API server
fastflight start-rest-server --rest-host 0.0.0.0 --rest-port 8000 --flight-location grpc://0.0.0.0:8815

# Start both servers
fastflight start-all --flight-location grpc://0.0.0.0:8815 --rest-host 0.0.0.0 --rest-port 8000
```
**Important:** when using the `/stream` REST endpoint, make sure the `type` field is included in the request body so the request can be routed to the correct service.
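The same call can be assembled from Python with only the standard library. This sketch just builds the request (mirroring the curl example's body); actually sending it requires the servers from `fastflight start-all` to be running:

```python
import json
import urllib.request

# Request body for /stream; the "type" field is what routes the call
# to the right data service (values match the curl example above).
body = {
    "type": "fastflight.demo_services.duckdb_demo.DuckDBParams",
    "database_path": ":memory:",
    "query": "SELECT 1 as test_column",
    "parameters": [],
}

req = urllib.request.Request(
    "http://localhost:8000/fastflight/stream",
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would stream the response once the servers are up.
print(req.get_full_url(), req.get_method())
```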
```bash
# Development setup (both servers in one container)
docker-compose --profile dev up

# Production setup (separated services)
docker-compose up

# Background mode
docker-compose up -d
```
```bash
# Run both servers
docker run -p 8000:8000 -p 8815:8815 fastflight:latest start-all

# Run only the FastFlight server
docker run -p 8815:8815 fastflight:latest start-flight-server

# Run only the REST API server
docker run -p 8000:8000 fastflight:latest start-rest-server
```
See Docker Guide for complete deployment options and configuration.
For comprehensive examples, see the `examples/` directory, which includes:

- **Multi-Protocol Demo** (`examples/multi_protocol_demo/`): a complete demonstration of FastFlight with both gRPC and REST interfaces
- **Benchmark Tools** (`examples/benchmark/`): performance measurement and analysis comparing sync vs. async operations
```python
from fastflight import FastFlightBouncer
from fastflight.demo_services.duckdb_demo import DuckDBParams

# Create a client
client = FastFlightBouncer("grpc://localhost:8815")

# Define query parameters
params = DuckDBParams(
    database_path=":memory:",
    query="SELECT 1 as test_column, 'hello' as message",
    parameters=[],
)

# Fetch data as an Arrow Table
table = client.get_pa_table(params)
print(f"Received {len(table)} rows")

# Convert to a pandas DataFrame
df = table.to_pandas()
print(df)
```
```python
import asyncio

from fastflight import FastFlightBouncer
from fastflight.demo_services.duckdb_demo import DuckDBParams

params = DuckDBParams(
    database_path=":memory:",
    query="SELECT 1 as test_column",
    parameters=[],
)

async def stream_data():
    client = FastFlightBouncer("grpc://localhost:8815")
    async for batch in client.aget_record_batches(params):
        print(f"Received batch with {batch.num_rows} rows")
        # Process each batch incrementally

asyncio.run(stream_data())
```
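The consume-process-discard pattern above keeps memory bounded regardless of result size. A self-contained sketch of that pattern with a plain async generator (`fake_batches` is an illustrative stand-in for `aget_record_batches`, not part of FastFlight):

```python
import asyncio

async def fake_batches(total_rows: int, batch_size: int):
    """Stand-in for a network batch reader: yields row counts batch by batch."""
    for start in range(0, total_rows, batch_size):
        await asyncio.sleep(0)  # yield control, as a real non-blocking read would
        yield min(batch_size, total_rows - start)

async def consume() -> int:
    seen = 0
    async for num_rows in fake_batches(total_rows=10, batch_size=4):
        seen += num_rows  # process each batch incrementally, then drop it
    return seen

print(asyncio.run(consume()))  # → 10
```

Only one batch is ever held at a time, which is why this style suits high-throughput pipelines.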
- **Data Service Developer Guide**: implementing custom data services
- **CLI Guide**: detailed CLI usage instructions
- **Docker Deployment**: container deployment and Docker Compose guide
- **Error Handling**: comprehensive error handling and resilience patterns
- **Technical Details**: in-depth implementation details and architecture
- **FastAPI Integration**: REST API integration guide
FastFlight supports extending to custom data sources. See Data Service Developer Guide for implementation details.
- ✅ Structured Ticket System (Completed)
- ✅ Async & Streaming Support (Completed)
- ✅ REST API Adapter (Completed)
- ✅ CLI Support (Completed)
- ✅ Enhanced Error Handling & Resilience (Completed)
- 🚧 Support for More Data Sources (SQL, NoSQL, Kafka) (In Progress)
- 🚧 Performance Benchmarking Tools (In Progress)
- 🚧 Production Monitoring & Observability (Planned)
Contributions are welcome! If you have suggestions or improvements, feel free to open an Issue or submit a PR.
This project is licensed under the MIT License.
🚀 Ready to accelerate your data transfers? Get started today!