FastFlight is a high-performance data transfer framework using Apache Arrow Flight for efficient, modular, and pluggable data streaming with optional FastAPI integration for HTTP-based access.

cning112/fastflight


FastFlight 🚀

FastFlight is a framework built on Apache Arrow Flight, designed to simplify high-performance data transfers while improving usability, integration, and developer experience.

It addresses common challenges with native Arrow Flight, such as opaque request formats, debugging difficulties, complex async management, and REST API incompatibility. FastFlight makes it easier to adopt Arrow Flight in existing systems.

✨ Key Advantages

✅ Typed Param Classes – All data requests are defined via structured, type-safe parameter classes, which makes them easy to debug and validate.
✅ Service Binding via param_type – Clean and explicit mapping from param class → data service. Enables dynamic routing and REST support.
✅ Async & Streaming Ready – async for support with non-blocking batch readers. Ideal for high-throughput systems.
✅ REST + Arrow Flight – Use FastAPI to expose Arrow Flight services as standard REST endpoints (e.g., /stream).
✅ Plug-and-Play Data Sources – Includes a DuckDB demo example to help you get started quickly; extending to other sources (SQL, CSV, etc.) is straightforward.
✅ Built-in Registry & Validation – Automatic binding discovery and safety checks. Fails early if a service is missing.
✅ Pandas / PyArrow Friendly – Streamlined APIs for transforming results into a pandas DataFrame or an Arrow Table.
✅ CLI-First – Unified command line to launch, test, and inspect services.
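
The typed-parameter, binding, and fail-early ideas in the list above can be illustrated with a minimal, library-independent sketch. Every name in it (`QueryParams`, `bind`, `dispatch`, `echo_service`) is hypothetical and chosen for illustration only; it is not FastFlight's actual API, which is described in the Data Service Developer Guide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator


@dataclass(frozen=True)
class QueryParams:
    """Hypothetical typed parameter class (not a FastFlight class)."""
    query: str
    parameters: tuple = ()


# Hypothetical registry: param class -> function that yields row batches.
_REGISTRY: Dict[type, Callable[..., Iterator[list]]] = {}


def bind(param_type: type):
    """Decorator that binds a service function to exactly one param class."""
    def decorator(func):
        if param_type in _REGISTRY:
            raise ValueError(f"{param_type.__name__} is already bound")
        _REGISTRY[param_type] = func
        return func
    return decorator


def dispatch(params):
    """Route a request to its bound service; fail early if none exists."""
    try:
        service = _REGISTRY[type(params)]
    except KeyError:
        raise LookupError(f"no service bound to {type(params).__name__}") from None
    return service(params)


@bind(QueryParams)
def echo_service(params: QueryParams) -> Iterator[list]:
    # Stand-in for a real data source: yields a single "batch" of rows.
    yield [{"query": params.query}]


batches = list(dispatch(QueryParams(query="SELECT 1")))
print(batches)
```

Keying the registry on the parameter class is what lets a request carry its own routing information: a request whose class has no bound service is rejected before any data source is touched.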

FastFlight is ideal for high-throughput data systems, real-time querying, log analysis, and financial applications.


🚀 Quick Start

1️⃣ Install FastFlight

pip install "fastflight[all]"

Or use uv:

uv add "fastflight[all]"

2️⃣ Start the Server

# Start both FastFlight and REST API servers
fastflight start-all --flight-location grpc://0.0.0.0:8815 --rest-host 0.0.0.0 --rest-port 8000

This launches both gRPC and REST servers, allowing you to use REST APIs while streaming data via Arrow Flight.
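
When scripting against a freshly started server, it can help to wait until both ports accept connections before sending requests. The following is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, not part of FastFlight, and the port numbers are simply the ones used in the command above.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # brief pause before the next attempt
    return False
```

For example, `wait_for_port("localhost", 8815)` and `wait_for_port("localhost", 8000)` would check the Flight and REST ports respectively. A successful TCP connect does not prove that the service is healthy, only that the port is open.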

3️⃣ Test with Demo Service

# Example REST API call to DuckDB demo service
curl -X POST "http://localhost:8000/fastflight/stream" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "fastflight.demo_services.duckdb_demo.DuckDBParams",
    "database_path": ":memory:",
    "query": "SELECT 1 as test_column",
    "parameters": []
  }'
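
The same request can be issued from Python. This sketch uses only the standard library and copies the URL and body from the curl call above; `build_stream_request` and `fetch_stream_bytes` are hypothetical helper names. The response format is not described here, so the sketch returns the raw bytes and leaves decoding to the caller.

```python
import json
import urllib.request

# Values copied from the curl example above.
STREAM_URL = "http://localhost:8000/fastflight/stream"
PAYLOAD = {
    "type": "fastflight.demo_services.duckdb_demo.DuckDBParams",
    "database_path": ":memory:",
    "query": "SELECT 1 as test_column",
    "parameters": [],
}


def build_stream_request(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the stream endpoint."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_stream_bytes(url: str = STREAM_URL, payload: dict = PAYLOAD) -> bytes:
    """Send the request and return the raw response body.

    Assumption: the body is treated as opaque bytes because its encoding
    is not specified in this README.
    """
    with urllib.request.urlopen(build_stream_request(url, payload)) as resp:
        return resp.read()
```

Calling `fetch_stream_bytes()` requires the servers from step 2 to be running.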

🎯 Using the CLI

FastFlight provides a command-line interface (CLI) for easy management of Arrow Flight and REST API servers.

Start Individual Services

# Start only the FastFlight server
fastflight start-flight-server --flight-location grpc://0.0.0.0:8815

# Start only the REST API server
fastflight start-rest-server --rest-host 0.0.0.0 --rest-port 8000 --flight-location grpc://0.0.0.0:8815

Start Both Services

fastflight start-all --flight-location grpc://0.0.0.0:8815 --rest-host 0.0.0.0 --rest-port 8000

Important: When using the /stream REST endpoint (served at /fastflight/stream in the Quick Start example), include the type field in the request body. The server uses it to route the request to the matching data service.
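
A client can guard against a missing type field before sending anything. The check below is a hypothetical client-side helper (`ensure_type_field` is not a FastFlight function) and assumes only what the note above states: the body must carry a type field.

```python
def ensure_type_field(body: dict) -> dict:
    """Return body unchanged if it has a non-empty string "type"; raise otherwise."""
    type_name = body.get("type")
    if not isinstance(type_name, str) or not type_name.strip():
        raise ValueError('request body needs a non-empty "type" field for routing')
    return body


# Example body, using the demo param class named in the Quick Start.
body = ensure_type_field({
    "type": "fastflight.demo_services.duckdb_demo.DuckDBParams",
    "database_path": ":memory:",
    "query": "SELECT 1 as test_column",
    "parameters": [],
})
```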


🐳 Docker Deployment

Quick Start with Docker Compose

# Development setup (both servers in one container)
docker-compose --profile dev up

# Production setup (separated services)
docker-compose up

# Background mode
docker-compose up -d

Manual Docker Commands

# Run both servers
docker run -p 8000:8000 -p 8815:8815 fastflight:latest start-all

# Run only FastFlight server
docker run -p 8815:8815 fastflight:latest start-flight-server

# Run only REST API server
docker run -p 8000:8000 fastflight:latest start-rest-server

See the Docker Guide for complete deployment options and configuration.


💡 Usage Examples

For comprehensive examples, see the examples/ directory. Two minimal client examples follow.

Python Client Example

from fastflight import FastFlightBouncer
from fastflight.demo_services.duckdb_demo import DuckDBParams

# Create client
client = FastFlightBouncer("grpc://localhost:8815")

# Define query parameters
params = DuckDBParams(
    database_path=":memory:",
    query="SELECT 1 as test_column, 'hello' as message",
    parameters=[]
)

# Fetch data as Arrow Table
table = client.get_pa_table(params)
print(f"Received {len(table)} rows")

# Convert to Pandas DataFrame
df = table.to_pandas()
print(df)

Async Streaming Example

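import asyncio
from fastflight import FastFlightBouncer
from fastflight.demo_services.duckdb_demo import DuckDBParams


async def stream_data():
    client = FastFlightBouncer("grpc://localhost:8815")

    # Same query parameters as in the Python client example
    params = DuckDBParams(
        database_path=":memory:",
        query="SELECT 1 as test_column, 'hello' as message",
        parameters=[]
    )

    async for batch in client.aget_record_batches(params):
        print(f"Received batch with {batch.num_rows} rows")
        # Process batch incrementally


asyncio.run(stream_data())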
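
To show what "process batch incrementally" can look like without a running server, here is a self-contained sketch that replaces the Flight stream with a stand-in async generator. `FakeBatch`, `fake_batches`, and `count_rows` are hypothetical names; only the `num_rows` attribute mirrors the example above.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Iterable


@dataclass
class FakeBatch:
    """Hypothetical stand-in for a record batch; only num_rows is modeled."""
    num_rows: int


async def fake_batches(sizes: Iterable[int]) -> AsyncIterator[FakeBatch]:
    # Stand-in for a streaming source such as aget_record_batches(params).
    for size in sizes:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield FakeBatch(num_rows=size)


async def count_rows(batches: AsyncIterator[FakeBatch]) -> int:
    """Consume batches one at a time, keeping only a running total."""
    total = 0
    async for batch in batches:
        total += batch.num_rows
    return total


total_rows = asyncio.run(count_rows(fake_batches([3, 5, 2])))
print(f"Processed {total_rows} rows")
```

Because each batch is dropped after it is counted, memory use stays bounded by the largest single batch rather than the full result set.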

📖 Documentation


🛠 Custom Data Services

FastFlight can be extended with custom data sources. See the Data Service Developer Guide for implementation details.


🛠 Future Plans

✅ Structured Ticket System (Completed)
✅ Async & Streaming Support (Completed)
✅ REST API Adapter (Completed)
✅ CLI Support (Completed)
✅ Enhanced Error Handling & Resilience (Completed)
🔄 Support for More Data Sources (SQL, NoSQL, Kafka) (In Progress)
🔄 Performance Benchmarking Tools (In Progress)
🔄 Production Monitoring & Observability (Planned)

Contributions are welcome! If you have suggestions or improvements, feel free to submit an Issue or PR. 🚀


📜 License

This project is licensed under the MIT License.


🚀 Ready to accelerate your data transfers? Get started today!
