Hapax

A lightweight HTTP server for Large Language Model (LLM) interactions, built with Go.

Version

v0.0.11

Features

  • HTTP server with completion endpoint (/v1/completions)
  • Health check endpoint (/health)
  • Configurable server settings (port, timeouts, etc.)
  • Clean shutdown handling
  • Comprehensive test suite with mock LLM implementation
  • Token validation with tiktoken
    • Automatic token counting
    • Context length validation
    • Max tokens validation
  • Middleware architecture:
    • Request ID tracking
    • Request timing metrics
    • Panic recovery
    • CORS support
    • API key authentication
    • Rate limiting (token bucket)
    • Prometheus metrics collection
  • Enhanced error handling:
    • Structured JSON error responses
    • Request ID tracking in errors
    • Zap-based logging with context
    • Custom error types for different scenarios
    • Seamless error middleware integration
  • Dynamic routing:
    • Version-based routing (v1, v2)
    • Route-specific middleware
    • Health check endpoints
    • Header validation
  • Provider management:
    • Multiple provider support (OpenAI, Anthropic, etc.)
    • Provider health monitoring
    • Automatic failover to backup providers
    • Configurable health check intervals
    • Provider-specific configuration

Installation

go get github.com/teilomillet/hapax

Configuration

Hapax uses YAML for configuration. Here's an example configuration file:

server:
  port: 8080
  read_timeout: 30s
  write_timeout: 30s
  max_header_bytes: 1048576  # 1MB
  shutdown_timeout: 30s

routes:
  - path: "/completions"
    handler: "completion"
    version: "v1"
    methods: ["POST"]
    middleware: ["auth", "ratelimit"]
    headers:
      Content-Type: "application/json"
    health_check:
      enabled: true
      interval: 30s
      timeout: 5s
      threshold: 3
      checks:
        api: "http"

  - path: "/health"
    handler: "health"
    version: "v1"
    methods: ["GET"]
    health_check:
      enabled: true
      interval: 15s
      timeout: 2s
      threshold: 2
      checks:
        system: "tcp"

llm:
  provider: ollama
  model: llama2
  endpoint: http://localhost:11434
  system_prompt: "You are a helpful assistant."
  max_context_tokens: 4096  # Maximum context length for your model
  options:
    temperature: 0.7
    max_tokens: 2000

logging:
  level: info  # debug, info, warn, error
  format: json # json, text

Configuration Options

Server Configuration

  • port: HTTP server port (default: 8080)
  • read_timeout: Maximum duration for reading the request body (default: 30s)
  • write_timeout: Maximum duration for writing the response (default: 30s)
  • max_header_bytes: Maximum size of request headers (default: 1MB)
  • shutdown_timeout: Maximum duration to wait for graceful shutdown (default: 30s)
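
Because every key has a default, a minimal server block only needs the values you want to change. A sketch, assuming keys omitted from the file fall back to the defaults above:

server:
  port: 9090
  shutdown_timeout: 10s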

LLM Configuration

  • provider: LLM provider name (e.g., "ollama", "openai")
  • model: Model name (e.g., "llama2", "gpt-4")
  • endpoint: API endpoint URL
  • system_prompt: Default system prompt for conversations
  • max_context_tokens: Maximum context length in tokens (model-dependent)
  • options: Provider-specific options
    • temperature: Sampling temperature (0.0 to 1.0)
    • max_tokens: Maximum tokens to generate
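
For example, the same block pointed at a hosted provider might look like this (the values are illustrative, only keys documented above are used, and provider-specific settings such as the endpoint depend on the provider):

llm:
  provider: openai
  model: gpt-4
  system_prompt: "You are a helpful assistant."
  max_context_tokens: 8192
  options:
    temperature: 0.2
    max_tokens: 1000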

Logging Configuration

  • level: Log level (debug, info, warn, error)
  • format: Log format (json, text)

Quick Start

package main

import (
    "context"
    "log"

    "github.com/teilomillet/hapax"
    "github.com/teilomillet/gollm"
    "go.uber.org/zap"
)

func main() {
    // Initialize logger (optional, defaults to production config)
    logger, _ := zap.NewProduction()
    defer logger.Sync()
    hapax.SetLogger(logger)

    // Create an LLM instance (using gollm)
    llm := gollm.New()

    // Create a completion handler
    handler := hapax.NewCompletionHandler(llm)

    // Create a router
    router := hapax.NewRouter(handler)

    // Use default configuration
    config := hapax.DefaultConfig()

    // Create and start server
    server := hapax.NewServer(config, router)
    if err := server.Start(context.Background()); err != nil {
        log.Fatal(err)
    }
}

API Endpoints

POST /v1/completions

Generate completions using the configured LLM.

Request:

{
    "prompt": "Your prompt here"
}

Response:

{
    "completion": "LLM generated response"
}

Error Responses:

  • 400 Bad Request: Invalid JSON or missing prompt
  • 405 Method Not Allowed: Wrong HTTP method
  • 500 Internal Server Error: LLM error
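
With the server running on the default port, the endpoint can be exercised with curl:

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Your prompt here"}'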

GET /health

Check server health status.

Response:

{
    "status": "ok"
}

Error Handling

Hapax provides structured error handling with JSON responses:

{
    "type": "validation_error",
    "message": "Invalid request format",
    "request_id": "req_123abc",
    "details": {
        "field": "prompt",
        "error": "required"
    }
}

Error types include:

  • validation_error: Request validation failures
  • provider_error: LLM provider issues
  • rate_limit_error: Rate limiting
  • internal_error: Unexpected server errors
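
A client can decode these error bodies into a small struct. A minimal sketch in Go (the type is illustrative, not part of the Hapax API):

type APIError struct {
    Type      string            `json:"type"`               // e.g. "validation_error"
    Message   string            `json:"message"`            // human-readable description
    RequestID string            `json:"request_id"`         // server-assigned request ID
    Details   map[string]string `json:"details,omitempty"`  // contents vary by error type
}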

Docker Support

The application comes with full Docker support, making it easy to deploy and run in containerized environments.

Features

  • Multi-stage Build: Optimized container size with separate build and runtime stages
  • Security: Runs as non-root user with minimal runtime dependencies
  • Health Checks: Built-in health monitoring for container orchestration
  • Prometheus Integration: Ready-to-use metrics endpoint for monitoring
  • Docker Compose: Complete setup with Prometheus integration

Running with Docker

  1. Build and run using Docker:

docker build -t hapax .
docker run -p 8080:8080 hapax

  2. Or use Docker Compose for the full stack with Prometheus:

docker compose up -d

Container Health

The container includes health checks that monitor:

  • HTTP server availability
  • Application readiness
  • Basic functionality

Access the health status (with the container's port mapped as shown above):
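
curl http://localhost:8080/health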

Testing

The project includes a comprehensive test suite with a mock LLM implementation that can be used for testing LLM-dependent code:

import "github.com/teilomillet/hapax/mock_test"

// Create a mock LLM with custom response
llm := &MockLLM{
    GenerateFunc: func(ctx context.Context, p *gollm.Prompt) (string, error) {
        return "Custom response", nil
    },
}
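
Such a mock can drive a handler test end to end. A minimal sketch using net/http/httptest, assuming MockLLM satisfies the interface NewCompletionHandler expects and that the returned handler implements http.Handler:

import (
    "context"
    "net/http"
    "net/http/httptest"
    "strings"
    "testing"

    "github.com/teilomillet/gollm"
    "github.com/teilomillet/hapax"
)

func TestCompletionHandler(t *testing.T) {
    // Mock LLM that returns a fixed response for any prompt.
    llm := &MockLLM{
        GenerateFunc: func(ctx context.Context, p *gollm.Prompt) (string, error) {
            return "Custom response", nil
        },
    }

    handler := hapax.NewCompletionHandler(llm)

    // Build a request matching the documented completions API.
    req := httptest.NewRequest(http.MethodPost, "/v1/completions",
        strings.NewReader(`{"prompt": "Hello"}`))
    req.Header.Set("Content-Type", "application/json")
    w := httptest.NewRecorder()

    handler.ServeHTTP(w, req)

    if w.Code != http.StatusOK {
        t.Fatalf("expected 200, got %d", w.Code)
    }
}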

Run the tests:

go test ./...

License

MIT License

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
