🎤 A production-ready serverless Alexa skill backend that integrates with multiple generative AI providers, enabling natural conversations with OpenAI GPT, Google Gemini, Anthropic Claude, Cloudflare AI, and more through your Alexa device.
- Multi-Provider AI Support: Seamlessly switch between OpenAI, Google, Anthropic, and Cloudflare models
- Asynchronous Processing: Handles Alexa's timeout constraints with intelligent queue management
- Image Generation: Create images with DALL-E, Stable Diffusion, and Google Imagen
- Interactive Games: Built-in number guessing and battleship games
- Translation Support: Real-time language translation capabilities
- Production Ready: Complete with observability, error handling, and retry mechanisms
- Cost Effective: Leverage Cloudflare Workers AI for budget-friendly inference
- Architecture Overview
- Supported Models
- Alexa Intents & Phrases
- Quick Start
- Detailed Setup Guide
- Examples
- Troubleshooting
- API Integration Details
- Contributing
The skill uses an asynchronous architecture to work within Alexa's 8-second timeout constraint:

1. The user prompts the Alexa skill
2. Alexa invokes the Lambda function with the user's intent
3. The Lambda pushes the request to an SQS queue
4. A separate Lambda processes the request using the selected AI model
5. The response is placed on a response SQS queue
6. The original Lambda polls for the response
> [!CAUTION]
> Due to Alexa's ~8-second timeout constraint:
>
> - If no response is received within ~7 seconds, Alexa responds with "your response will be available shortly!"
> - Users can retrieve delayed responses by saying "last response"
💡 The architecture uses AWS Lambda functions with SQS queues to handle Alexa's timeout constraints while providing access to multiple AI providers. View/Edit Diagram
| Provider | Model | Alias | Internal Reference |
|---|---|---|---|
| OpenAI | o1-mini | gpt | CHAT_MODEL_GPT |
| OpenAI | gpt-4o | g. p. t. version number four | CHAT_MODEL_GPT_V4 |
| Google | gemini-2.0-flash-exp | gemini | CHAT_MODEL_GEMINI |
| Anthropic | claude-opus-4-20250514 | opus | CHAT_MODEL_OPUS |
| Anthropic | claude-sonnet-4-20250514 | sonnet | CHAT_MODEL_SONNET |
| Cloudflare | llama-4-scout-17b-16e-instruct | llama | CHAT_MODEL_META |
| Cloudflare | llama-2-13b-chat-awq | awq | CHAT_MODEL_AWQ |
| Cloudflare | deepseek-r1-distill-qwen-32b | qwen | CHAT_MODEL_QWEN |
| Cloudflare | openchat-3.5-0106 | open chat | CHAT_MODEL_OPEN |
| Cloudflare | sqlcoder-7b-2 | sql | CHAT_MODEL_SQL |
| Provider | Model | Alias | Internal Reference |
|---|---|---|---|
| OpenAI | dall-e-3 | dallas | IMAGE_MODEL_DALL_E_3 |
| OpenAI | dall-e-2 | dallas v2 | IMAGE_MODEL_DALL_E_2 |
| Cloudflare | stable-diffusion-xl-base-1.0 | stable | IMAGE_MODEL_STABLE_DIFFUSION |
| Google | imagen-3.0-generate-002 | gemini image | IMAGE_MODEL_GEMINI |
- Special model for translations: `CHAT_MODEL_TRANSLATIONS`
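The alias columns above boil down to a lookup from spoken alias to internal model reference. A minimal sketch — the map and the default fallback are illustrative, though the constant strings mirror the Internal Reference column:

```go
package main

import "fmt"

// Spoken alias → internal model reference, mirroring the chat model table.
var chatModelAliases = map[string]string{
	"gpt":       "CHAT_MODEL_GPT",
	"gemini":    "CHAT_MODEL_GEMINI",
	"opus":      "CHAT_MODEL_OPUS",
	"sonnet":    "CHAT_MODEL_SONNET",
	"llama":     "CHAT_MODEL_META",
	"awq":       "CHAT_MODEL_AWQ",
	"qwen":      "CHAT_MODEL_QWEN",
	"open chat": "CHAT_MODEL_OPEN",
	"sql":       "CHAT_MODEL_SQL",
}

// resolveModel returns the internal reference for a spoken alias,
// falling back to a default model when the alias is unrecognized.
func resolveModel(alias string) string {
	if ref, ok := chatModelAliases[alias]; ok {
		return ref
	}
	return "CHAT_MODEL_GPT" // assumed default
}

func main() {
	fmt.Println(resolveModel("sonnet"))  // CHAT_MODEL_SONNET
	fmt.Println(resolveModel("unknown")) // CHAT_MODEL_GPT
}
```

This is what "model sonnet" triggers behind the scenes: the alias is resolved and stored as the active model for subsequent queries.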
Intent | Example Phrases | Description |
---|---|---|
AutoCompleteIntent | "question {prompt}" | Main intent for asking questions to the AI |
SystemAutoCompleteIntent | "system {prompt}" | Set a system message context for the AI |
LastResponseIntent | "last response" | Retrieve delayed responses from previous queries |
Intent | Example Phrases | Description |
---|---|---|
Model | "model <MODEL_ALIAS_HERE>" | Switch to the desired LLM |
Intent | Example Phrases | Description |
---|---|---|
ImageIntent | "image {prompt}" | Generate images using AI models |
Intent | Example Phrases | Description |
---|---|---|
RandomFactIntent | "random fact" | Get a random fact from the model |
Guess | "guess {number}" | Play a number guessing game |
Battleship | "battleship {x} {y}" | Play battleship game |
BattleshipStatus | "battleship status" | Get current battleship game status |
Intent | Example Phrases | Description |
---|---|---|
TranslateIntent | "translate {source_lang} to {target_lang} {text}" | Translate between two ISO 639-1 language codes (uses Meta's m2m100-1.2b model via Cloudflare) |
SystemMessageIntent | "system {prompt}" | Send a prompt combined with the system role message |
SystemContextIntent | "set system message {prompt}" | Set the system message used as context for the model in subsequent queries |
Purge | "purge" | Clear the response queue |
Intent | Example Phrases | Description |
---|---|---|
AMAZON.HelpIntent | "help" | Get help on available commands |
AMAZON.CancelIntent | "cancel", "menu" | Cancel current operation |
AMAZON.StopIntent | "stop", "exit" | End the skill session |
AMAZON.FallbackIntent | (triggered on unrecognized input) | Handle unrecognized commands |
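Internally, each incoming intent name is routed to a handler. A hypothetical dispatch sketch — the handler bodies and return strings here are illustrative placeholders, not the repo's actual implementation:

```go
package main

import "fmt"

// dispatch routes an Alexa intent name to a response. The cases mirror
// the intent tables above; the bodies are placeholders.
func dispatch(intent, prompt string) string {
	switch intent {
	case "AutoCompleteIntent":
		return "queued prompt: " + prompt
	case "LastResponseIntent":
		return "fetching last response"
	case "AMAZON.HelpIntent":
		return "help text"
	default: // AMAZON.FallbackIntent and anything unrecognized
		return "sorry, I didn't catch that"
	}
}

func main() {
	fmt.Println(dispatch("AutoCompleteIntent", "what is Go?"))
	fmt.Println(dispatch("SomeUnknownIntent", ""))
}
```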
1. Clone the repository

   ```sh
   git clone https://github.com/jackmcguire1/alexa-chatgpt.git
   cd alexa-chatgpt
   ```

2. Set up minimal environment variables

   ```sh
   export OPENAI_API_KEY=your_openai_api_key
   export S3_BUCKET_NAME=your_deployment_bucket
   ```

3. Deploy to AWS

   ```sh
   sam build && sam deploy --guided
   ```

4. Create Alexa Skill

   - Go to the Alexa Developer Console
   - Create a new skill with the "Custom" model
   - Copy the Lambda ARN from the deployment output
   - Set it as the endpoint in the Alexa skill
- Git
- Go 1.21+
- golangCI-Lint
- AWS CLI
- AWS SAM CLI
- AWS Account
- OpenAI API Account
- Google Cloud Account (for Gemini)
- Anthropic API Account (for Claude)
- Cloudflare Account (for Workers AI)
```sh
# Required
export HANDLER=main
export OPENAI_API_KEY=your_openai_api_key
export ANTHROPIC_API_KEY=your_anthropic_api_key
export CLOUDFLARE_ACCOUNT_ID=your_cloudflare_account_id
export CLOUDFLARE_API_KEY=your_cloudflare_api_key

# Google Service Account (base64 encoded JSON)
export GEMINI_API_KEY=base64_encoded_service_account_json

# AWS S3 Bucket for SAM deployment
export S3_BUCKET_NAME=your_s3_bucket_name
```
Configure AWS CLI with your credentials:
```sh
aws configure
# Set:
# - AWS Access Key ID
# - AWS Secret Access Key
# - Default region: us-east-1
```
1. Create Alexa Skill

   - Create a new Alexa skill in the Alexa Developer Console
   - Set an invocation name (e.g., "my assistant")

2. Configure Intents

   - Add all custom intents from the Alexa Intents section
   - For AutoCompleteIntent, add a slot named `prompt` with type `AMAZON.SearchQuery`
   - Configure sample utterances for each intent
3. Build and Deploy Backend

   ```sh
   # Set architecture variables
   export ARCH=GOARCH=arm64
   export LAMBDA_RUNTIME=provided.al2023
   export LAMBDA_HANDLER=bootstrap
   export LAMBDA_ARCH=arm64

   # Build the SAM application
   sam build --parameter-overrides \
     Runtime=$LAMBDA_RUNTIME \
     Handler=$LAMBDA_HANDLER \
     Architecture=$LAMBDA_ARCH

   # Deploy to AWS
   sam deploy --stack-name alexa-chatgpt \
     --s3-bucket $S3_BUCKET_NAME \
     --parameter-overrides \
       Runtime=$LAMBDA_RUNTIME \
       Handler=$LAMBDA_HANDLER \
       Architecture=$LAMBDA_ARCH \
       OpenAIApiKey=$OPENAI_API_KEY \
       GeminiApiKey=$GEMINI_API_KEY \
       AnthropicApiKey=$ANTHROPIC_API_KEY \
       CloudflareAccountId=$CLOUDFLARE_ACCOUNT_ID \
       CloudflareApiKey=$CLOUDFLARE_API_KEY \
     --capabilities CAPABILITY_IAM
   ```
4. Connect Lambda to Alexa

   ```sh
   # Get the Lambda ARN
   sam list stack-outputs --stack-name alexa-chatgpt
   ```

   - Copy the `ChatGPTLambdaArn` value
   - In the Alexa Developer Console, set this ARN as the Default Endpoint
5. Test Your Skill
- "Alexa, open [your invocation name]"
- "Question what is the weather today?"
- "Model gemini" (to switch models)
- "Last response" (to get delayed responses)
User: "Alexa, open my assistant"
Alexa: "Hi, let's begin our conversation!"
User: "Question what is machine learning?"
Alexa: [AI responds with explanation]
User: "Model gemini"
Alexa: "Ok"
User: "Question explain quantum computing"
Alexa: [Gemini responds]
User: "Image a sunset over mountains"
Alexa: "Your image will be ready shortly!"
User: "Last response"
Alexa: "Image generated and uploaded to S3"
User: "Model which"
Alexa: "I am using the text-model gpt and image-model dallas"
User: "Model available"
Alexa: "The available chat models are: gpt, gemini, opus, sonnet, llama, awq, qwen, open chat, sql..."
- Models: o1-mini, gpt-4o
- Used for general conversation and DALL-E image generation
- Requires `OPENAI_API_KEY`
- Model: gemini-2.0-flash-exp
- Google service account JSON must be base64 encoded
- Requires `GEMINI_API_KEY`
- Models: Claude Opus 4, Claude Sonnet 4
- Premium conversational AI models
- Requires `ANTHROPIC_API_KEY`
- Models: Llama, Qwen, OpenChat, SQLCoder
- Cost-effective AI inference
- Requires `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_API_KEY`
This occurs when the AI takes longer than 7 seconds to respond. Simply say "last response" to retrieve it.
- Check API key configuration in environment variables
- Verify the model name in your voice command
- Check CloudWatch logs for detailed error messages
```sh
# Clean and rebuild
sam delete --stack-name alexa-chatgpt
sam build --use-container
sam deploy --guided
```
If you encounter rate limits:
- Switch to a different model temporarily
- Implement request throttling in your usage
- Consider upgrading your API plan
```sh
# View Lambda logs
sam logs -n ChatGPTLambda --stack-name alexa-chatgpt --tail

# Check SQS queue status
aws sqs get-queue-attributes --queue-url <your-queue-url> --attribute-names All

# Test locally
sam local start-lambda
```
- Use Cloudflare Workers AI for faster response times
- Enable Lambda Provisioned Concurrency to reduce cold starts
- Optimize prompt length to reduce processing time
- Pre-warm Lambda functions for consistent performance
This project welcomes contributions! Please feel free to submit pull requests or open issues for bugs and feature requests.
```sh
# Install dependencies
go mod download

# Run tests
go test ./...

# Build locally
ARCH=arm64 make build
```
This project is licensed under the MIT License - see the LICENSE file for details.
All donations are appreciated!
- OpenAI for GPT models and DALL-E
- Google for Gemini and Vertex AI
- Anthropic for Claude models
- Cloudflare for Workers AI platform
- AWS for serverless infrastructure
- The open-source community for continuous support