Alexa-ChatGPT

🎤 A production-ready serverless Alexa skill backend that seamlessly integrates with multiple generative AI providers, enabling natural conversations with OpenAI GPT, Google Gemini, Anthropic Claude, Cloudflare AI, and more through your Alexa device.

🌟 Key Features

- Multi-Provider AI Support: Seamlessly switch between OpenAI, Google, Anthropic, and Cloudflare models
- Asynchronous Processing: Handles Alexa's timeout constraints with intelligent queue management
- Image Generation: Create images with DALL-E, Stable Diffusion, and Google Imagen
- Interactive Games: Built-in number guessing and battleship games
- Translation Support: Real-time language translation capabilities
- Production Ready: Complete with observability, error handling, and retry mechanisms
- Cost Effective: Leverage Cloudflare Workers AI for budget-friendly inference

Architecture Overview

The skill uses an asynchronous architecture to handle the Alexa 8-second timeout constraint:

1. User prompts the Alexa skill
2. Alexa invokes the Lambda function with the user's intent
3. Lambda pushes the request to an SQS queue
4. A separate Lambda processes the request using the selected AI model
5. The response is placed on a response SQS queue
6. The original Lambda polls for the response
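
The flow above can be sketched in-process, with Go channels standing in for the two SQS queues and a goroutine standing in for the second Lambda. This is only an illustration under those assumptions: the `request` type, `worker`, and `fakeModel` are hypothetical names, not the project's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// request is a hypothetical message shape; the real SQS payload may differ.
type request struct {
	Model  string
	Prompt string
}

// worker stands in for the second Lambda: it reads requests, calls a model,
// and places each answer on the response queue.
func worker(requests <-chan request, responses chan<- string, callModel func(request) string) {
	for req := range requests {
		responses <- callModel(req)
	}
	close(responses)
}

// fakeModel is a placeholder for a real provider call.
func fakeModel(req request) string {
	return fmt.Sprintf("[%s] %s", req.Model, strings.ToUpper(req.Prompt))
}

func main() {
	requests := make(chan request, 1) // stands in for the request SQS queue
	responses := make(chan string, 1) // stands in for the response SQS queue

	go worker(requests, responses, fakeModel)

	// The "original Lambda" enqueues the request, then waits for the answer.
	requests <- request{Model: "gpt", Prompt: "hello"}
	close(requests)

	fmt.Println(<-responses)
}
```

Keeping the two queues separate means the component that talks to Alexa never blocks on a provider call directly; it only waits on the response queue.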

Caution

Due to Alexa's ~8 second timeout constraint:

- If no response is received within ~7 seconds, Alexa responds with "your response will be available shortly!"
- Users can retrieve delayed responses by saying "last response"

Infrastructure Diagrams

DrawIO

DrawIO Infrastructure File

X-Ray Trace Map

💡 The architecture uses AWS Lambda functions with SQS queues to handle Alexa's timeout constraints while providing access to multiple AI providers.

Supported Models

Chat Models

| Provider | Model | Alias | Internal Reference |
| --- | --- | --- | --- |
| OpenAI | o1-mini | `gpt` | `CHAT_MODEL_GPT` |
| OpenAI | gpt-4o | `g. p. t. version number four` | `CHAT_MODEL_GPT_V4` |
| Google | gemini-2.0-flash-exp | `gemini` | `CHAT_MODEL_GEMINI` |
| Anthropic | claude-opus-4-20250514 | `opus` | `CHAT_MODEL_OPUS` |
| Anthropic | claude-sonnet-4-20250514 | `sonnet` | `CHAT_MODEL_SONNET` |
| Cloudflare | llama-4-scout-17b-16e-instruct | `llama` | `CHAT_MODEL_META` |
| Cloudflare | llama-2-13b-chat-awq | `awq` | `CHAT_MODEL_AWQ` |
| Cloudflare | deepseek-r1-distill-qwen-32b | `qwen` | `CHAT_MODEL_QWEN` |
| Cloudflare | openchat-3.5-0106 | `open chat` | `CHAT_MODEL_OPEN` |
| Cloudflare | sqlcoder-7b-2 | `sql` | `CHAT_MODEL_SQL` |
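
As a rough illustration of how a spoken alias could be resolved to a provider and model, the sketch below encodes the table above as a Go map. The `chatModel` type and `resolveChatModel` function are hypothetical names for this example; the project's own lookup may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// chatModel pairs a provider with a model identifier (hypothetical type).
type chatModel struct {
	Provider string
	Model    string
}

// chatModelsByAlias mirrors the chat model table: spoken alias -> model.
var chatModelsByAlias = map[string]chatModel{
	"gpt":                          {Provider: "OpenAI", Model: "o1-mini"},
	"g. p. t. version number four": {Provider: "OpenAI", Model: "gpt-4o"},
	"gemini":                       {Provider: "Google", Model: "gemini-2.0-flash-exp"},
	"opus":                         {Provider: "Anthropic", Model: "claude-opus-4-20250514"},
	"sonnet":                       {Provider: "Anthropic", Model: "claude-sonnet-4-20250514"},
	"llama":                        {Provider: "Cloudflare", Model: "llama-4-scout-17b-16e-instruct"},
	"awq":                          {Provider: "Cloudflare", Model: "llama-2-13b-chat-awq"},
	"qwen":                         {Provider: "Cloudflare", Model: "deepseek-r1-distill-qwen-32b"},
	"open chat":                    {Provider: "Cloudflare", Model: "openchat-3.5-0106"},
	"sql":                          {Provider: "Cloudflare", Model: "sqlcoder-7b-2"},
}

// resolveChatModel normalises the spoken alias and looks it up.
func resolveChatModel(spoken string) (chatModel, bool) {
	m, ok := chatModelsByAlias[strings.ToLower(strings.TrimSpace(spoken))]
	return m, ok
}

func main() {
	if m, ok := resolveChatModel("Sonnet"); ok {
		fmt.Printf("%s / %s\n", m.Provider, m.Model)
	}
	if _, ok := resolveChatModel("unknown model"); !ok {
		fmt.Println("alias not recognised")
	}
}
```

Normalising case and whitespace before the lookup makes the match tolerant of small differences in how the slot value is transcribed.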

Image Generation Models

| Provider | Model | Alias | Internal Reference |
| --- | --- | --- | --- |
| OpenAI | dall-e-3 | `dallas` | `IMAGE_MODEL_DALL_E_3` |
| OpenAI | dall-e-2 | `dallas v2` | `IMAGE_MODEL_DALL_E_2` |
| Cloudflare | stable-diffusion-xl-base-1.0 | `stable` | `IMAGE_MODEL_STABLE_DIFFUSION` |
| Google | imagen-3.0-generate-002 | `gemini image` | `IMAGE_MODEL_GEMINI` |

Translation Model

Special model for translations: CHAT_MODEL_TRANSLATIONS

Alexa Intents & Phrases

Core Conversation Intents

| Intent | Example Phrases | Description |
| --- | --- | --- |
| AutoCompleteIntent | "question {prompt}" | Main intent for asking questions to the AI |
| SystemAutoCompleteIntent | "system {prompt}" | Set a system message context for the AI |
| LastResponseIntent | "last response" | Retrieve delayed responses from previous queries |

Model Management

| Intent | Example Phrases | Description |
| --- | --- | --- |
| Model | "model `<MODEL_ALIAS_HERE>`" | Switch to the desired LLM |

Image Generation

| Intent | Example Phrases | Description |
| --- | --- | --- |
| ImageIntent | "image {prompt}" | Generate images using AI models |

Games & Entertainment

| Intent | Example Phrases | Description |
| --- | --- | --- |
| RandomFactIntent | "random fact" | Get a random fact from the model |
| Guess | "guess {number}" | Play a number guessing game |
| Battleship | "battleship {x} {y}" | Play battleship game |
| BattleshipStatus | "battleship status" | Get current battleship game status |

Utility Intents

| Intent | Example Phrases | Description |
| --- | --- | --- |
| TranslateIntent | "translate {source_lang} to {target_lang} {text}" | Translate between two languages given as ISO 639-1 codes. Uses the `m2m100-1.2b` model from Meta, served by Cloudflare |
| SystemMessageIntent | "system {prompt}" | Get a prompt with combined system role message |
| SystemContextIntent | "set system message {prompt}" | Set the system message used as context for the model's role in subsequent queries |
| Purge | "purge" | Clear the response queue |

Built-in Alexa Intents

| Intent | Example Phrases | Description |
| --- | --- | --- |
| AMAZON.HelpIntent | "help" | Get help on available commands |
| AMAZON.CancelIntent | "cancel", "menu" | Cancel current operation |
| AMAZON.StopIntent | "stop", "exit" | End the skill session |
| AMAZON.FallbackIntent | (triggered on unrecognized input) | Handle unrecognized commands |
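
To make the intent tables more concrete, here is a minimal sketch of dispatching on the intent name and falling back when it is not recognised. Everything in it is an assumption for illustration: the `handler` type, the `routes` map, the reply strings, and the `alias` slot name are hypothetical, and the real skill's handlers do far more than return a string.

```go
package main

import "fmt"

// handler turns slot values into the text Alexa should speak (hypothetical).
type handler func(slots map[string]string) string

const fallbackReply = "sorry, I did not understand that"

// routes maps a few intent names from the tables above to placeholder handlers.
var routes = map[string]handler{
	"AutoCompleteIntent": func(s map[string]string) string {
		return "asking the model: " + s["prompt"]
	},
	"LastResponseIntent": func(map[string]string) string {
		return "fetching the last response"
	},
	"Model": func(s map[string]string) string {
		// "alias" is an assumed slot name, used only in this sketch.
		return "switching model to " + s["alias"]
	},
	"Purge": func(map[string]string) string {
		return "clearing the response queue"
	},
}

// dispatch looks up the handler for an intent; unknown intents get the
// fallback reply, mirroring what AMAZON.FallbackIntent is for.
func dispatch(intent string, slots map[string]string) string {
	if h, ok := routes[intent]; ok {
		return h(slots)
	}
	return fallbackReply
}

func main() {
	fmt.Println(dispatch("AutoCompleteIntent", map[string]string{"prompt": "what is machine learning?"}))
	fmt.Println(dispatch("Model", map[string]string{"alias": "gemini"}))
	fmt.Println(dispatch("SomethingElse", nil))
}
```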

Quick Start

🚀 Deploy in 5 Minutes

1. Clone the repository

   ```
   git clone https://github.com/jackmcguire1/alexa-chatgpt.git
   cd alexa-chatgpt
   ```

2. Set up minimal environment variables

   ```
   export OPENAI_API_KEY=your_openai_api_key
   export S3_BUCKET_NAME=your_deployment_bucket
   ```

3. Deploy to AWS

   ```
   sam build && sam deploy --guided
   ```

4. Create Alexa Skill
   - Go to Alexa Developer Console
   - Create new skill with "Custom" model
   - Copy the Lambda ARN from deployment output
   - Set as endpoint in Alexa skill

Detailed Setup Guide

Prerequisites

- Git
- Go 1.21+
- golangci-lint
- AWS CLI
- AWS SAM CLI
- AWS Account
- OpenAI API Account
- Google Cloud Account (for Gemini)
- Anthropic API Account (for Claude)
- Cloudflare Account (for Workers AI)

Environment Variables

```
# Required
export HANDLER=main
export OPENAI_API_KEY=your_openai_api_key
export ANTHROPIC_API_KEY=your_anthropic_api_key
export CLOUDFLARE_ACCOUNT_ID=your_cloudflare_account_id
export CLOUDFLARE_API_KEY=your_cloudflare_api_key

# Google Service Account (base64 encoded JSON)
export GEMINI_API_KEY=base64_encoded_service_account_json

# AWS S3 Bucket for SAM deployment
export S3_BUCKET_NAME=your_s3_bucket_name
```

AWS CLI Configuration

Configure AWS CLI with your credentials:

```
aws configure
# Set:
# - AWS Access Key ID
# - AWS Secret Access Key
# - Default region: us-east-1
```

Deployment Steps

1. Create Alexa Skill
   - Create a new Alexa skill in the Alexa Developer Console
   - Set invocation name (e.g., "my assistant")

2. Configure Intents
   - Add all custom intents from the Alexa Intents section
   - For AutoCompleteIntent, add a slot named prompt with type AMAZON.SearchQuery
   - Configure sample utterances for each intent

3. Build and Deploy Backend

   ```
   # Set architecture variables
   export ARCH=GOARCH=arm64
   export LAMBDA_RUNTIME=provided.al2023
   export LAMBDA_HANDLER=bootstrap
   export LAMBDA_ARCH=arm64

   # Build the SAM application
   sam build --parameter-overrides \
     Runtime=$LAMBDA_RUNTIME \
     Handler=$LAMBDA_HANDLER \
     Architecture=$LAMBDA_ARCH

   # Deploy to AWS
   sam deploy --stack-name alexa-chatgpt \
     --s3-bucket $S3_BUCKET_NAME \
     --parameter-overrides \
       Runtime=$LAMBDA_RUNTIME \
       Handler=$LAMBDA_HANDLER \
       Architecture=$LAMBDA_ARCH \
       OpenAIApiKey=$OPENAI_API_KEY \
       GeminiApiKey=$GEMINI_API_KEY \
       AnthropicApiKey=$ANTHROPIC_API_KEY \
       CloudflareAccountId=$CLOUDFLARE_ACCOUNT_ID \
       CloudflareApiKey=$CLOUDFLARE_API_KEY \
     --capabilities CAPABILITY_IAM
   ```

4. Connect Lambda to Alexa

   ```
   # Get the Lambda ARN
   sam list stack-outputs --stack-name alexa-chatgpt
   ```

   - Copy the ChatGPTLambdaArn value
   - In Alexa Developer Console, set this ARN as the Default Endpoint

5. Test Your Skill
   - "Alexa, open [your invocation name]"
   - "Question what is the weather today?"
   - "Model gemini" (to switch models)
   - "Last response" (to get delayed responses)

Examples

Basic Conversation

User: "Alexa, open my assistant"
Alexa: "Hi, let's begin our conversation!"

User: "Question what is machine learning?"
Alexa: [AI responds with explanation]

User: "Model gemini"
Alexa: "Ok"

User: "Question explain quantum computing"
Alexa: [Gemini responds]

Image Generation

User: "Image a sunset over mountains"
Alexa: "Your image will be ready shortly!"

User: "Last response"
Alexa: "Image generated and uploaded to S3"

Model Management

User: "Model which"
Alexa: "I am using the text-model gpt and image-model dallas"

User: "Model available"
Alexa: "The available chat models are: gpt, gemini, opus, sonnet, llama, awq, qwen, open chat, sql..."

API Integration Details

OpenAI Integration

- Models: o1-mini, gpt-4o
- Used for general conversation and DALL-E image generation
- Requires OPENAI_API_KEY

Google Vertex AI / Gemini Integration

- Models: gemini-2.0-flash-exp for chat, imagen-3.0-generate-002 for images
- The Google service account JSON must be base64-encoded
- Requires GEMINI_API_KEY

Anthropic Integration

- Models: Claude Opus 4 (claude-opus-4-20250514), Claude Sonnet 4 (claude-sonnet-4-20250514)
- Premium conversational AI models
- Requires ANTHROPIC_API_KEY

Cloudflare Workers AI Integration

- Models: Llama 4 Scout, Llama 2, DeepSeek R1 Distill Qwen, OpenChat, SQLCoder, Stable Diffusion XL
- Cost-effective AI inference
- Requires CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_KEY
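
Since each integration above depends on its own credentials, a quick check for unset variables can save a failed deployment. The following is a small sketch of such a check; `requiredEnv` and `missingEnv` are hypothetical names, and the lookup function is passed in so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// requiredEnv lists the credential variables named in the sections above.
var requiredEnv = []string{
	"OPENAI_API_KEY",
	"GEMINI_API_KEY",
	"ANTHROPIC_API_KEY",
	"CLOUDFLARE_ACCOUNT_ID",
	"CLOUDFLARE_API_KEY",
}

// missingEnv returns the variables that are unset or empty, in list order.
func missingEnv(lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range requiredEnv {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingEnv(os.LookupEnv); len(missing) > 0 {
		fmt.Println("missing environment variables:", missing)
		return
	}
	fmt.Println("all provider credentials are set")
}
```

If you only use some providers, trim `requiredEnv` to match the ones you actually deploy.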

Troubleshooting

Common Issues

"Your response will be available shortly!"

This occurs when the AI takes longer than about 7 seconds to respond. Say "last response" to retrieve the answer once it is ready.

Model not responding

- Check that the API keys are set correctly in your environment variables
- Verify that the model alias in your voice command matches one listed under Supported Models
- Check CloudWatch logs for detailed error messages

Deployment failures

```
# Clean and rebuild
sam delete --stack-name alexa-chatgpt
sam build --use-container
sam deploy --guided
```

API Rate Limits

If you encounter rate limits:

- Switch to a different model temporarily
- Throttle your requests so they stay within the provider's limits
- Consider upgrading your API plan

Debug Commands

```
# View Lambda logs
sam logs -n ChatGPTLambda --stack-name alexa-chatgpt --tail

# Check SQS queue status
aws sqs get-queue-attributes --queue-url <your-queue-url> --attribute-names All

# Test locally
sam local start-lambda
```

Performance Optimization

Response Time Improvements

- Use Cloudflare Workers AI for faster response times
- Enable Lambda provisioned concurrency to reduce cold starts (reserved concurrency only caps how many instances can run; it does not keep them warm)
- Optimize prompt length to reduce processing time
- Pre-warm Lambda functions for consistent performance

Contributing

This project welcomes contributions! Please feel free to submit pull requests or open issues for bugs and feature requests.

Development Setup

```
# Install dependencies
go mod download

# Run tests
go test ./...

# Build locally
ARCH=arm64 make build
```

License

This project is licensed under the MIT License - see the LICENSE file for details.

Donations

All donations are appreciated!

Acknowledgments

- OpenAI for GPT models and DALL-E
- Google for Gemini and Vertex AI
- Anthropic for Claude models
- Cloudflare for Workers AI platform
- AWS for serverless infrastructure
- The open-source community for continuous support
