Notetaker AI

Intelligent Transcription & Summarization for Professionals

📋 Table of Contents

🔍 About
🚀 Getting Started
📝 Usage
🖥️ Demo
🐳 Docker Setup
🗺️ Roadmap
👥 Contributors
📄 License

🔍 About The Project

Notetaker AI transforms how professionals handle meetings, interviews, and consultations with advanced audio-to-text capabilities. It combines precise transcription with intelligent summarization to create concise, structured notes that save time and enhance documentation accuracy.

✨ Key Features

🎙️ Smart Transcription: Convert audio to text with exceptional accuracy, including optional speaker diarization and time alignment
📊 Multiple Summary Formats: Generate summaries in various formats to fit different professional needs:
- 📝 Text – Simple, readable plain-text format
- 📋 SOAP – Structured clinical format (Subjective, Objective, Assessment, Plan)
- 🏥 PKI HL7 CDA – Standards-compliant summary for healthcare interoperability
- 🩺 Therapy Assessment – Custom format for structured evaluation of therapist performance across key professional competencies
⏳ Long-form Audio Support: Designed to handle recordings of over 1 hour
⚙️ Flexible Deployment: Deploy fully locally, using local AI models for full data control, or use the available external integrations
⚙️ Multiple access points: Run as an API-only service or with an intuitive Gradio UI for interactive use
🚄 GPU Acceleration: Leverage GPU hardware for faster processing of large audio files
🔧 Customizable: Configure to your specific requirements with extensive environment variables
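
To make the SOAP format more concrete, here is a minimal sketch of how its four sections could be held and rendered. The `SoapNote` class, its field names, and the sample text are hypothetical illustrations, not the schema or output that Notetaker AI actually produces.

```python
from dataclasses import asdict, dataclass
import json


@dataclass
class SoapNote:
    """Hypothetical container for the four SOAP sections (not the project's schema)."""

    subjective: str
    objective: str
    assessment: str
    plan: str

    def to_text(self) -> str:
        # Render each section under its heading, in S-O-A-P order.
        return "\n\n".join(
            f"{name.capitalize()}:\n{body}" for name, body in asdict(self).items()
        )


# Made-up sample content, for illustration only.
note = SoapNote(
    subjective="Patient reports trouble sleeping for two weeks.",
    objective="Appears tired; speech and affect within normal range.",
    assessment="Likely stress-related insomnia.",
    plan="Sleep hygiene plan; follow up in two weeks.",
)

print(note.to_text())
print(json.dumps(asdict(note), indent=2))
```

Keeping the sections as separate fields means the same note can be rendered as plain text or serialized as JSON without re-parsing.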

demo.mp4

(back to top)

🚀 Getting Started

Follow these steps to set up Notetaker AI in your environment.

Prerequisites

Python: 3.12 or higher
Poetry: For dependency management (Installation Guide)
FFmpeg: Required for audio processing
CUDA Toolkit: 12.2+ recommended (only if using GPU acceleration)
Hugging Face Access: You'll need access to these gated models:
- Speaker Diarization
- Segmentation
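
Before installing, it can help to confirm the local prerequisites. The sketch below is not part of the repository; `check_prerequisites` is a hypothetical helper that checks only the Python version and whether `poetry` and `ffmpeg` are on `PATH`. It does not verify CUDA or Hugging Face model access.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 12), tools=("poetry", "ffmpeg")):
    """Return a mapping of requirement name to True (found) or False (missing)."""
    results = {"python": sys.version_info >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```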

Installation

Clone the repository:

```
git clone https://github.com/the-momentum/notetaker
cd notetaker
```

Install dependencies:

```
# For API only (recommended for production)
poetry install --without demo --without dev

# With demo interface (for testing and demonstration)
poetry install --with demo --without dev
```

(back to top)

📝 Usage

Configuration

Set up environment variables:
```
cp .env.example .env
```
Edit the .env file with your specific configuration.
Start the application:
```
./run.sh
```
The API will be available at http://localhost:8001 by default.
Access the API documentation:
- Swagger UI: http://localhost:8001/docs
- ReDoc: http://localhost:8001/redoc
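
After starting the application, a quick check that the documentation page responds can confirm the server is up. This is a minimal sketch using only the standard library; `api_urls` and `is_reachable` are hypothetical helper names, and the host and port defaults simply mirror the addresses listed above.

```python
import urllib.request


def api_urls(host="localhost", port=8001):
    """Build the base, Swagger UI, and ReDoc URLs for a given host and port."""
    base = f"http://{host}:{port}"
    return {"base": base, "swagger": f"{base}/docs", "redoc": f"{base}/redoc"}


def is_reachable(url, timeout=2.0):
    """Return True if the URL answers with a 2xx/3xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    swagger = api_urls()["swagger"]
    print(f"{swagger} reachable: {is_reachable(swagger)}")
```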

Environment Variables

| Variable | Description | Example Value |
|---|---|---|
| PROJECT_NAME | Name used for logging and display | `Notetaker AI` |
| BACKEND_CORS_ORIGINS | Allowed CORS origins | `["http://localhost:8000"]` |
| HOST | Host address for API availability | `0.0.0.0` |
| PORT | Port for the API server | `8001` |
| OLLAMA_URL | Base URL for Ollama server | `http://localhost:11434` |
| LLM_MODEL | LLM model name | `llama3.2` |
| USE_LOCAL_MODELS | Whether to use local models | `True` |
| WHISPER_MODEL | Whisper model type | `turbo` |
| WHISPER_DEVICE | Device for running Whisper | `cpu` or `cuda` |
| WHISPER_COMPUTE_TYPE | Compute type for Whisper | `int8` |
| WHISPER_BATCH_SIZE | Batch size for processing | `16` |
| HF_API_KEY | Hugging Face API key | `hf_...` |
| OPENAI_API_KEY | OpenAI API key | `sk-proj-...` |
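
Several of these values are not plain strings: `PORT` and `WHISPER_BATCH_SIZE` are integers, `USE_LOCAL_MODELS` is a boolean, and `BACKEND_CORS_ORIGINS` is a JSON list. The sketch below shows one way such values could be parsed and validated before launch. It is an illustration under stated assumptions, not the application's actual settings loader; `parse_env_text` and `typed_settings` are hypothetical names, and the fallback defaults are just the example values from the table.

```python
import json


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one matching pair of surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def typed_settings(raw):
    """Convert selected raw strings to typed values (defaults are assumptions)."""
    device = raw.get("WHISPER_DEVICE", "cpu")
    if device not in {"cpu", "cuda"}:
        raise ValueError(f"WHISPER_DEVICE must be 'cpu' or 'cuda', got {device!r}")
    return {
        "PORT": int(raw.get("PORT", "8001")),
        "WHISPER_BATCH_SIZE": int(raw.get("WHISPER_BATCH_SIZE", "16")),
        "USE_LOCAL_MODELS": raw.get("USE_LOCAL_MODELS", "True").lower()
        in {"1", "true", "yes"},
        "BACKEND_CORS_ORIGINS": json.loads(raw.get("BACKEND_CORS_ORIGINS", "[]")),
        "WHISPER_DEVICE": device,
    }


sample = """
# Example values copied from the table
PROJECT_NAME="Notetaker AI"
BACKEND_CORS_ORIGINS=["http://localhost:8000"]
PORT=8001
USE_LOCAL_MODELS=True
WHISPER_DEVICE=cpu
WHISPER_BATCH_SIZE=16
"""

print(typed_settings(parse_env_text(sample)))
```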

⚠️ Note: The transcription output length depends on the selected model's token limit. If the transcription is too long, it may be truncated or cause errors. Choose a model appropriate for the expected transcription length to ensure complete results.
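
One way to anticipate this limit is to estimate a transcript's token count before summarizing it. The sketch below assumes roughly four characters per token, a common rule of thumb for English text that real tokenizers will not match exactly. `token_limit` is a placeholder you would set from your chosen model's documentation. Splitting the transcript is shown only as a possible workaround; it is not something this README says Notetaker AI does internally.

```python
def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate via ceiling division; real tokenizers will differ."""
    return -(-len(text) // chars_per_token)


def split_for_limit(text, token_limit, chars_per_token=4):
    """Split text on whitespace into chunks whose estimated size fits token_limit.

    A single word longer than the limit is kept whole as its own chunk.
    """
    max_chars = token_limit * chars_per_token
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


transcript = "word " * 100  # stand-in for a real transcript
limit = 10  # placeholder limit, far smaller than any real model's
print(estimate_tokens(transcript) <= limit)
print(len(split_for_limit(transcript, limit)))
```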

(back to top)

🖥️ Demo

The interactive Gradio demo provides a user-friendly interface to experience Notetaker AI's capabilities without writing code.

Running the Demo

Install demo dependencies (if not already done):
```
poetry install --with demo --without dev
```
Configure the demo: Update demo/.env.demo with your API base URL.
Launch the integrated demo:
```
./run.sh --demo
```
This starts both the API and Gradio interface.
Or run the demo separately (if API is already running):
```
poetry run python demo/ui.py
```

The demo will be available at http://localhost:7860.

Demo Features

📁 Upload or Record: Submit audio files or record directly in your browser
⚙️ Configure Options: Set parameters for transcription and summarization
📊 Format Selection: Choose between different summary formats
⏱️ Real-time Processing: Watch as your audio is transcribed and summarized
💾 Download Results: Save output as JSON for further use

(back to top)

🐳 Docker Setup

For consistent deployment across environments, use our Docker setup.

Quick Commands

```
# Build the Docker images
just docker-build

# Rebuild without using cache
just docker-rebuild

# Run the API only
just docker-up

# Run API with Gradio demo
just docker-demo
```

Access Points

API: http://localhost:8001
API Documentation:
- Swagger UI: http://localhost:8001/docs
- ReDoc: http://localhost:8001/redoc
Gradio Demo (if enabled): http://localhost:7860

(back to top)

🗺️ Roadmap

We're continuously enhancing Notetaker AI with new capabilities. Here's what's on the horizon:

OpenAI API Integration: Direct connection to Whisper via OpenAI API
Expanded LLM Support: Integration with additional LLM providers
Enhanced Note Formats: More specialized formats and improved customization options
Performance Optimizations: Faster processing for large audio files

Have a suggestion? We'd love to hear from you! Contact us or contribute directly.

👥 Contributors

(back to top)

📄 License

Distributed under the MIT License. See LICENSE for more information.

Built with ❤️ by Momentum • Turning conversations into structured knowledge
