This project provides a graphical user interface (GUI) for real-time audio transcription using Whisper.
Whisper Stream GUI streams audio to the Whisper model and displays the transcription as it is produced, through a simple, easy-to-use interface.
To install and run Whisper Stream GUI, follow these steps:
- Prerequisites:
  - Python 3.9 or later installed on your system. You can download Python from https://www.python.org/.
  - The pip package installer. pip is usually included with Python installations.
  - PyTorch later than version 1.0.0 (the latest version is fine). Follow the installation instructions on https://pytorch.org/get-started/locally/ to install the correct version for your system.
- Install Whisper: Follow the instructions in the Whisper repository: https://github.com/openai/whisper.
- Install Conda (optional but recommended): Conda is the recommended way to manage your Python environment. You can download it from https://www.anaconda.com/products/distribution.
- Create a Conda environment (optional but recommended): If you use Conda, create and activate a new environment for this project:
    conda create -n whisper python=3.9
    conda activate whisper
- Clone the repository to your local machine and change into the project directory:
    git clone https://github.com/felixszeto/whisper-stream-gui.git
    cd whisper-stream-gui
- Install dependencies: From the project directory, install the required Python packages using pip:
    pip install -r requirements.txt
- Run the GUI: Execute the start_gui.bat batch file (Windows) to launch the Whisper Stream GUI application:
    start_gui.bat
  This will start the Gradio interface in your web browser.
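The prerequisites above can be checked before launching the GUI. The following is a minimal sketch, not part of the project: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the check for a `whisper` module assumes Whisper was installed from the openai/whisper repository.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a version string such as '2.1.0+cu118' into a tuple of ints."""
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum` (both tuples of ints)."""
    return tuple(version) >= tuple(minimum)


def check_environment():
    """Collect human-readable problems found in the current environment."""
    problems = []
    if not meets_minimum(sys.version_info[:3], (3, 9)):
        problems.append("Python 3.9 or later is required")
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
    else:
        import torch  # imported lazily so the check works without PyTorch
        if not parse_version(torch.__version__) > (1, 0, 0):
            problems.append("PyTorch must be newer than 1.0.0")
    # Assumption: the openai/whisper package is importable as `whisper`.
    if importlib.util.find_spec("whisper") is None:
        problems.append("Whisper is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("Problem:", problem)
```

Running the script inside the activated environment prints one line per problem and prints nothing when every check passes.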
HTTPS Certificate Configuration (Required for Microphone Access): Browsers restrict microphone access over non-secure HTTP, so the GUI must be served over HTTPS with SSL certificates. To configure HTTPS:
- Modify the app.py script to specify the paths to your SSL certificate files.
  - Find the app.launch() call in app.py.
  - Adjust the ssl_keyfile and ssl_certfile parameters to the correct paths of your key and certificate files.
  - Ensure that the certificate files (key.pem and chain.pem by default) are placed in the ssl/ directory, or update the paths accordingly in the script.
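A missing certificate file is easier to diagnose before the server starts. The sketch below builds the two launch parameters named above from the default file names and fails early if either file is absent. The helper `ssl_launch_kwargs` is hypothetical and is not part of app.py.

```python
from pathlib import Path


def ssl_launch_kwargs(ssl_dir="ssl", keyfile="key.pem", certfile="chain.pem"):
    """Return keyword arguments for launch() after checking the files exist."""
    key_path = Path(ssl_dir) / keyfile
    cert_path = Path(ssl_dir) / certfile
    missing = [str(p) for p in (key_path, cert_path) if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing SSL files: " + ", ".join(missing))
    return {"ssl_keyfile": str(key_path), "ssl_certfile": str(cert_path)}


# Hypothetical usage inside app.py, assuming `app` is the application object:
# app.launch(**ssl_launch_kwargs())
```

Passing the paths as keyword arguments keeps the file locations in one place, so moving the certificates only requires changing the arguments to the helper.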
The default model is faster-whisper-large-v3.
Please download the faster-whisper-large-v3 model files from Hugging Face and place them in the faster-whisper-large-v3/ folder.
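The download can also be scripted. This is a sketch under two assumptions that the README does not confirm: that the `huggingface_hub` package is installed, and that the intended model lives in the `Systran/faster-whisper-large-v3` repository on Hugging Face. The helper name `ensure_model` is hypothetical.

```python
from pathlib import Path


def ensure_model(local_dir="faster-whisper-large-v3",
                 repo_id="Systran/faster-whisper-large-v3"):
    """Download the model files into `local_dir` unless it already has files."""
    target = Path(local_dir)
    if target.is_dir() and any(target.iterdir()):
        return target  # folder already populated, nothing to download
    # Imported lazily so the check above works without huggingface_hub.
    from huggingface_hub import snapshot_download
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `ensure_model()` from the project directory leaves an existing, non-empty folder untouched and otherwise fetches the repository contents into it.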
You can change the Whisper model used for transcription by modifying the app.py file.
- Open the app.py file.
- Locate the line model = whisper.load_model("tiny").
- Replace "tiny" with the desired model name. Available models are: tiny, tiny.en, base, base.en, small, small.en, medium, medium.en, large, and turbo.
Size | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |
---|---|---|---|---|---|
tiny | 39 M | tiny.en | tiny | ~1 GB | ~10x |
base | 74 M | base.en | base | ~1 GB | ~7x |
small | 244 M | small.en | small | ~2 GB | ~4x |
medium | 769 M | medium.en | medium | ~5 GB | ~2x |
large | 1550 M | N/A | large | ~10 GB | 1x |
turbo | 809 M | N/A | turbo | ~6 GB | ~8x |
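The table can be expressed as a small lookup that maps a size to a valid model name and lists the sizes that fit a given amount of VRAM. This is an illustrative sketch: `MODELS`, `model_name`, and `sizes_that_fit` are hypothetical names, and the VRAM figures are the approximate values from the table.

```python
# Rows copied from the table: (size, has English-only variant, approx. VRAM in GB).
MODELS = [
    ("tiny", True, 1),
    ("base", True, 1),
    ("small", True, 2),
    ("medium", True, 5),
    ("large", False, 10),
    ("turbo", False, 6),
]


def model_name(size, english_only=False):
    """Return the model name for `size`, e.g. 'small.en' or 'large'."""
    for name, has_en, _ in MODELS:
        if name == size:
            if english_only and not has_en:
                raise ValueError(f"{size} has no English-only variant")
            return f"{size}.en" if english_only else size
    raise ValueError(f"unknown model size: {size}")


def sizes_that_fit(vram_gb):
    """List the sizes whose approximate VRAM requirement fits in `vram_gb`."""
    return [name for name, _, need in MODELS if need <= vram_gb]
```

For example, `model_name("small", english_only=True)` returns `"small.en"`, which could then be passed to `whisper.load_model` in place of `"tiny"`.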
- Once the GUI is running in your browser, you can select your audio input source.
- Start streaming audio.
- The transcribed text will be displayed in real-time in the GUI.
Project file structure:
whisper-stream-gui/
├── .gitignore
├── app.py
├── LICENSE
├── readme_zh.md
├── README.md
├── requirements.txt
├── silero_vad.jit
├── start_gui.bat
├── vad.py
├── css/
│   ├── all.min.css
│   └── bulma.min.css
├── faster-whisper-large-v3/
├── js/
│   └── socket.io.js
├── ssl/
└── templates/
    └── index.html
- app.py: This Python script contains the Gradio UI application code.
- start_gui.bat: This batch file is used to start the Gradio GUI application.
- ssl/: This directory may contain SSL certificate files if the GUI is configured to run over HTTPS.
For any issues or questions, please refer to the project documentation or contact the maintainers.