Stars
NVIDIA Nemo Parakeet TDT 0.6B V2 Audio to Text Python Script
GeoAI: Artificial Intelligence for Geospatial Data
This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.
Document to Markdown OCR library with Llama 3.2 vision
A Whisper interface library for Jupyter notebooks
Record voice notes & transcribe, summarize, and get tasks
Streamlit component for recording audio from the user's microphone and/or performing speech-to-text easily
[WACV 2023] Information and scripts for the CropAndWeed Dataset
Integrate OpenAI's speech-to-text Whisper with your computer's keyboard