Stars
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
Noise suppression plugin based on Xiph's RNNoise
DFloat11: Lossless LLM Compression for Efficient GPU Inference
Dolphin is a multilingual, multitask ASR model jointly trained by DataoceanAI and Tsinghua University.
An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
On-device wake word detection powered by deep learning
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Iranian/Persian datasets.
A collection of inspiring lists, repos, datasets, models, tools and more for Persian-language speech-to-text (STT) and text-to-speech (TTS).
A collection of Persian text-to-speech models using implementations and techniques.
Port of OpenAI's Whisper model in C/C++
🏡 Open source home automation that puts local control and privacy first.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.
🧠 Leon is your open-source personal assistant.
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
Markdown and RST files of the articles on https://www.pythonforthelab.com
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
The dataset for drone-based detection and tracking is released, including images, videos, and annotations.
Whisper realtime streaming for long speech-to-text transcription and translation