Stars
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
A state-of-the-art semi-supervised method for image recognition
Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode
A fast and lightweight python-based CTC beam search decoder for speech recognition.
Efficient, scalable and enterprise-grade CPU/GPU inference server for Hugging Face transformer models
Python library for downloading, loading & working with sound datasets
Convert images of LaTeX math equations into LaTeX code.
Distributed Asynchronous Hyperparameter Optimization in Python
Self-Supervised Speech Pre-training and Representation Learning Toolkit
A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
An open-source NLP research library, built on PyTorch.
Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
torch-optimizer -- collection of optimizers for PyTorch
Production First and Production Ready End-to-End Speech Recognition Toolkit
A curated list of awesome self-supervised methods
On-device wake word detection powered by deep learning
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
60+ Implementations/tutorials of deep learning papers with side-by-side notes; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."