Stars
Maid is a cross-platform Flutter app for interfacing with GGUF / llama.cpp models locally, and with Ollama and OpenAI models remotely.
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
A simplified version of wav2vec (1.0, vq, 2.0) in fairseq
Port of OpenAI's Whisper model in C/C++
zero-shot voice conversion & singing voice conversion, with real-time support
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
152334H / DL-Art-School
Forked from neonbjb/DL-Art-School
TorToiSe fine-tuning with DLAS
A multi-voice TTS system trained with an emphasis on quality
Unofficial vits2-TTS implementation in PyTorch
Application of MB-iSTFT-VITS components to vits2_pytorch
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.