A non-autoregressive Transformer-based text-to-speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, …

Python 325 42 Updated Sep 24, 2022

thorstenMueller / Thorsten-Voice

Thorsten-Voice: A free-to-use, offline, high-quality German TTS voice that should be available to every project without any licensing struggles.

Python 610 53 Updated Jan 8, 2025

huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Python 20,092 2,823 Updated May 7, 2025

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 144,109 28,894 Updated May 10, 2025

rishikksh20 / hifigan-denoiser

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Python 214 44 Updated Apr 8, 2021

Emotional-Text-to-Speech / dl-for-emo-tts

💻 🤖 A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈

Jupyter Notebook 448 44 Updated Jun 26, 2024

huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

Python 28,909 5,945 Updated May 10, 2025

AndreevP / wvmos

MOS score prediction by fine-tuned wav2vec2.0 model

Python 157 21 Updated Oct 20, 2022

AI-Unicamp / TTS-Objective-Metrics

Objective metrics used in several text-to-speech (TTS) papers.

Python 48 9 Updated Apr 22, 2022

nkrao220 / accent-classification

Accent Classification in Speech

Python 25 5 Updated Jul 24, 2019

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 5,743 549 Updated Mar 24, 2025

Audio-AGI / AudioSep

Official implementation of "Separate Anything You Describe"

Python 1,729 126 Updated Nov 26, 2024

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 14,187 2,840 Updated May 10, 2025

Edresson / YourTTS

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Jupyter Notebook 969 84 Updated Nov 4, 2024

miguelvalente / whisperer

Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.

Jupyter Notebook 137 12 Updated Aug 14, 2023

NVIDIA / radtts

Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, Diverse Synthesis, and Generative Modeling and Fine-Grained …

Roff 286 40 Updated Apr 6, 2023

HLTSingapore / Emotional-Speech-Data

This is the GitHub page for publicly available emotional speech data.

346 25 Updated Jan 6, 2022

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 31,411 6,514 Updated Jan 9, 2025

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Python 9,800 1,486 Updated May 8, 2025

riffusion / riffusion-hobby

Stable diffusion for real-time music generation

Python 3,675 438 Updated Jul 22, 2024

diff-usion / Awesome-Diffusion-Models

A collection of resources and papers on Diffusion Models

HTML 11,704 977 Updated Aug 1, 2024

Aya Aya-AlJafari

Stars

Camb-ai / MARS5-TTS

nvidia-riva / python-clients

Edresson / Wav2Vec-Wrapper

kmario23 / KenLM-training

openai / whisper

ggml-org / llama.cpp

SYSTRAN / faster-whisper

jaywalnut310 / vits

coqui-ai / Trainer

keonlee9420 / Comprehensive-Transformer-TTS