Stars
A Python package to analyze and compare voices with deep learning
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730
Audio style transfer with a shallow CNN using random parameters.
PyTorch implementation of GAN-based text-to-speech synthesis and voice conversion (VC)
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer-based neural network for text-to-speech.
Official Code for Assem-VC @ICASSP2022
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Units.
This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"