Rx.Hang (DonkeyHang)

Audio Effects / DL

Starred repositories

sooftware / conformer

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Python 1,027 186 Updated Dec 22, 2023

thewh1teagle / kokoro-onnx

TTS with Kokoro and ONNX Runtime

Python 1,966 190 Updated May 10, 2025

antgroup / echomimic

[AAAI 2025] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Python 3,853 426 Updated Dec 10, 2024

lmxue / Audio-FLAN

Audio-FLAN

149 4 Updated Mar 6, 2025

yeyupiaoling / VoiceprintRecognition-Pytorch

This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE, ERes2Net, CAM++, etc. More models may be supported in the future. At the …

Python 988 143 Updated Apr 29, 2025

jixiaozhong / Sonic

Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"

Python 2,697 229 Updated May 7, 2025

CarlWangChina / SaMoye-SVC

dog-can-sing-song

Python 23 4 Updated May 11, 2025

grazder / DeepFilterNet

Forked from Rikorose/DeepFilterNet

Noise suppression using deep filtering

Python 27 4 Updated May 23, 2024

wwmm / easyeffects

Limiter, compressor, convolver, equalizer and auto volume and many other plugins for PipeWire applications

C++ 7,245 292 Updated May 13, 2025

wangzhaode / mnn-asr

MNN ASR demo.

C++ 16 1 Updated Mar 24, 2025

kaiidams / soundstream-pytorch

Unofficial PyTorch implementation of SoundStream with training code and a 16 kHz pretrained checkpoint

Python 68 9 Updated Jun 25, 2023

google-deepmind / librispeech-long

LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation with Spoken Language Models" (arXiv 2024).

65 1 Updated Dec 28, 2024

facebookresearch / SONAR

SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.

Python 760 83 Updated Apr 1, 2025

jingyaogong / minimind-v

🚀 [Large Model] Train a 26M-parameter visual multimodal VLM from scratch in just 1 hour! 🌏

Python 3,466 341 Updated Apr 27, 2025

jingyaogong / minimind

🚀🚀 [Large Model] Train a small 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 20,774 2,427 Updated Apr 30, 2025

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,759 215 Updated Apr 30, 2025

AaronZ345 / TCSinger

PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control

Python 342 40 Updated May 11, 2025

ant-research / MagicQuill

[CVPR'25] Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System

Python 3,383 339 Updated Apr 15, 2025

edwko / OuteTTS

Interface for OuteTTS models.

Python 1,222 103 Updated Apr 29, 2025

JusperLee / TDANet

An efficient speech separation method

Python 274 34 Updated Apr 11, 2024

hayeong0 / DDDM-VC

Official PyTorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

Python 225 23 Updated Jul 31, 2024