DNSMOSPro Public
Forked from fcumlin/DNSMOSPro
Official implementation of DNSMOS Pro (accepted at INTERSPEECH 2024).
Python MIT License Updated Mar 17, 2025
S3Tokenizer Public
Forked from xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
Python Apache License 2.0 Updated Dec 27, 2024
InspireMusic Public
Forked from FunAudioLLM/InspireMusic
InspireMusic: A fundamental toolkit for music, song and audio generation.
Python Apache License 2.0 Updated Dec 11, 2024
Music-Source-Separation-Training Public
Forked from ZFTurbo/Music-Source-Separation-Training
Repository for training models for music source separation.
Python MIT License Updated Dec 3, 2024
speech-trident Public
Forked from ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
Updated Dec 2, 2024
MuseTalk Public
Forked from TMElyralab/MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Python Other Updated Nov 15, 2024
python-audio-separator Public
Forked from nomadkaraoke/python-audio-separator
Easy to use stem (e.g. instrumental/vocals) separation from CLI or as a python package, using a variety of amazing pre-trained models (primarily from UVR)
Python MIT License Updated Nov 4, 2024
bitsandbytes Public
Forked from bitsandbytes-foundation/bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
Python MIT License Updated Oct 22, 2024
DiariZen Public
Forked from BUTSpeechFIT/DiariZen
A toolkit for speaker diarization.
Jupyter Notebook MIT License Updated Oct 21, 2024
F5-TTS Public
Forked from SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Python MIT License Updated Oct 16, 2024
to-jyutping Public
Forked from CanCLID/to-jyutping
Cantonese Pronunciation Automatic Labeling Tool
TypeScript BSD 2-Clause "Simplified" License Updated Sep 30, 2024
ToJyutping Public
Forked from CanCLID/ToJyutping
Cantonese Pronunciation Automatic Labeling Tool
Python BSD 2-Clause "Simplified" License Updated Sep 24, 2024
ctc-forced-aligner Public
Forked from MahmoudAshraf97/ctc-forced-aligner
Text to speech alignment using CTC forced alignment
Python Updated Sep 22, 2024
NeMo-text-processing Public
Forked from NVIDIA/NeMo-text-processing
NeMo text processing for ASR and TTS
Python Apache License 2.0 Updated Sep 19, 2024
BigCodec Public
Forked from Aria-K-Alethia/BigCodec
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
Python MIT License Updated Sep 19, 2024
super-monotonic-align Public
Forked from supertone-inc/super-monotonic-align
Python MIT License Updated Sep 14, 2024
LangSegment Public
Forked from JaccoSu/juntaosun_LangSegment
A multilingual (97 languages) tool for automatic recognition and segmentation of text content. A powerful automatic segmentation tool for mixed-language TTS text.
Python Updated Sep 7, 2024
LLaMA-Factory Public
Forked from hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
Python Apache License 2.0 Updated Sep 2, 2024
text-labeler Public
Forked from fishaudio/text-labeler
A simple svs labeling tool
TypeScript Apache License 2.0 Updated Aug 19, 2024
SimpleSpeech Public
Forked from yangdongchao/SimpleSpeech
The open source code for SimpleSpeech series
Python Updated Aug 19, 2024
DeepFilterNet Public
Forked from Rikorose/DeepFilterNet
Noise suppression using deep filtering
Python Other Updated Jul 31, 2024
open_clip Public
Forked from mlfoundations/open_clip
An open source implementation of CLIP.
Python Other Updated Jul 4, 2024
AudioLDM2 Public
Forked from haoheliu/AudioLDM2
Text-to-Audio/Music Generation
Python Other Updated Jun 27, 2024
LibriTTS-P Public
Forked from line/LibriTTS-P
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
Updated Jun 13, 2024
mamba Public
Forked from state-spaces/mamba
Mamba SSM architecture
Python Apache License 2.0 Updated Jun 5, 2024
lina-speech Public
Forked from theodorblackbird/lina-speech
lina-speech: linear attention based text-to-speech
Jupyter Notebook Other Updated Jun 3, 2024