-
Kimi-Audio Public
Forked from MoonshotAI/Kimi-Audio
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
Python Updated Apr 28, 2025 -
Orpheus-TTS Public
Forked from canopyai/Orpheus-TTS
TTS Towards Human-Sounding Speech
Python Apache License 2.0 Updated Mar 23, 2025 -
silentcipher Public
Forked from SesameAILabs/silentcipher
Deep Audio Watermarking: audio watermark
Python MIT License Updated Mar 17, 2025 -
Spark-TTS Public
Forked from SparkAudio/Spark-TTS
Spark-TTS Inference Code
Python Apache License 2.0 Updated Mar 5, 2025 -
async_cosyvoice Public
Forked from qi-hua/async_cosyvoice
Accelerating CosyVoice2 inference with vLLM
Jupyter Notebook Apache License 2.0 Updated Mar 2, 2025 -
TTS-LLaSA_training Public
Forked from zhenye234/LLaSA_training
LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
Python Other Updated Feb 14, 2025 -
unsloth-LLM-finetuning Public
Forked from unslothai/unsloth
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory
Python Apache License 2.0 Updated Feb 10, 2025 -
Qwen-Agent Public
Forked from QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
Python Other Updated Jan 24, 2025 -
MiniCPM-o Public
Forked from OpenBMB/MiniCPM-o
Multimodal speech LLM: MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Python Apache License 2.0 Updated Jan 17, 2025 -
google-research Public
Forked from google-research/google-research
Google Research
Jupyter Notebook Apache License 2.0 Updated Jan 9, 2025 -
data-Thorsten-Voice Public
Forked from thorstenMueller/Thorsten-Voice
speech data: Thorsten-Voice: A free to use, offline working, high quality German TTS voice should be available for every project without any license struggling.
Python Creative Commons Zero v1.0 Universal Updated Jan 8, 2025 -
openai-cookbook Public
Forked from openai/openai-cookbook
Examples and guides for using the OpenAI API
MDX MIT License Updated Jan 8, 2025 -
vector-quantize-pytorch Public
Forked from lucidrains/vector-quantize-pytorch
Vector (and Scalar) Quantization, in Pytorch
Python MIT License Updated Jan 7, 2025 -
pycorrector Public
Forked from shibing624/pycorrector
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box.
Python Apache License 2.0 Updated Dec 26, 2024 -
versa Public
Forked from wavlab-speech/versa
Versatile Evaluation of Speech and Audio
Python Apache License 2.0 Updated Dec 25, 2024 -
WavChat Public
Forked from jishengpeng/WavChat
A Survey of Spoken Dialogue Models (60 pages)
Updated Nov 28, 2024 -
snac Public
Forked from hubertsiuzdak/snac
audio codec: Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
Python MIT License Updated Nov 19, 2024 -
streaming-ChatTTS Public
Forked from pengzhendong/streaming-ChatTTS
Jupyter Notebook Apache License 2.0 Updated Oct 30, 2024 -
GLM-4-Voice Public
Forked from THUDM/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model; the TTS quality is good
Python Apache License 2.0 Updated Oct 30, 2024 -
spiritlm Public
Forked from facebookresearch/spiritlm
Emotion-preserving audio LLM: Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
Python Other Updated Oct 28, 2024 -
SNAC-Vocos Public
Forked from hertz-pj/SNAC-Vocos
A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos.
Python Updated Oct 28, 2024 -
midi-fluidsynth Public
Forked from FluidSynth/fluidsynth
MIDI playback: Software synthesizer based on the SoundFont 2 specifications
C GNU Lesser General Public License v2.1 Updated Oct 20, 2024 -
amt-apc Public
Forked from misya11p/amt-apc
Music, automatic piano cover: AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model
Python MIT License Updated Oct 19, 2024 -
qa-mdt Public
Forked from ivcylc/OpenMusic
Text-to-music generation: 241010-SOTA Text-to-music (TTM) Generation (OpenMusic)
Python MIT License Updated Oct 9, 2024 -
ml-depth-pro Public
Forked from apple/ml-depth-pro
Apple depth map estimation: Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
Python Other Updated Oct 5, 2024 -
tiktoken-openai Public
Forked from openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Python MIT License Updated Oct 3, 2024 -
GOT-OCR2.0 Public
Forked from Ucas-HaoranWei/GOT-OCR2.0
2024, a handy OCR tool: Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Python Updated Sep 25, 2024 -
moshi_chat_LLM Public
Forked from kyutai-labs/moshi
Streaming, real-time conversational LLM
Python Apache License 2.0 Updated Sep 19, 2024 -
godot Public
Forked from godotengine/godot
Godot Engine – Multi-platform 2D and 3D game engine
C++ MIT License Updated Sep 18, 2024 -
FluxMusic Public
Forked from feizc/FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
Python Other Updated Sep 6, 2024