Stars
OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.
GPT-4o-level, real-time spoken dialogue system.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Fully open reproduction of DeepSeek-R1
🇨🇳 Chinese sticker pack, more joy / A museum of memes, the most "toxic" repo on GitHub, a grand collection of Chinese memes, gather the fun~
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
MindSpore online courses: Step into LLM
Llama Chinese community: real-time aggregation of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama LLMs, fully open source and commercially usable
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation.
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech), reproduced demo: https://lifeiteng.github.io/valle/index.html
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
KAN-TTS is a speech-synthesis training framework; please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
A timeline of the latest AI models for audio generation, starting in 2023!