AIGC
Stable Diffusion web UI
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
kaldi-asr/kaldi is the official location of the Kaldi project.
SoftVC VITS Singing Voice Conversion
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
12 Weeks, 24 Lessons, AI for All!
Open source real-time translation app for Android that runs locally
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Unofficial Implementation of Animate Anyone by Novita AI
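AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.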
🍒 Cherry Studio is a desktop client that supports multiple LLM providers.
Interface for quickly invoking Ollama, Dify, and Xinference AI models on macOS.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.