Stars
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
A high-throughput and memory-efficient inference and serving engine for LLMs
《开源大模型食用指南》针对中国宝宝量身打造的基于Linux环境快速微调(全参数/Lora)、部署国内外开源大模型(LLM)/多模态大模型(MLLM)教程
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
本项目为xiaozhi-esp32提供后端服务,帮助您快速搭建ESP32设备控制服务器。Backend service for xiaozhi-esp32, helps you quickly build an ESP32 device control server.
官方推荐的 ChatTTS 资源汇总项目,整理了全网相关资源和常见问题 || Officially recommended ChatTTS resource collection project
一个简单的本地网页界面,使用ChatTTS将文字合成为语音,同时支持对外提供API接口。A simple native web interface that uses ChatTTS to synthesize text into speech, along with support for external API interfaces.
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
分享一些好用的 Dify DSL 工作流程,自用、学习两相宜。 Sharing some Dify workflows.
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
Dify-Plus 是 Dify 的企业级增强版,集成了基于 gin-vue-admin 的管理中心,并针对企业场景进行了功能优化。 🚀 Dify-Plus = 管理中心 + Dify 二开 。 特别说明: 本项目为开源社区的二次开发成果,严格遵循 Dify 原项目的版权许可协议,未涉及原项目许可的多租户功能及 logo 等版权信息。如有相关需求,请直接联系 Dify 官方获取授权与支持。
TTSFM is a reverse-engineered API server that mirrors OpenAI's TTS service, providing a compatible interface for text-to-speech conversion with multiple voice options.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A generative speech model for daily dialogue.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
UniVRM is a gltf-based VRM format implementation for Unity. English is here https://vrm.dev/en/ . 日本語 はこちら https://vrm.dev/
使用unity实现AI聊天相关功能。目前这个库包含了对chatgpt、chatglm等大语言模型的api调用的代码实现以及实现了微软Azure以及百度AI的语音服务功能,语音服务均采用web api实现,支持Windows/WebGL/Android等平台
MFCC-based LipSync plug-in for Unity using Job System and Burst Compiler
fay是一个帮助数字人(2.5d、3d、移动、pc、网页)或大语言模型(openai兼容、deepseek)连通业务系统的mcp框架。
Real time interactive streaming digital human
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…