Stars
Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.
This is a repository dedicated to pre-trained acoustic models of Hong Kong Cantonese and Cantonese forced alignment.
電腦用漢字粵語拼音表 / Cantonese Pronunciation List of the Characters for Computers
A free, open-source, offline Cantonese Dictionary for Windows, Mac, and Linux. Qt, SQLite. C++ and Python.
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
Lightweight coding agent that runs in your terminal
Introductory PyTorch tutorial. Read online at: https://datawhalechina.github.io/thorough-pytorch/
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Your AI Operator for Web, Android, Automation & Testing.
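百聆 (Bailing) is a GPT-4o-like voice conversation bot built on ASR + LLM + TTS. It integrates strong large models such as DeepSeek R1, has latency as low as 800 ms, runs on low-spec machines such as Macs, and supports interruption.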
Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.
Production-ready platform for agentic workflow development.
Easily train a good VC model with voice data <= 10 mins!
Memory for AI Agents; Announcing OpenMemory MCP - local and secure memory management.
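OpenAI API management & distribution system, adapted from songquanpeng/one-api. Supports more models, adds a statistics page, and improves function calling for non-OpenAI models.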
Official inference repo for FLUX.1 models
🍂 A .NET library for manipulating PowerPoint presentations
Office PowerPoint (.pptx) file to JSON | Converts PPTX files into readable JSON data
AI-powered PPT generation from a topic, file, URL, and other inputs. Supports parsing and rendering of complex PPT features such as native charts, animations, and 3D effects; supports user-defined templates and intelligent animation insertion; can be tried online.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Multilingual Voice Understanding Model
Sample programs and a collection of topics and questions for Garmin Connect IQ development.
A PyTorch implementation of EfficientNet