Stars
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Janus-Series: Unified Multimodal Understanding and Generation Models
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
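Practical Python tutorials covering Python basics, advanced Python features, object-oriented programming, multithreading, databases, data science, Flask, and web crawler development.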
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, ORPO, and GRPO.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
This is the official implementation of our mesh-based neural network (MESH2IR) to generate acoustic impulse responses (IRs) for indoor 3D scenes represented using a mesh.
An Open-Sourced LLM-empowered Foundation TTS System
DeepEP: an efficient expert-parallel communication library
Solve Visual Understanding with Reinforced VLMs
FlashMLA: Efficient MLA decoding kernels
OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.
Making large AI models cheaper, faster and more accessible
Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄); read online at https://datawhalechina.github.io/easy-rl/
User-friendly Desktop Client App for AI Models/LLMs (GPT, Claude, Gemini, Ollama...)
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
Neural Generalized Cross Correlations: https://arxiv.org/abs/2208.04654
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]
SoundTouch library compiled for iOS: http://www.surina.net/soundtouch/index.html