- diudiustudio
- Beijing
- http://www.diudiustudio.com/
Starred repositories
Artisan control of Skywalker Electric Coffee Roaster via Arduino.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Open-source framework and platform for building real-time, multimodal, low-latency conversational voice AI agents. It features a workflow builder and supports C, C++, Go, Python, JavaScript, and Ty…
This is the Personality Core for GLaDOS, the first steps towards a real-life implementation of the AI from the Portal series by Valve.
🔥 An open-source BI tool that anyone can use, built for data visualization. An open-source alternative to Tableau.
A simple Python Pydantic model for Honkai: Star Rail parsed data from the Mihomo API.
A feature-rich command-line audio/video downloader
Python library for Reinforcement Learning.
An introductory series to Reinforcement Learning (RL) with comprehensive step-by-step tutorials.
Source code for the book "Reinforcement Learning: Theory and Python Implementation"
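500 Questions on Deep Learning: a question-and-answer treatment of frequently asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas, intended to help the author and any readers who need it. The book has 18 chapters and more than 500,000 characters. Given the author's limited expertise, readers are kindly invited to point out any shortcomings. To be continued... For collaboration, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan 2018.06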
State-of-the-art 2D and 3D Face Analysis Project
OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
🪄 Create rich visualizations with AI
PyTorch implementation of "SMITE: Segment Me In TimE" (ICLR 2025)
A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API
Speech To Speech: an effort for an open-sourced and modular GPT4-o
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)