-
airbyte Public
Forked from airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Python Other Updated Apr 30, 2025 -
LLMs-from-scratch Public
Forked from rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Jupyter Notebook Other Updated Apr 20, 2025 -
DB-GPT Public
Forked from eosphoros-ai/DB-GPT
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
Python MIT License Updated Apr 3, 2025 -
easy-dataset Public
Forked from ConardLi/easy-dataset
A powerful tool for creating fine-tuning datasets for LLMs
JavaScript Updated Mar 28, 2025 -
distilabel Public
Forked from argilla-io/distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Python Apache License 2.0 Updated Mar 3, 2025 -
ComfyUI Public
Forked from comfyanonymous/ComfyUI
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
Python GNU General Public License v3.0 Updated Mar 1, 2025 -
data-juicer Public
Forked from modelscope/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Python Apache License 2.0 Updated Feb 28, 2025 -
echomimic_v2 Public
Forked from antgroup/echomimic_v2
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Python Apache License 2.0 Updated Feb 27, 2025 -
SkyThought Public
Forked from NovaSky-AI/SkyThought
Sky-T1: Train your own O1 preview model within $450
Python Apache License 2.0 Updated Feb 18, 2025 -
lm-evaluation-harness Public
Forked from EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Python MIT License Updated Feb 17, 2025 -
MiniCPM-o Public
Forked from OpenBMB/MiniCPM-o
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Python Apache License 2.0 Updated Feb 17, 2025 -
open-thoughts Public
Forked from open-thoughts/open-thoughts
Open Thoughts: Fully Open Data Curation for Thinking Models
Python Apache License 2.0 Updated Feb 17, 2025 -
s1 Public
Forked from simplescaling/s1
s1: Simple test-time scaling
Python Apache License 2.0 Updated Feb 12, 2025 -
leedl-tutorial Public
Forked from datawhalechina/leedl-tutorial
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Jupyter Notebook Other Updated Feb 10, 2025 -
ClearerVoice-Studio Public
Forked from modelscope/ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Python Apache License 2.0 Updated Jan 27, 2025 -
awesome-LLM-resourses Public
Forked from WangRongsheng/awesome-LLM-resources
🧑🚀 Summary of the world's best LLM resources.
Apache License 2.0 Updated Jan 21, 2025 -
rule-engine-front-open Public
Forked from rule-engine/rule-engine-front-open
🔥🔥🔥📌 Rule engine front end 📌 RuleEngine: web-based visual configuration that is simple, efficient, and fast.
Vue Apache License 2.0 Updated Dec 3, 2024 -
rule-engine-open Public
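Forked from rule-engine/rule-engine-open
🔥🔥🔥📌 Rule engine, open-source edition 📌 RuleEngine: web-based visual configuration that is simple, efficient, and fast. Business logic no longer depends on code development; complex business logic can be implemented with zero code!
Java Apache License 2.0 Updated Sep 22, 2024 -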
build-your-own-x Public
Forked from codecrafters-io/build-your-own-x
Master programming by recreating your favorite technologies from scratch.
Markdown Updated Sep 3, 2024 -
arms-template-plugin Public
Forked from sionsxie/arms-template-plugin
Template plugin for the arms framework
Kotlin Updated Aug 28, 2024 -
learn-nlp-with-transformers Public
Forked from datawhalechina/learn-nlp-with-transformers
We want to create a repo to illustrate the usage of transformers in Chinese
Shell Updated Aug 18, 2024 -
surya Public
Forked from VikParuchuri/surya
OCR, layout analysis, reading order, line detection in 90+ languages
Python GNU General Public License v3.0 Updated Jul 8, 2024 -
omniparse Public
Forked from adithya-s-k/omniparse
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
Python GNU General Public License v3.0 Updated Jul 5, 2024 -
rag-omni Public
Forked from Logistic98/rag-omni
A retrieval-augmented generation (RAG) example based on the BM25, BGE, and OpenAI Embedding retrieval algorithms, with support for OpenAI-style LLM services
Python Updated Jun 10, 2024 -
DeepKE Public
Forked from zjunlp/DeepKE
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
Python MIT License Updated May 14, 2024 -
OpenMetadata Public
Forked from open-metadata/OpenMetadata
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
TypeScript Apache License 2.0 Updated Apr 17, 2024 -
dinky Public
Forked from DataLinkDC/dinky
Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
Java Apache License 2.0 Updated Apr 1, 2024 -
label-studio Public
Forked from HumanSignal/label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
JavaScript Apache License 2.0 Updated Mar 19, 2024 -
CMB Public
Forked from FreedomIntelligence/CMB
CMB, A Comprehensive Medical Benchmark in Chinese
Python Apache License 2.0 Updated Mar 14, 2024 -
text2vec Public
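Forked from shibing624/text2vec
text2vec, text to vector. A text vector representation tool that converts text into vector matrices. It implements text representation and text similarity models such as Word2Vec, RankBM25, Sentence-BERT, and CoSENT, and works out of the box.
Python Apache License 2.0 Updated Feb 21, 2024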