Stars
Solve Visual Understanding with Reinforced VLMs
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
Generate high-definition short videos with one click using AI large language models.
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Production-ready platform for agentic workflow development.
🔥 An open-source BI tool that anyone can use: a powerful data visualization tool and an alternative to Tableau.
A Comprehensive Toolkit for High-Quality PDF Content Extraction
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment
A collection of awesome video generation studies.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Question and Answer based on Anything.
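A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, including base models, vertical-domain fine-tunes and applications, datasets, and tutorials.
[ICLR'24 spotlight] Chinese-English bilingual multimodal large model series (Chat and Paint), based on the CPM foundation models.
Open-source, free, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Includes built-in libraries for multiple languages.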
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
Utilize the unlimited free GPT-3.5-Turbo API service provided by the login-free ChatGPT Web.
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
Generative Agents: Interactive Simulacra of Human Behavior