Stars
A data annotation toolbox that supports image, audio, and video data.
[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
Awesome multilingual OCR and Document Parsing toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools,…
JHipster is a development platform to quickly generate, develop, & deploy modern web applications & microservice architectures.
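A crack of the official 16.1 release that allows a permanent free trial. Cracks the latest version of Enterprise Architect 16. Enterprise Architect is a comprehensive UML analysis and design tool for system, software, and business modeling. Version 16.1 includes an official Chinese localization.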
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured tr…
Toolkit for linearizing PDFs for LLM datasets/training
YOLO models trained on DocLayNet - power your document intelligence with layout analysis
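Common software engineering document templates and examples: feasibility analysis report, development plan, requirements analysis document, high-level design document, detailed design document, user manual, test plan, test analysis report, development progress report, project development summary report, software maintenance manual, and more.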
R1-onevision, a visual language model capable of deep CoT reasoning.
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Get your documents ready for gen AI
A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The service allows for the segmentation and classification of differen…
Python tool for converting files and office documents to Markdown.
A high-quality, one-stop open-source data extraction tool that converts PDF to Markdown and JSON.
A Ruby gem to liberate content from Microsoft Word documents