Breezedeus (breezedeus)

Benevolent AI produces happiness ❤️

437 followers · 22 following

Achievements

x3 x3

Lists (2)

🔮 Future ideas

1 repository

✨ Inspiration

1 repository

Stars

martin226 / vibe-draw

🎨 Turn your roughest sketches into stunning 3D worlds by vibe drawing

TypeScript 1,803 262 Updated Mar 25, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,394 2,246 Updated May 13, 2025

joanrod / star-vector

StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and te…

Python 3,785 199 Updated Apr 15, 2025

microsoft / GUI-Agent-RL

Python 26 2 Updated Apr 29, 2025

allenai / olmocr

Toolkit for linearizing PDFs for LLM datasets/training

Python 12,359 855 Updated May 13, 2025

CherryHQ / cherry-studio

🍒 Cherry Studio is a desktop client that supports multiple LLM providers.

TypeScript 26,176 2,250 Updated May 14, 2025

deepseek-ai / DeepSeek-VL2

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,810 1,738 Updated Feb 26, 2025

MiniMax-AI / MiniMax-01

The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on Linear Attention

Python 2,610 196 Updated May 12, 2025

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,547 661 Updated Feb 10, 2025

aialt / awesome-mobile-agents

✨✨Latest Papers and Datasets on Mobile and PC GUI Agent

124 8 Updated Nov 29, 2024

microsoft / OmniParser

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 22,047 1,850 Updated Mar 26, 2025

WangRongsheng / awesome-LLM-resources

🧑‍🚀 Summary of the world's best LLM resources (agent frameworks, coding assistants, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).

5,165 512 Updated May 13, 2025

BMPixel / moffee

moffee: Make Markdown Ready to Present

Python 1,181 53 Updated Nov 22, 2024

modelscope / MemoryScope

Python 467 42 Updated Feb 17, 2025

huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Python 4,017 441 Updated Apr 15, 2025

gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,316 283 Updated Nov 5, 2024

QwenLM / Qwen2.5-VL

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 10,355 738 Updated May 4, 2025

TencentQQGYLab / AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 5,805 641 Updated Mar 19, 2025

OpenBMB / MiniCPM-o

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,408 1,401 Updated Mar 3, 2025

netdcy / FlowVision

Waterfall-style image viewer for macOS, offering a smooth and immersive browsing experience.

Swift 763 20 Updated May 11, 2025

character-ai / prompt-poet

Streamlines and simplifies prompt design for both developers and non-technical users with a low code approach.

Python 1,056 92 Updated Mar 21, 2025

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 15,413 1,734 Updated Dec 25, 2024