breezedeus (Breezedeus) / Starred · GitHub

🎨 Turn your roughest sketches into stunning 3D worlds by vibe drawing

TypeScript 1,803 262 Updated Mar 25, 2025

Fully open reproduction of DeepSeek-R1

Python 24,394 2,246 Updated May 13, 2025

StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and te…

Python 3,785 199 Updated Apr 15, 2025
Python 26 2 Updated Apr 29, 2025

Toolkit for linearizing PDFs for LLM datasets/training

Python 12,359 855 Updated May 13, 2025

🍒 Cherry Studio is a desktop client that supports multiple LLM providers.

TypeScript 26,176 2,250 Updated May 14, 2025

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,810 1,738 Updated Feb 26, 2025

The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on Linear Attention

Python 2,610 196 Updated May 12, 2025

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,547 661 Updated Feb 10, 2025

✨✨Latest Papers and Datasets on Mobile and PC GUI Agent

124 8 Updated Nov 29, 2024

A simple screen parsing tool for building pure-vision-based GUI agents

Jupyter Notebook 22,047 1,850 Updated Mar 26, 2025

🧑‍🚀 Summary of the world's best LLM resources (agent frameworks, coding assistants, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).

5,165 512 Updated May 13, 2025

moffee: Make Markdown Ready to Present

Python 1,181 53 Updated Nov 22, 2024
Python 467 42 Updated Feb 17, 2025

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Python 4,017 441 Updated Apr 15, 2025

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,316 283 Updated Nov 5, 2024

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 10,355 738 Updated May 4, 2025

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 5,805 641 Updated Mar 19, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,408 1,401 Updated Mar 3, 2025

Waterfall-style image viewer for macOS, offering a smooth and immersive browsing experience.

Swift 763 20 Updated May 11, 2025

Streamlines and simplifies prompt design for both developers and non-technical users with a low code approach.

Python 1,056 92 Updated Mar 21, 2025

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 15,413 1,734 Updated Dec 25, 2024

Real time interactive streaming digital human

Python 5,552 826 Updated May 1, 2025

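Repository of Chinese post-trained versions of Llama3 and Llama3.1: interesting fine-tuned and modified weights, plus tutorial videos and documentation on training, inference, evaluation, and deployment.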

Python 4,143 340 Updated May 7, 2025

Recipes for shrinking, optimizing, and customizing cutting-edge vision models. 💜

Jupyter Notebook 1,438 113 Updated May 2, 2025

[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention

Python 851 67 Updated Mar 20, 2025

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,723 129 Updated Apr 21, 2025

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Python 2,158 270 Updated May 11, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 48,840 5,940 Updated May 13, 2025

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 25,149 2,538 Updated May 6, 2025