LoYuXr (Yuxuan Luo) / Starred · GitHub

Highlights

  • Pro


MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning

Python 12 Updated Jun 17, 2025
Python 2,246 218 Updated Jun 16, 2025

MINT-1T: A one trillion token multimodal interleaved dataset.

817 19 Updated Jul 31, 2024

Examples and guides for using the OpenAI API

MDX 64,908 10,729 Updated Jun 25, 2025

GenEval: An object-focused framework for evaluating text-to-image alignment

HTML 313 21 Updated Mar 3, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,637 686 Updated Jun 25, 2025

SPEAR: A Simulator for Photorealistic Embodied AI Research

C++ 281 23 Updated Jun 26, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 25,349 2,282 Updated Jun 26, 2025

Plug-and-Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by at least 70%

Python 4,507 369 Updated Oct 29, 2024

Official codebase for the Paper “Retrieval-Augmented Diffusion Models”

Jupyter Notebook 131 8 Updated Apr 5, 2023

Use LLMs to batch-process data: generate or clean data for academic use. Currently supports OCR with a range of large models, including Qwen (Tongyi Qianwen), Moonshot, Baidu PaddleOCR, OpenAI, and LLaVA.

Python 14 3 Updated Sep 15, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,682 672 Updated Feb 10, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 22,206 1,498 Updated Jun 26, 2025

Please do not use a course-grabbing bot to grab courses.

Python 45 5 Updated May 22, 2023

An open-source framework for training large multimodal models.

Python 3,960 306 Updated Aug 31, 2024
Python 8 Updated Sep 1, 2023

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 11,181 1,634 Updated Apr 26, 2025

An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 4,619 350 Updated May 29, 2025

😎 Finding duplicate images made easy!

Python 5,408 466 Updated Jun 26, 2025

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Python 8,417 648 Updated May 29, 2025

Codes for ICLR 2025 Paper: Towards Semantic Equivalence of Tokenization in Multimodal LLM

Python 65 1 Updated Apr 19, 2025

ancient-chat-llm (古语说): an LLM proficient in Chinese culture

Python 43 3 Updated Feb 4, 2024

Kolors Team

Python 4,468 330 Updated Nov 13, 2024

The official PyTorch Implementation for ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation (CVPR 2024)

Python 156 7 Updated Dec 24, 2024

Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction

8 Updated Jun 1, 2025

Compute FID scores with PyTorch.

Python 3,669 526 Updated Jul 3, 2024

Lumina-T2X is a unified framework for Text to Any Modality Generation

Python 2,199 91 Updated Feb 16, 2025

Combining MMOCR with Segment Anything & Stable Diffusion. Automatically detect, recognize and segment text instances, with several downstream tasks, e.g., Text Removal and Text Inpainting

Python 564 42 Updated Jan 30, 2024