AlfredXiangWu (Alfred Xiang Wu) / Starred · GitHub

Solve Visual Understanding with Reinforced VLMs

Python 5,041 309 Updated May 11, 2025

SpatialLM: Large Language Model for Spatial Understanding

Python 3,217 250 Updated Mar 28, 2025

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 13,901 1,700 Updated Jun 2, 2025

OpenVLA: An open-source vision-language-action model for robotic manipulation.

Python 2,901 373 Updated Mar 23, 2025

Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥

Python 39,848 3,148 Updated Jun 2, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 48,709 7,732 Updated Jun 2, 2025

ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents

Python 479 36 Updated Mar 20, 2025

Illumination Drawing Tools for Text-to-Image Diffusion Models

760 120 Updated May 4, 2025

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 319 20 Updated May 27, 2025

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,129 72 Updated Oct 21, 2024

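Many container images are hosted overseas, such as those on gcr. Downloading them from within China is slow and needs acceleration. Dedicated to providing a stable, reliable, and secure container image service that connects the whole world.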

Shell 10,243 1,209 Updated Jun 2, 2025

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,310 169 Updated Mar 28, 2025

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

559 20 Updated May 8, 2025

This is a toolbox repository to help evaluate various methods that perform image matching from a pair of images.

Jupyter Notebook 568 83 Updated Apr 29, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 566 41 Updated May 8, 2024

Effortless data labeling with AI support from Segment Anything and other awesome models.

Python 5,660 629 Updated May 30, 2025

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models

Python 635 30 Updated Dec 23, 2024

✨✨Latest Advances on Multimodal Large Language Models

15,420 998 Updated May 30, 2025

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 50,314 5,908 Updated Sep 18, 2024

A curated list of awesome header-only C++ libraries

3,784 242 Updated Jul 15, 2024

OpenXRLab Structure-from-Motion Toolbox and Benchmark

C++ 208 26 Updated Jul 31, 2024

An awesome PyTorch NeRF library

Python 1,282 105 Updated Jul 23, 2024

✍️ AI powered documentation writer

TypeScript 2,972 141 Updated Feb 10, 2025

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

Python 6,155 662 Updated Apr 20, 2025

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

C++ 2,390 266 Updated May 12, 2025

CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification (ICCV2021)

Python 48 13 Updated Aug 12, 2021

Header-only C++/python library for fast approximate nearest neighbors

C++ 4,718 711 Updated Apr 20, 2025

[TPAMI 2021] DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition

Python 77 16 Updated Nov 13, 2023

TF-NAS: Rethinking Three Search Freedoms of Latency-Constrained Differentiable Neural Architecture Search (ECCV2020)

Python 76 11 Updated May 23, 2021