wondervictor (Tianheng Cheng) / Starred · GitHub
🤡
coding

Highlights

  • Pro

Organizations

@hustvl @msra-alumni @HRNet @TencentARC

Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing"

Python 42 2 Updated Jun 24, 2025
Python 540 21 Updated Jun 23, 2025

Dream 7B, a large diffusion language model

Python 777 34 Updated Jun 18, 2025

Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning

Python 43 2 Updated Jun 10, 2025

Official repository for the paper "Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers"

Python 11 Updated Jun 3, 2025

A version of verl to support tool use

Python 252 15 Updated Jun 18, 2025

Vision Language Models are Biased

Python 44 Updated Jun 20, 2025

procedural reasoning datasets

Python 878 69 Updated Jun 24, 2025

Code release for paper "Test-Time Training Done Right"

Python 159 8 Updated Jun 19, 2025

Open-source Multi-agent Poster Generation from Papers

Python 2,175 118 Updated Jun 17, 2025

Efficient triton implementation of Native Sparse Attention.

Python 169 12 Updated May 23, 2025

Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?

Python 52 Updated Jun 3, 2025

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Python 1,163 94 Updated Mar 11, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 14,912 2,953 Updated Jun 25, 2025

MMaDA - Open-Sourced Multimodal Large Diffusion Language Models

Python 1,126 52 Updated Jun 13, 2025

Open-source unified multimodal model

Python 4,299 360 Updated Jun 17, 2025

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Python 1,024 98 Updated Jun 12, 2025

PDF textbooks for all primary, middle, and high school levels, and for university.

Roff 41,053 9,104 Updated May 18, 2025

Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.

Jupyter Notebook 1,257 47 Updated Jun 14, 2025

DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.

TypeScript 14,246 1,706 Updated Jun 19, 2025

TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos

Python 50 2 Updated Jun 19, 2025

[CVPR 2025] Official implementation for "Empowering LLMs to Understand and Generate Complex Vector Graphics" https://arxiv.org/abs/2412.11102

Python 527 5 Updated May 22, 2025

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 1,468 62 Updated Jun 5, 2025

Understand and test language model architectures on synthetic tasks.

Python 218 35 Updated Jun 8, 2025

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."

Python 1,823 136 Updated Mar 13, 2025

A curated collection of resources, tools, and frameworks for developing GUI Agents.

76 2 Updated Jun 20, 2025
10 Updated Mar 6, 2025

A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training

Python 385 20 Updated Jun 24, 2025