The University of Hong Kong
Stars
TAPIP3D: Tracking Any Point in Persistent 3D Geometry
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.
Official repository for "AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos" (CVPR 2025)
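Clash traffic-routing rules, focused on routing AI services, ByteDance's overseas AI services, Web3 applications, education apps, download nodes commonly used by developers, and similar services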
[CVPR 2025 Highlight] MonSter: Marry Monodepth to Stereo Unleashes Power
Democratizing Reinforcement Learning for LLMs
Official PyTorch Implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video” (ECCV 2024)
[SIGGRAPH 2025] One Model to Rig Them All: Diverse Skeleton Rigging with UniRig
Photorealistic Synthetic Dataset for Indoor Scene Understanding
Solve Visual Understanding with Reinforced VLMs
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Script for generating visual multiple-choice questions from a given image with a few panels, each labeled with a letter, in addition to text. Test script for interfacing with GPT, LLaMA, Gemini, Claude visual qu…
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
[CVPR 2025] Code for Segment Any Motion in Videos
CycleResearcher: Improving Automated Research via Automated Review
10 Lessons to Get Started Building AI Agents
Aether: Geometric-Aware Unified World Modeling
HoGS: Unified Near and Far Object Reconstruction via Homogeneous Gaussian Splatting
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
SpatialLM: Large Language Model for Spatial Understanding
WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes