Stars
[CVPR 2025] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
[AAAI 2023 Oral] CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets
[CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
[CVPR 2025] VideoWorld is a simple generative model that learns purely from unlabeled videos, much like how babies learn by observing their environment.
[RSS 2025] Official implementation of DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning
[WACV 2025 Oral] Transferring Foundation Models for Generalizable Robotic Manipulation
Re-implementation of pi0 vision-language-action (VLA) model from Physical Intelligence
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models