- SHANGHAI AILAB
- China
- UTC -04:00
- https://yding25.github.io/
Stars
🔥 SpatialVLA: a spatial-enhanced vision-language-action model trained on 1.1 million real robot episodes. Accepted at RSS 2025.
The official implementation of the paper "Exploring the Potential of Encoder-free Architectures in 3D LMMs"
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
A Scalable and Hardware-Independent Universal Manipulation Interface
Star-UU-Wang / query_anygrasp
Forked from graspnet/anygrasp_sdk
A simple try for anygrasp with text queries
Q-attention (within the ARM system) and coarse-to-fine Q-attention (within the C2F-ARM system).
Robotics Toolbox for Python
DORA (Dataflow-Oriented Robotic Architecture) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low latency, composable, and distributed dat…
Learning-based locomotion control from OpenRobotLab, including Hybrid Internal Model & H-Infinity Locomotion Control
Mobile manipulation research tools for roboticists
Dive into LLMs (动手学大模型): a series of hands-on programming tutorials
DROID Policy Learning and Evaluation
Using RGB Image as Visual Input for Mapless Robot Navigation
Generating Robotic Simulation Tasks via Large Language Models
CLIPort: What and Where Pathways for Robotic Manipulation
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
OpenMMLab Detection Toolbox and Benchmark
Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer https://arxiv.org/abs/2404.05695
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
Codebase for the BestMan Mobile Manipulator Platform
PDDLStream: Integrating Symbolic Planners and Blackbox Samplers
Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation
[NeurIPS 2023] We use large language models as a commonsense world model and heuristic policy within Monte Carlo Tree Search, enabling better-reasoned decision-making for daily task planning problems.
[CVPR 2022] "MonoScene: Monocular 3D Semantic Scene Completion": 3D Semantic Occupancy Prediction from a single image
Mobile manipulation in Habitat