Zhejiang University
Stars
[CVPR 2024] The official implementation for "SemCity: Semantic Scene Generation with Triplane Diffusion"
[ICLR 2024 spotlight] Official implementation of "InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior".
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
Python code implementing LLM4Teach, a policy distillation approach for teaching reinforcement learning agents with a large language model
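[Three Years of Interviews, Five Years of Mock Exams] An interview guide for AIGC algorithm engineers. Covers interview and written-test experience and practical knowledge from across the AI industry, including AIGC, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, embodied AI, the metaverse, and AGI.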
[NeurIPS 2024] DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization
[CVPR 2025] WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments
[ICRA 2025] Official implementation of Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs
Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation. Project website: http://moma-llm.cs.uni-freiburg.de
GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning
Public repository for Temporal Scene-Object Graph Learning for Object Navigation
Repository for ICRA'24 Paper "Optimal Scene Graph Planning with Large Language Model Guidance"
NaviFormer: A Spatio-Temporal Context-Aware Transformer for Object Navigation (AAAI25)
Code of the paper "Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation"
ICRA2025: OpenGS-SLAM: Open-Set Dense Semantic SLAM with 3D Gaussian Splatting for Object-Level Scene Understanding
Official implementation of “4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models” (CVPR 2025)
A new zero-shot framework for exploring unknown environments and searching for language-described targets, based on a large vision-language model.
OpenSPG is a Knowledge Graph Engine developed by Ant Group in collaboration with OpenKG, based on the SPG (Semantic-enhanced Programmable Graph) framework. Core Capabilities: 1) domain model constr…
A collection of AWESOME things about Graph-Related LLMs.
[CVPR 2025] UniGoal: Towards Universal Zero-shot Goal-oriented Navigation
[ICCV'23] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models
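This is a book manuscript on SLAM that aims to clearly introduce the geometric and deep learning methods used in SLAM systems. The finished manuscript is expected to run to about 200 pages, and the code for each chapter will also be compiled.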
Situational Awareness Database for Instruct-Tuning (SAD-Instruct)