I currently focus on large language models for navigation, including related surveys, multimodal perception, planning and decision making, multi-agent collaboration, and model compression for efficient deployment.
Last Update: 2025/04/08
- [2025] Generative Models in Decision Making: A Survey, arXiv [Paper]
- [2025] A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond, arXiv [Paper]
- [2025] Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives, arXiv [Paper] [Code]
- [2025] The Role of World Models in Shaping Autonomous Driving: A Comprehensive Survey, arXiv [Paper] [Code]
- [2025] A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models, arXiv [Paper]
- [2025] Embodied Intelligence: A Synergy of Morphology, Action, Perception and Learning, ACM Computing Surveys [Paper]
- [2025] Large Language Models for Multi-Robot Systems: A Survey, arXiv [Paper] [Code]
- [2025] Survey on Large Language Model Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods, IEEE TNNLS [Paper]
- [2025] Humanoid Locomotion and Manipulation: Current Progress and Challenges in Control, Planning, and Learning, arXiv [Paper]
- [2025] UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility, arXiv [Paper] [Code]
- [2025] A Survey of World Models for Autonomous Driving, arXiv [Paper]
- [2025] Towards Large Reasoning Models: A Survey on Scaling LLM Reasoning Capabilities, arXiv [Paper]
- [2025] A Survey on Large Language Models with some Insights on their Capabilities and Limitations, arXiv [Paper]
- [2025] A Survey on Embodied Intelligence Systems Based on Large Models (基于大模型的具身智能系统综述), Acta Automatica Sinica (自动化学报) [Paper]
- [2025] Key Issues in Embodied Intelligence Research: Autonomous Perception, Action, and Evolution (具身智能研究的关键问题: 自主感知、行动与进化), Acta Automatica Sinica (自动化学报) [Paper] [Code]
- [2024] Large-Model-Driven Embodied Intelligence: Progress and Challenges (大模型驱动的具身智能: 发展与挑战), Science China (中国科学) [Paper]
- [2024] From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities, arXiv [Paper] [Code]
- [2024] Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models, TMLR [Paper] [Code]
- [2024] Efficient Large Language Models: A Survey, TMLR [Paper] [Code]
- [2024] A Survey on Multimodal Large Language Models for Autonomous Driving, WACV [Paper]
- [2024] Personalization of Large Language Models: A Survey, arXiv [Paper]
- [2024] A Survey on LLM Inference-Time Self-Improvement, arXiv [Paper]
- [2024] Embodied Navigation with Multi-modal Information: A Survey from Tasks to Methodology, Information Fusion [Paper]
- [2024] Recent Advances in Robot Navigation via Large Language Models: A Review, arXiv [Paper]
- [2024] Large Language Models for Robotics: Opportunities, Challenges, and Perspectives, arXiv [Paper]
- [2024] Advances in Embodied Navigation Using Large Language Models: A Survey, arXiv [Paper]
- [2024] Foundation Models in Robotics: Applications, Challenges, and the Future, IJRR [Paper] [Code]
- [2024] A Survey of Large Language Models, arXiv [Paper] [Code]
- [2024] ChatGPT for Robotics: Design Principles and Model Abilities, IEEE Access [Paper]
- [2023] Large Language Models for Robotics: A Survey, arXiv [Paper]
- [2023] LLM4Drive: A Survey of Large Language Models for Autonomous Driving, arXiv [Paper] [Code]
- [2025] Visual-RFT: Visual Reinforcement Fine-Tuning, arXiv [Paper] [Code]
- [2025] ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration, arXiv [Paper] [Code]
- [2025] LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token, arXiv [Paper] [Code]
- [2025] Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling, arXiv [Paper] [Code]
- [2024] OVAL-Prompt: Open-Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance-Grounding, arXiv [Paper]
- [2024] NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models, AAAI [Paper]
- [2024] LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation, arXiv [Paper] [Code]
- [2024] OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments, arXiv [Paper]
- [2023] Chat with the Environment: Interactive Multimodal Perception using Large Language Models, IROS [Paper]
- [2023] VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models, arXiv [Paper]
- [2023] Steve-Eye: Equipping LLM-Based Embodied Agents with Visual Perception in Open Worlds, ICLR [Paper]
- [2023] LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action, CORL [Paper]
- [2022] Flamingo: a Visual Language Model for Few-Shot Learning, NeurIPS [Paper]
- [2021] Open-Vocabulary Object Detection via Vision and Language Knowledge Distillation, arXiv [Paper]
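
Much of the perception work above rests on one primitive: embed image regions and free-form label text into a shared space and rank them by cosine similarity, as in the open-vocabulary detection entry. Below is a minimal sketch of just that scoring step; the two encoder stubs are hypothetical placeholders for, e.g., CLIP-style text and image towers:

```python
import torch
import torch.nn.functional as F

def embed_text(labels: list[str]) -> torch.Tensor:
    """Hypothetical text encoder (e.g., a CLIP text tower) -> (n_labels, d)."""
    raise NotImplementedError

def embed_regions(crops: torch.Tensor) -> torch.Tensor:
    """Hypothetical image encoder; crops (n_regions, 3, H, W) -> (n_regions, d)."""
    raise NotImplementedError

def classify_regions(crops: torch.Tensor, labels: list[str]) -> torch.Tensor:
    # Normalize both embeddings so the dot product is cosine similarity.
    t = F.normalize(embed_text(labels), dim=-1)
    v = F.normalize(embed_regions(crops), dim=-1)
    sim = v @ t.T                     # (n_regions, n_labels) similarity matrix
    return sim.argmax(dim=-1)         # best open-vocabulary label per region
```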
- [2025] NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning, IEEE TPAMI [Paper]
- [2025] FlexVLN: Flexible Adaptation for Diverse Vision-and-Language Navigation Tasks, arXiv [Paper]
- [2025] MapNav: A Novel Memory Representation via Annotated Semantic Maps for VLM-based Vision-and-Language Navigation, arXiv [Paper]
- [2025] NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM, arXiv [Paper]
- [2025] LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs, arXiv [Paper] [Video]
- [2025] Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives, arXiv [Paper] [Video]
- [2025] SD++: Enhancing Standard Definition Maps by Incorporating Road Knowledge using LLMs, arXiv [Paper]
- [2025] FAST: Efficient Action Tokenization for Vision-Language-Action Models, arXiv [Paper] [Video]
- [2025] AdaWM: Adaptive World Model based Planning for Autonomous Driving, arXiv [Paper]
- [2025] Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving, AAAI [Paper]
- [2025] LLM-attacker: Enhancing Closed-loop Adversarial Scenario Generation for Autonomous Driving with Large Language Models, arXiv [Paper] [Video]
- [2024] Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge Graphs, EMNLP [Paper]
- [2024] Mastering Board Games by External and Internal Planning with Language Models, arXiv [Paper]
- [2024] TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation, arXiv [Paper]
- [2024] The One RING: a Robotic Indoor Navigation Generalist, arXiv [Paper] [Video]
- [2024] Asynchronous Large Language Model Enhanced Planner for Autonomous Driving, ECCV [Paper]
- [2024] Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving, arXiv [Paper]
- [2024] LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning, arXiv [Paper]
- [2024] SayNav: Grounding Large Language Models for Dynamic Planning to Navigation in New Environments, ICAPS [Paper]
- [2024] AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers, ICRA [Paper]
- [2023] ProgPrompt: Generating Situated Robot Task Plans using Large Language Models, ICRA [Paper]
- [2023] Text2Motion: from Natural Language Instructions to Feasible Plans, Autonomous Robots [Paper]
- [2023] LLM as A Robotic Brain: Unifying Egocentric Memory and Control, arXiv [Paper]
- [2023] PaLM-E: An Embodied Multimodal Language Model, arXiv [Paper]
- [2022] Do As I Can, Not As I Say: Grounding Language in Robotic Affordances, arXiv [Paper]
- [2022] Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents, ICML [Paper]
- [2021] Learning a Decision Module by Imitating Driver’s Control Behaviors, CORL [Paper]
- [2021] Neuro-Symbolic Program Search for Autonomous Driving Decision Module Design, CORL [Paper]
- [2021] A Lifelong Learning Approach to Mobile Robot Navigation, IEEE RAL [Paper]
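
A pattern shared by several planning entries above (Language Models as Zero-Shot Planners, SayNav, ProgPrompt): prompt an LLM to decompose an instruction into free-form steps, then ground each step to the closest admissible action. A minimal sketch of that loop; `call_llm` is a hypothetical stand-in for any chat-completion client, and stdlib string similarity replaces the learned embedding matching used in the papers:

```python
import difflib

# Admissible low-level actions the agent can actually execute.
ADMISSIBLE_ACTIONS = [
    "walk to the kitchen", "open the fridge", "grab the milk",
    "close the fridge", "walk to the table", "put down the milk",
]

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in any chat-completion client."""
    raise NotImplementedError

def ground_step(step: str) -> str:
    # Map a free-form step to the closest admissible action; the papers
    # use embedding similarity, difflib is a stdlib stand-in.
    return max(ADMISSIBLE_ACTIONS,
               key=lambda a: difflib.SequenceMatcher(None, step.lower(), a).ratio())

def plan(instruction: str) -> list[str]:
    prompt = (f"Task: {instruction}\n"
              "List the steps to complete the task, one per line.")
    steps = [s.strip("- ").strip()
             for s in call_llm(prompt).splitlines() if s.strip()]
    return [ground_step(s) for s in steps]
```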
- [2025] ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model, arXiv [Paper]
- [2024] π0: A Vision-Language-Action Flow Model for General Robot Control, arXiv [Paper] [Video]
- [2024] NaVILA: Legged Robot Vision-Language-Action Model for Navigation, arXiv [Paper] [Video]
- [2024] Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation, arXiv [Paper]
- [2024] GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment, arXiv [Paper]
- [2024] Probabilistically Correct Language-based Multi-Robot Planning using Conformal Prediction, arXiv [Paper]
- [2024] Enhancing the LLM-Based Robot Manipulation Through Human-Robot Collaboration, arXiv [Paper]
- [2024] Scalable Multi-Robot Collaboration with Large Language Models: Centralized or Decentralized Systems?, ICRA [Paper]
- [2024] LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination, AAMAS [Paper]
- [2024] VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View, AAAI [Paper]
- [2024] SRLM: Human-in-Loop Interactive Social Robot Navigation with Large Language Model and Deep Reinforcement Learning, arXiv [Paper]
- [2024] RoCo: Dialectic Multi-Robot Collaboration with Large Language Models, ICRA [Paper]
- [2024] Building Cooperative Embodied Agents Modularly with Large Language Models, ICLR [Paper]
- [2024] Lifelong Robot Learning with Human Assisted Language Planners, ICRA [Paper]
- [2024] MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning, arXiv [Paper]
- [2024] LANCAR: Leveraging Language for Context-Aware Robot Locomotion in Unstructured Environments, IROS [Paper]
- [2023] Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model, arXiv [Paper]
- [2023] NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning, IROS [Paper]
- [2023] Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time Multi-Robot Cooperative Exploration, arXiv [Paper]
- [2023] Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation using Large Language Models, arXiv [Paper]
- [2023] Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach, arXiv [Paper]
- [2023] LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models, arXiv [Paper]
- [2023] ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation, ICML [Paper]
- [2023] Code as Policies: Language Model Programs for Embodied Control, ICRA [Paper]
- [2022] Multi-Agent Embodied Visual Semantic Navigation With Scene Prior Knowledge, IEEE RAL [Paper]
- [2022] Multi-Robot Active Mapping via Neural Bipartite Graph Matching, CVPR [Paper]
- [2022] Learning Efficient Multi-agent Cooperative Visual Exploration, ECCV [Paper]
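
A recurring design axis in the collaboration entries above (e.g., Scalable Multi-Robot Collaboration, Co-NavGPT) is centralized versus decentralized control. A minimal sketch of the centralized variant, where a single LLM call assigns every robot a frontier; `call_llm` is again a hypothetical client stub:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in any chat-completion client."""
    raise NotImplementedError

def assign_goals(robot_states: dict[str, str], frontiers: list[str]) -> dict[str, str]:
    # Centralized coordination: one planner sees every robot's state and
    # emits a full assignment, instead of per-robot decentralized calls.
    prompt = (
        "Robots and their states:\n"
        + "\n".join(f"- {r}: {s}" for r, s in robot_states.items())
        + "\nUnexplored frontiers:\n"
        + "\n".join(f"- {f}" for f in frontiers)
        + "\nReturn a JSON object mapping each robot to one frontier."
    )
    return json.loads(call_llm(prompt))
```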
- [2025] Týr-the-Pruner: Unlocking Accurate 50% Structural Pruning for LLMs via Global Sparsity Distribution Optimization, arXiv [Paper]
- [2025] Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing, ICLR [Paper]
- [2025] Lightweight and Post-Training Structured Pruning for On-Device Large Language Models, arXiv [Paper]
- [2025] FASP: Fast and Accurate Structured Pruning of Large Language Models, arXiv [Paper]
- [2024] FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models, NeurIPS [Paper]
- [2024] Fluctuation-Based Adaptive Structured Pruning for Large Language Models, AAAI [Paper]
- [2024] LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models, ICML [Paper] [Code]
- [2024] SlimGPT: Layer-wise Structured Pruning for Large Language Models, NeurIPS [Paper]
- [2024] Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes, arXiv [Paper]
- [2024] Compact Language Models via Pruning and Knowledge Distillation, arXiv [Paper]
- [2024] A Deeper Look at Depth Pruning of LLMs, ICML [Paper]
- [2024] Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models, arXiv [Paper]
- [2024] Plug-and-Play: An Efficient Post-training Pruning Method for Large Language Models, ICLR [Paper]
- [2024] BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation, arXiv [Paper]
- [2024] ShortGPT: Layers in Large Language Models are More Redundant Than You Expect, arXiv [Paper]
- [2024] NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models, arXiv [Paper]
- [2024] SliceGPT: Compress Large Language Models by Deleting Rows and Columns, ICLR [Paper] [Code]
- [2023] LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery, arXiv [Paper]
- [2023] LLM-Pruner: On the Structural Pruning of Large Language Models, NeurIPS [Paper] [Code]
- [2023] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning, NeurIPS [Paper] [Code]
- [2023] LoRAPrune: Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning, arXiv [Paper]
- [2023] LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation, ICML [Paper] [Code]
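
The structured-pruning entries above remove whole rows, columns, heads, or layers, so the pruned model stays dense and needs no sparse kernels. As a toy illustration of the idea (not any one paper's method), the sketch below drops the lowest-L2-norm output channels of a PyTorch `nn.Linear` and rebuilds a smaller dense layer:

```python
import torch
import torch.nn as nn

def prune_linear_rows(layer: nn.Linear, keep_ratio: float = 0.5):
    """Drop the lowest-L2-norm output channels (rows) of a Linear layer.

    Toy channel-importance criterion; the surveyed methods use richer
    scores (gradients, fluctuations, layer-wise reconstruction, ...).
    Returns the smaller dense layer plus the kept row indices, which the
    *next* layer's input dimension must be sliced to match.
    """
    n_keep = max(1, int(layer.out_features * keep_ratio))
    importance = layer.weight.norm(p=2, dim=1)            # one score per row
    keep = torch.topk(importance, n_keep).indices.sort().values
    pruned = nn.Linear(layer.in_features, n_keep, bias=layer.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(layer.weight[keep])
        if layer.bias is not None:
            pruned.bias.copy_(layer.bias[keep])
    return pruned, keep

# Example: shrink a 1024 -> 4096 projection to 1024 -> 2048.
layer = nn.Linear(1024, 4096)
smaller, kept_rows = prune_linear_rows(layer, keep_ratio=0.5)
```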
- [2025] Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization, ICLR [Paper] [Code]
- [2024] Fast and Effective Weight Update for Pruned Large Language Models, TMLR [Paper] [Code]
- [2024] A Simple and Effective Pruning Approach for Large Language Models, ICLR [Paper] [Code]
- [2024] Pruner-Zero: Evolving Symbolic Pruning Metric From Scratch for Large Language Models, ICML [Paper]
- [2024] MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models, NeurIPS [Paper]
- [2024] Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs, ICLR [Paper]
- [2024] A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models, arXiv [Paper]
- [2023] SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot, ICML [Paper] [Code]
- [2023] One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models, arXiv [Paper]
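
The one-shot entries above (SparseGPT, Wanda from "A Simple and Effective Pruning Approach", Pruner-Zero) instead zero individual weights using a saliency score computed from a small calibration set, with no retraining. Below is a minimal Wanda-style sketch, scoring each weight by |W| times the per-input-channel activation norm and pruning each row to 50% sparsity; an illustrative simplification, not the papers' full procedure:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def wanda_style_prune(layer: nn.Linear, calib_x: torch.Tensor, sparsity: float = 0.5):
    """Zero the lowest-scoring weights in-place, scored by |W| * ||x||.

    calib_x: (n_samples, in_features) calibration activations.
    Each row (output channel) is pruned independently, as in Wanda.
    """
    act_norm = calib_x.norm(p=2, dim=0)            # norm per input channel
    score = layer.weight.abs() * act_norm          # broadcasts across rows
    n_prune = int(layer.in_features * sparsity)
    # Indices of the lowest-scoring weights within each row.
    drop = torch.topk(score, n_prune, dim=1, largest=False).indices
    mask = torch.ones_like(layer.weight)
    mask.scatter_(1, drop, 0.0)
    layer.weight.mul_(mask)

# Example with random weights and a random calibration batch.
layer = nn.Linear(512, 512)
wanda_style_prune(layer, calib_x=torch.randn(128, 512), sparsity=0.5)
```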