ORION: A Holistic End-to-End Autonomous Driving Framework
by Vision-Language Instructed Action Generation
Haoyu Fu1*, Diankun Zhang2*, Zongchuang Zhao1*, Jianfeng Cui2, Dingkang Liang1†,
Chong Zhang2, Dingyuan Zhang1, Hongwei Xie2†, Bing Wang2, Xiang Bai1
1 Huazhong University of Science & Technology, 2 Xiaomi EV
(*) Equal contribution. (†) Project leader.
End-to-end (E2E) autonomous driving methods still struggle to make correct decisions in interactive closed-loop evaluation due to limited causal reasoning capability. Current methods attempt to leverage the powerful understanding and reasoning abilities of Vision-Language Models (VLMs) to resolve this dilemma. However, few VLM-based E2E methods perform well in closed-loop evaluation, owing to the gap between the semantic reasoning space and the purely numerical trajectory output of the action space. To tackle this issue, we propose ORION, a hOlistic E2E autonomous dRiving framework by vIsion-language instructed actiON generation. ORION uniquely combines a QT-Former to aggregate long-term history context, a Large Language Model (LLM) for driving-scenario reasoning, and a generative planner for precise trajectory prediction. ORION further aligns the reasoning space and the action space to enable unified E2E optimization for both visual question-answering (VQA) and planning tasks. Our method achieves an impressive closed-loop performance of 77.74 Driving Score (DS) and 54.62% Success Rate (SR) on the challenging Bench2Drive benchmark, outperforming state-of-the-art (SOTA) methods by a large margin of 14.28 DS and 19.61% SR.
[2025/04/10]
ORION inference code and checkpoint release.
[2025/03/26]
ArXiv paper release.
- ORION Inference Framework
- Open-loop Evaluation
- Closed-loop Evaluation
- ORION Checkpoint
- Chat-B2D Dataset
- ORION Training Framework
git clone https://github.com/xiaomi-mlab/Orion.git
cd Orion
conda create -n orion python=3.8 -y
conda activate orion
pip install torch==2.4.1+cu118 torchvision==0.19.1+cu118 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu118
pip install -v -e .
pip install -r requirements.txt
You can refer to here to prepare the Bench2Drive dataset.
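If you keep the prepared data inside the repository, one possible layout is sketched below. The directory names are assumptions (not verified against the configs), so follow the linked guide for the authoritative structure.

```bash
# Hypothetical layout sketch: symlink the prepared Bench2Drive data into the repo root.
# The target directory name is an assumption; the linked guide is authoritative.
cd /path/to/Orion
mkdir -p data
ln -s /path/to/bench2drive ./data/bench2drive
```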
ORION uses the pretrained 2D LLM weights and the vision encoder + projector weights provided by OmniDrive.
cd /path/to/OmniDrive
mkdir ckpts
The vision encoder + projector weights are extracted from ckpts/pretrain_qformer/, which is pretrained on LLaVA data.
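One way to make these weights visible from the ORION repository is sketched below. The ORION-side ckpts/ location and the symlink layout are assumptions; adapt the paths to wherever your configs expect the pretrained weights.

```bash
# Minimal sketch (paths are assumptions, not an official layout):
# expose the OmniDrive pretrain_qformer weights inside the ORION repo.
cd /path/to/Orion
mkdir -p ckpts
ln -s /path/to/OmniDrive/ckpts/pretrain_qformer ./ckpts/pretrain_qformer
```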
You can perform an open-loop evaluation of ORION with the following command:
./adzoo/orion/orion_dist_eval.sh adzoo/orion/configs/orion_stage3.py [--PATH_CHECKPOINTS] 1
You can also perform chain-of-thought (CoT) inference with ORION using the following command (this might be quite slow):
./adzoo/orion/orion_dist_eval.sh adzoo/orion/configs/orion_stage3_cot.py [--PATH_CHECKPOINTS] 1
We recommend running ORION inference on an NVIDIA A100 or another GPU with more than 32 GB of memory (FP32 inference is the default).
ORION can also run FP16 inference with almost the same performance; for FP16, we recommend a GPU with more than 17 GB of memory:
./adzoo/orion/orion_dist_eval.sh adzoo/orion/configs/orion_stage3_fp16.py [--PATH_CHECKPOINTS] 1
You can refer to here to clone the Bench2Drive evaluation tools and prepare CARLA for them.
Follow here to use the Bench2Drive evaluation tools.
Note that you should first verify the correctness of the team agent; to do so, set GPU_RANK, TEAM_AGENT, and TEAM_CONFIG in the evaluation scripts.
For closed-loop evaluation, you can set:
TEAM_CONFIG=adzoo/orion/configs/orion_stage3_agent.py+[CHECKPOINT_PATH]
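For reference, the relevant variables in the Bench2Drive evaluation script might look like the sketch below. GPU_RANK and the TEAM_AGENT path are placeholder assumptions (use the agent file provided with the evaluation tools), and the checkpoint path is whichever checkpoint you downloaded.

```bash
# Sketch of the variables to set in the Bench2Drive eval script.
# GPU_RANK and TEAM_AGENT are placeholder assumptions; keep your own checkpoint path.
GPU_RANK=0
TEAM_AGENT=/path/to/orion_team_agent.py
TEAM_CONFIG=adzoo/orion/configs/orion_stage3_agent.py+[CHECKPOINT_PATH]
```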
The results of UniAD and VAD are taken from the official Bench2DriveZoo results.
Method | L2 (m) @ 2s | Driving Score | Success Rate (%) | Config | Download | Eval Json |
---|---|---|---|---|---|---|
UniAD-Tiny | 0.80 | 40.73 | 13.18 | config | Hugging Face/Baidu Cloud | Json |
UniAD-Base | 0.73 | 45.81 | 16.36 | config | Hugging Face/Baidu Cloud | Json |
VAD | 0.91 | 42.35 | 15.00 | config | Hugging Face/Baidu Cloud | Json |
ORION | 0.68 | 77.74 | 54.62 | config | Hugging Face | Json |
We provide visualization videos and a qualitative analysis of ORION, compared with TCP-traj, UniAD-Base, and VAD-Base, here.
If this work is helpful for your research, please consider citing:
@article{fu2025orion,
title={ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation},
author={Haoyu Fu and Diankun Zhang and Zongchuang Zhao and Jianfeng Cui and Dingkang Liang and Chong Zhang and Dingyuan Zhang and Hongwei Xie and Bing Wang and Xiang Bai},
journal={arXiv preprint arXiv:2503.19755},
year={2025}
}