Exploring the Potential of Multi-Modal AI for Driving Hazard Prediction

DHPR: Driving Hazard Prediction and Reasoning
Paper | Huggingface Dataset | Download Assets | Dataset Demo | Evaluation Server | Inference Demo

Table of Contents
  1. Introduction
  2. Demo
  3. Data Files
  4. Evaluation
  5. Leaderboard
  6. License
  7. Citation

Introduction

This repository contains details about the DHPR (Driving Hazard Prediction and Reasoning) dataset. The DHPR dataset was introduced to address the problem of predicting hazards that drivers may encounter while driving. We formulate the task as visual abductive reasoning from a single input image captured by a car dashcam.
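Schematically, each instance pairs a single dashcam image and the car's speed at capture time with a natural-language hazard description to be inferred. A minimal sketch of this input/output structure follows; the field names and units are illustrative assumptions, not the dataset's actual schema:

```python
from dataclasses import dataclass

@dataclass
class DHPRInstance:
    """Hypothetical view of one task instance, per the description above."""
    image_path: str          # single dashcam frame (input)
    speed_kmh: float         # car speed when the frame was captured (input; unit assumed)
    hazard_description: str  # natural-language hazard to be inferred (target)
```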

The dataset consists of:

  • 14,975 street scenes
  • Car speeds
  • Hazard descriptions
  • Visual entity descriptions (Oracle Scenario Only)

Demo

Please see the Demo for more details about the dataset.

Data Files

The annotation files are organized as follows:

```
annotation_files
├── anno_train.json
└── anno_val.json
```
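A minimal sketch for loading and inspecting an annotation file. The JSON schema is not documented in this section, so the snippet only prints the top-level structure rather than assuming any field names:

```python
import json

# Path follows the tree shown above.
with open("annotation_files/anno_train.json", encoding="utf-8") as f:
    annotations = json.load(f)

# Inspect the structure before relying on any particular schema.
if isinstance(annotations, dict):
    print("top-level keys:", list(annotations)[:10])
else:
    print("number of entries:", len(annotations))
    print("first entry:", annotations[0])
```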

Evaluation

To be updated.

Leaderboard

To submit results, please upload your result file (submission instructions to be updated).

Leaderboard of results for the image retrieval (IR) and text retrieval (TR) tasks and the generation task on the DHPR test split. The retrieval tasks are evaluated by average rank and Recall@1 (R@1). The generation task is evaluated using BLEU-4 (B4), ROUGE (R), CIDEr (C), SPIDEr (S), and the GPT-4 score. For all metrics except rank, higher values indicate better performance. For GPT-4V, we perform a zero-shot evaluation on the test split.

| Model     | Visual Encoder | IR Rank | IR R@1 | TR Rank | TR R@1 | Text Decoder | B4   | R    | C    | S    | GPT-4 |
|-----------|----------------|---------|--------|---------|--------|--------------|------|------|------|------|-------|
| CLIP      | ViT-L/14       | 10.8    | 24.1%  | 10.9    | 24.8%  | -            | -    | -    | -    | -    | -     |
| BLIP      | ViT-B/16       | 15.3    | 9.3%   | 15.9    | 8.1%   | BERT         | 12.6 | 32.9 | 34.9 | 30.3 | 39.3  |
| BLIP2     | ViT-g/14       | 11.5    | 19.1%  | 12.1    | 19.8%  | OPT-6.7B     | 18.7 | 42.7 | 38.9 | 35.4 | 50.5  |
| LLaVA-1.5 | ViT-L/14       | -       | -      | -       | -      | LLaMA-2 7B   | 14.9 | 36.9 | 34.5 | 30.9 | 56.2  |
| GPT-4V    | -              | -       | -      | -       | -      | GPT-4        | 0.3  | 19.0 | 0.9  | 7.2  | 50.0  |
| Ours      | ViT-L/14       | 10.2    | 24.9%  | 10.3    | 26.3%  | LLaMA-2 7B   | 16.9 | 39.5 | 49.1 | 39.6 | 58.5  |
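As a reference for the retrieval metrics, here is a minimal sketch of how average rank and Recall@1 can be computed from an image-text similarity matrix. This illustrates the standard definitions and is not the repository's official evaluation code:

```python
import numpy as np

def retrieval_metrics(sim: np.ndarray) -> tuple[float, float]:
    """Average rank and Recall@1 for text retrieval given images.

    sim[i, j] is the similarity between image i and text j; the matched
    pair for image i is text i (standard retrieval-benchmark convention).
    """
    n = sim.shape[0]
    order = np.argsort(-sim, axis=1)  # best-matching texts first
    # 1-based rank of the ground-truth text for each image.
    ranks = np.argmax(order == np.arange(n)[:, None], axis=1) + 1
    avg_rank = float(ranks.mean())
    recall_at_1 = float((ranks == 1).mean())  # fraction ranked first
    return avg_rank, recall_at_1

# Example with a random similarity matrix (illustration only).
rng = np.random.default_rng(0)
avg_rank, r1 = retrieval_metrics(rng.standard_normal((100, 100)))
print(f"avg rank: {avg_rank:.1f}, R@1: {r1:.1%}")
```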

License

The dataset used in this paper is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

Citation

```bibtex
@article{10568360,
  author={Charoenpitaks, Korawat and Nguyen, Van-Quang and Suganuma, Masanori and Takahashi, Masahiro and Niihara, Ryoma and Okatani, Takayuki},
  journal={IEEE Transactions on Intelligent Vehicles},
  title={Exploring the Potential of Multi-Modal AI for Driving Hazard Prediction},
  year={2024},
  volume={},
  number={},
  pages={1-11},
  keywords={Hazards;Cognition;Videos;Automobiles;Accidents;Task analysis;Natural languages;Vision;Language;Reasoning;Traffic Accident Anticipation},
  doi={10.1109/TIV.2024.3417353}
}
```

