This repository provides the official PyTorch implementation of the following paper:
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
Cong Chen*1,2, Mingyu Liu*1, Chenchen Jing3, Yizhou Zhou2, Fengyun Rao2, Hao Chen1, Bo Zhang1, Chunhua Shen1,3
> 1Zhejiang University, China, 2WeChat, Tencent, 3Zhejiang University of Technology
> *Equal Contribution
This paper addresses the challenge of hallucinations in Multimodal Large Language Models (MLLMs), particularly for dense image captioning tasks. We first identify the lack of a metric that finely measures caption quality at the concept level, and introduce HalFscore, a novel metric built upon the language graph that evaluates both the accuracy and completeness of dense captions at a granular level. We further identify the root cause of hallucination as the model's over-reliance on its language prior. To address this, we propose PerturboLLaVA, which reduces the model's reliance on the language prior by incorporating adversarially perturbed text during training. This method enhances the model's focus on visual inputs, effectively reducing hallucinations and producing accurate, image-grounded descriptions without incurring additional computational overhead. PerturboLLaVA significantly improves the fidelity of generated captions, outperforming existing approaches in handling multimodal hallucinations and achieving improved performance across general multimodal benchmarks.
The diagram of computing HalFscore: we construct a language graph to model both the concepts and their relationships in each caption. Comparing the two graphs then identifies the hallucinations, omissions, and matchings between the two sets of concepts.

To mitigate the over-reliance on language priors in multimodal models, we introduce a novel training framework that injects adaptive, context-specific perturbations into the textual inputs during training. This approach simulates the effect of language priors and forces the model to ground its responses in visual data rather than textual biases.

Our experiments follow the settings of LLaVA 1.5, reproduced with Xtuner. We focus on the 160k image-understanding samples in the LLaVA 1.5 SFT dataset and use GPT-4 to construct corresponding perturbation texts, which are then inserted into the original conversation data for perturbation training.
The script containing the GPT prompt used to construct the perturbation data is augmentation/gpt_prompt.py.
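For reference, here is a minimal sketch of how such a prompt might be used to generate a perturbation text for one caption via an OpenAI-style chat API. The prompt wording, model name, and function names below are our own illustrative assumptions and do not reproduce the released augmentation/gpt_prompt.py.

```python
# Sketch only: prompt text, model name, and helper names are assumptions,
# not the released augmentation/gpt_prompt.py.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PERTURB_PROMPT = (
    "Given the following image description, write a short passage that sounds "
    "plausible from language priors alone but is unsupported by or contradicts "
    "the description:\n\n{caption}"
)

def generate_perturbation(caption: str, model: str = "gpt-4") -> str:
    """Ask the LLM for a context-specific perturbation text for one sample."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PERTURB_PROMPT.format(caption=caption)}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(generate_perturbation("A man in a red jacket is skiing down a snowy slope."))
```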
To explore the impact of the perturbation degree on model training, we design four different methods for inserting perturbation texts, implemented in augmentation/combine.py. For each insertion method, to prevent the model from overfitting to a fixed pattern, we provide multiple system prompts in augmentation/system_prompts.py and randomly select one each time.
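To make this step concrete, the sketch below shows one possible insertion scheme: prepend the perturbation text, together with a randomly chosen system prompt, to the first human turn of a LLaVA-style conversation sample. The sample format, prompt strings, and function names are illustrative assumptions and do not reproduce augmentation/combine.py or augmentation/system_prompts.py.

```python
import random

# Illustrative system prompts; the real ones live in augmentation/system_prompts.py.
SYSTEM_PROMPTS = [
    "The following context may be unrelated to the image; answer based on the image only.",
    "Ignore any misleading text below and describe what is actually shown in the image.",
]

def insert_perturbation(sample: dict, perturbation: str) -> dict:
    """Prepend a perturbation passage (plus a random system prompt) to the first
    human turn of a LLaVA-style sample. A sketch of one possible insertion scheme,
    not the released implementation."""
    system_prompt = random.choice(SYSTEM_PROMPTS)
    first_turn = sample["conversations"][0]  # {"from": "human", "value": "<image>\n..."}
    first_turn["value"] = (
        "<image>\n" + system_prompt + "\n" + perturbation + "\n"
        + first_turn["value"].replace("<image>\n", "")
    )
    return sample

# Example usage with a toy LLaVA-format sample.
sample = {
    "image": "coco/000000123.jpg",
    "conversations": [
        {"from": "human", "value": "<image>\nDescribe the image in detail."},
        {"from": "gpt", "value": "A man in a red jacket is skiing down a snowy slope."},
    ],
}
augmented = insert_perturbation(sample, "The beach is crowded with surfers enjoying the sun.")
print(augmented["conversations"][0]["value"])
```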
You can download the images to the directory PerturboLLaVA/HalFScore/images from This Link.
Run the following script to extract the tuples and compute the final HalFScore:
```bash
bash PerturboLLaVA/HalFScore/results/llava/best_150k_final_v3/eval.sh
```
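For intuition, HalFscore compares the concepts of the generated caption against those of the reference: concepts only in the generated graph count as hallucinations, concepts only in the reference graph count as omissions, and shared concepts count as matches. The following is a simplified, set-based sketch of the resulting precision/recall/F1 computation over already-extracted concepts; the names and the set-based simplification are ours, while the released eval.sh pipeline operates on the full language graph.

```python
def halfscore(pred_concepts: set, ref_concepts: set) -> float:
    """F1-style score over matched / hallucinated / omitted concepts.
    A simplified sketch of the idea behind HalFScore, not the released code."""
    matched = pred_concepts & ref_concepts
    hallucinated = pred_concepts - ref_concepts
    omitted = ref_concepts - pred_concepts

    precision = len(matched) / max(len(matched) + len(hallucinated), 1)
    recall = len(matched) / max(len(matched) + len(omitted), 1)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: one hallucinated concept ("dog") and one omitted concept ("red jacket").
print(halfscore({"man", "ski", "snow", "dog"}, {"man", "ski", "snow", "red jacket"}))
```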
We based our training and evaluation on the following codebases. Thanks for their impressive work!
- Xtuner: This is our LLaVA 1.5 reproduction codebase and the codebase for subsequent perturbative training experiments.
- VLMEval: Our evaluation codebase for MMBench, SEEDBench, and HallusionBench.
- OPERA: Our evaluation codebase for CHAIR.
- VCD: We modified the original VCD code to support beam search.
- RLAIF-V
@article{chen2025perturbollava,
title={PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training},
author={Chen, Cong and Liu, Mingyu and Jing, Chenchen and Zhou, Yizhou and Rao, Fengyun and Chen, Hao and Zhang, Bo and Shen, Chunhua},
journal={arXiv preprint arXiv:2503.06486},
year={2025}
}