⚔ Medical MLLM is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models

AAAI 2025

Breaking News 🔥🔥!!

🔥💥 2024/12: We are glad to announce that MCM has been accepted to AAAI 2025.
🎉🎉 2024/5: We are excited to officially announce the open-sourcing of MCM.

Abstract

Security concerns related to Large Language Models (LLMs) have been extensively explored; however, the safety implications for Multimodal Large Language Models (MLLMs), particularly in medical contexts (MedMLLMs), remain inadequately addressed. This paper investigates the security vulnerabilities of MedMLLMs, focusing on their deployment in clinical environments where the accuracy and relevance of question-and-answer interactions are crucial for addressing complex medical challenges. We introduce and redefine two attack types: mismatched malicious attack (2M-attack) and optimized mismatched malicious attack (O2M-attack), by integrating existing clinical data with atypical natural phenomena.

Using the comprehensive 3MAD dataset that we developed, which spans a diverse range of medical imaging modalities and adverse medical scenarios, we performed an in-depth analysis and proposed the MCM optimization method. This approach significantly improves the attack success rate against MedMLLMs. Our evaluations, which include white-box attacks on LLaVA-Med and transfer (black-box) attacks on four other SOTA models, reveal that even MedMLLMs designed with advanced security mechanisms remain vulnerable to breaches. This study highlights the critical need for robust security measures to enhance the safety and reliability of open-source MedMLLMs, especially in light of the potential impact of jailbreak attacks and other malicious exploits in clinical applications. Warning: Medical jailbreaking may generate content that includes unverified diagnoses and treatment recommendations. Always consult professional medical advice.

Methodology

The Multimodal Cross-Optimization (MCM) algorithm simultaneously optimizes continuous image inputs (the image with noise) and discrete text tokens (the suffix) to jailbreak multimodal large language models into producing harmful content (the jailbreaking answer). It perturbs the clean input image (the image without noise) with noise and appends a suffix of text tokens to the query, aiming to maximize the likelihood that the model generates a harmful response to a malicious question.
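
The interplay between a continuous variable and a discrete one can be illustrated on a toy objective. The following is a minimal sketch and not the implementation in this repository: it replaces the model with a hypothetical quadratic loss (`toy_loss`), alternates the two updates within each iteration rather than performing them truly simultaneously, uses a plain projected gradient step for the continuous perturbation, and uses an exhaustive single-position search for the discrete suffix. Every name and parameter here (`toy_loss`, `joint_optimize`, `embed`, `eps`, `step`) is an assumption made for illustration.

```python
import numpy as np

def toy_loss(delta, suffix, target_vec, embed):
    """Hypothetical stand-in for a model loss: squared distance to a target."""
    out = delta + embed[suffix].sum(axis=0)
    return float(np.sum((out - target_vec) ** 2))

def joint_optimize(target_vec, embed, suffix_len=3, eps=0.5, step=0.1,
                   iters=50, seed=0):
    """Alternate a projected gradient step on `delta` with a greedy
    single-position token swap on `suffix` (toy illustration only)."""
    rng = np.random.default_rng(seed)
    vocab = embed.shape[0]
    delta = np.zeros_like(target_vec)            # continuous perturbation
    suffix = rng.integers(0, vocab, size=suffix_len)  # discrete token ids
    history = [toy_loss(delta, suffix, target_vec, embed)]

    for _ in range(iters):
        # Continuous update: gradient step, then projection onto the
        # L-infinity ball of radius eps.
        grad = 2.0 * (delta + embed[suffix].sum(axis=0) - target_vec)
        delta = np.clip(delta - step * grad, -eps, eps)

        # Discrete update: pick one suffix position, try every token,
        # and keep the one with the lowest loss.
        pos = rng.integers(0, suffix_len)
        best_tok = suffix[pos]
        best = toy_loss(delta, suffix, target_vec, embed)
        for tok in range(vocab):
            cand = suffix.copy()
            cand[pos] = tok
            loss = toy_loss(delta, cand, target_vec, embed)
            if loss < best:
                best_tok, best = tok, loss
        suffix[pos] = best_tok
        history.append(best)

    return delta, suffix, history
```

In this toy setting the perturbation always stays inside the `eps` bound and the recorded loss does not increase from one iteration to the next. A real multimodal model offers no such guarantee, and gradient-guided candidate selection would replace the exhaustive token search.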

Code and Dataset

Left: Components of images in the 3MAD (9 modalities and 12 body parts). Right: Components of normal prompts in the 3MAD (18 medical tasks or requirements).

Warning

Jailbreaking medical large models may generate content that includes unverified diagnoses and treatment recommendations. Always consult professional medical advice.

Citation

If you find our work helpful, please consider citing the following paper:

@article{huang2024cross,
  title={Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models},
  author={Huang, Xijie and Wang, Xinyuan and Zhang, Hantao and Xi, Jiawen and An, Jingkun and Wang, Hao and Pan, Chengwei},
  journal={arXiv preprint arXiv:2405.20775},
  year={2024}
}

Acknowledgements

We acknowledge all the authors of the employed public datasets, allowing the community to use these valuable resources for research purposes. We also thank the authors of LLaVA-Med, GCG, and PGD for their significant research contributions.

