Evaluating Robustness of Deep Neural Networks for Automated Cancer Detection in Endoscopy

This repository contains code used to create the robustness test set used in the following publications:

Tim J.M. Jaspers et al. - Investigating the Impact of Image Quality on Endoscopic AI Model Performance (Second Workshop on Applications of Medical AI (AMAI) - Satellite Event MICCAI 2023)
Tim J.M. Jaspers et al. - Robustness evaluation of deep neural networks for endoscopic image analysis: insights and strategies (Medical Image Analysis)

Abstract

Computer-aided detection and diagnosis systems (CADe/CADx) in endoscopy are commonly trained using high-quality imagery, which is not representative for the heterogeneous input typically encountered in clinical practice. In endoscopy, the image quality heavily relies on both the skills and experience of the endoscopist and the specifications of the system used for screening. Factors such as poor illumination, motion blur, and specific post-processing settings can significantly alter the quality and general appearance of these images. This so-called domain gap between the data used for de- veloping the system and the data it encounters after deployment, and the impact it has on the performance of deep neural networks (DNNs) supportive endoscopic CAD systems remains largely unexplored. As many of such systems, for e.g. polyp detection, are already being rolled out in clinical practice, this poses severe patient risks in particularly community hospitals, where both the imaging equipment and experience are subject to considerable variation. Therefore, this study aims to evaluate the impact of this domain gap on the clinical performance of CADe/CADx for various endoscopic applications. For this, we leverage two publicly available data sets (KVASIR-SEG and GIANA) and two in-house data sets. We investigate the performance of commonly-used DNN architectures under synthetic, clinically calibrated image degradations and on a prospectively collected dataset including 342 endoscopic images of lower subjective quality. Additionally, we assess the influence of DNN architecture and complexity, data augmentation, and pretraining techniques for improved robustness. The results reveal a considerable decline in performance of 11.6% (±1.5) as compared to the reference, within the clinically calibrated boundaries of image degradations. Nevertheless, employing more advanced DNN architectures and self-supervised in-domain pre-training effectively mitigate this drop to 7.7% (±2.03). Additionally, these enhancements yield the highest performance on the manually collected test set including images with lower subjective quality. By comprehensively assessing the robustness of popular DNN architectures and training strategies across multiple datasets, this study provides valuable in- sights into their performance and limitations for endoscopic applications. The findings highlight the importance of including robustness evaluation when developing DNNs for endoscopy applications and propose strategies to mitigate performance loss.

Installation

To clone the repository, use the following command:

git clone https://github.com/TimJaspers0801/Robustness.git

To install the required packages, navigate to the root directory of the repository and run the following command:

pip install -r requirements.txt

This will install all of the necessary packages listed in the requirements.txt file. Please also make sure you install the ImageMagicWand library: https://imagemagick.org/script/download.php

Create Robustness test set

The purpose of the robustness test set is created to evaluate endoscopic models using more heterogeneous data by incorporating the aforementioned corruptions. In the original paper, we included corruptions up to severity level 5, as they are clinically calibrated and still realistic at that level. The paper revealed a performance drop of up to 14% on the robustness test set.

An other option could be to only evaluated upto severity level 2, which represent the amount of image degradation expected in 'expert level' datasets. Alternatively, for those seeking to assess robustness in extreme scenarios, the evaluation could extend to severity levels 8, 9, and 10.

To generate the Robustness test set, use the following command:

python create_robustness_set.py 'path/to/testset/images' path_masks='path/to/testset/Masks'  max_level=5, min_level=1 nb_iterations=5 include_compression=True

include the paths to masks if there are present. nb_iterations denotes the number of iterations the original test set is looped over. In the original paper, the test set was corrupted a total of 5 times. After waiting for a while the corrupted images can be found in the 'robustness test set' folder.

Fig 4. Random examples included in the robustness test set.

Citation

If you find our work useful in your research please consider citing our paper:

@inproceedings{IQimpact2024,
author = {Jaspers, Tim J. M. and Boers, Tim G. W. and Kusters, Carolus H. J. and Jong, Martijn R. and Jukema, Jelmer B. and de Groof, Albert J. and Bergman, Jacques J. and de With, Peter H. N. and van der Sommen, Fons},
title = {Investigating the Impact of Image Quality on Endoscopic AI Model Performance},
year = {2023},
isbn = {978-3-031-47075-2},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
url = {https://doi.org/10.1007/978-3-031-47076-9_4},
doi = {10.1007/978-3-031-47076-9_4},
booktitle = {Applications of Medical Artificial Intelligence: Second International Workshop, AMAI 2023, Held in Conjunction with MICCAI 2023, Vancouver, BC, Canada, October 8, 2023, Proceedings},
pages = {32–41},
numpages = {10},
keywords = {Image degradation, Robustness, DNN, Endoscopy},
location = {Vancouver, BC, Canada}
}

@article{Robustness2024,
title = {Robustness evaluation of deep neural networks for endoscopic image analysis: Insights and strategies},
journal = {Medical Image Analysis},
volume = {94},
pages = {103157},
year = {2024},
issn = {1361-8415},
doi = {https://doi.org/10.1016/j.media.2024.103157},
url = {https://www.sciencedirect.com/science/article/pii/S1361841524000823},
author = {Tim J.M. Jaspers and Tim G.W. Boers and Carolus H.J. Kusters and Martijn R. Jong and Jelmer B. Jukema and Albert J. {de Groof} and Jacques J. Bergman and Peter H.N. {de With} and Fons {van der Sommen}},
keywords = {Deep learning, Endoscopy, Robustness, Image degradation, Image quality},
}

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
Images		Images
JPG_image		JPG_image
Public_datasets_split		Public_datasets_split
Robustness test set		Robustness test set
.gitignore		.gitignore
README.md		README.md
create_robustness_set.py		create_robustness_set.py
global_corruptions.py		global_corruptions.py
local_corruptions.py		local_corruptions.py
requirements.txt		requirements.txt
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Evaluating Robustness of Deep Neural Networks for Automated Cancer Detection in Endoscopy

Abstract

Installation

Create Robustness test set

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

TimJaspers0801/Robustness

Folders and files

Latest commit

History

Repository files navigation

Evaluating Robustness of Deep Neural Networks for Automated Cancer Detection in Endoscopy

Abstract

Installation

Create Robustness test set

Citation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages