Part-Prototype Models in Medical Imaging: Applications and Current Challenges
Figure 2. Part-prototype network reasoning process during prediction (normal vs. pneumonia classification task from RX images). These models learn prototypes as representative image regions for the predicted class from the training set and classify new images based on the detection of these prototypes (prototypical regions marked with yellow boxes).
Figure 3. Global and local explanations of a part-prototype network (classification of Alzheimer’s disease from MR images). The global explanation shows all the learned prototypes. The local explanation shows the model’s reasoning for a specific instance.
Figure 4. ProtoPNet architecture.
Figure: Prototypical part visualization in a normal vs. pneumonia classification task for a normal test image (marked with yellow box). Radiological images are displayed in standard grayscale (windowing over the entire signal range), while activation maps and heatmaps are visualized using the same color map.
Abstract
1. Introduction
2. Foundational Part-Prototype Models
2.1. Part-Prototype Network (ProtoPNet)
- Cross-entropy loss, which penalizes misclassification on the training data.
- Cluster cost, which encourages every training image to have at least one latent patch close to a prototype of its own class.
- Separation cost, which encourages every latent patch of a training image to stay far from the prototypes of other classes.
- Forwarding the image through ProtoPNet to produce the activation map associated with each prototype.
- Upsampling the activation map to the dimensions of the input image.
- Localizing the smallest rectangular patch whose activation values are at least as large as the 95th percentile of all activation values in that same map.
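The patch-localization steps above can be sketched in a few lines of numpy. This is a minimal illustration, not the original implementation: the function name `localize_patch` is ours, and nearest-neighbour upsampling stands in for the interpolation used in practice.

```python
import numpy as np

def localize_patch(activation_map, image_hw, percentile=95):
    """Sketch of ProtoPNet-style prototype visualization: upsample a
    low-resolution activation map to the input image size and return the
    bounding box of all pixels whose activation is at least the given
    percentile of the map (hypothetical helper, assumed names)."""
    H, W = image_hw
    h, w = activation_map.shape
    # Nearest-neighbour upsampling as a stand-in for the bicubic
    # interpolation used in the original work.
    rows = np.arange(H) * h // H
    cols = np.arange(W) * w // W
    upsampled = activation_map[np.ix_(rows, cols)]
    threshold = np.percentile(upsampled, percentile)
    ys, xs = np.where(upsampled >= threshold)
    # Smallest rectangle containing every above-threshold pixel.
    return int(ys.min()), int(ys.max()), int(xs.min()), int(xs.max())
```

The returned box is what gets drawn (e.g., in yellow) on the test image.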
2.2. XProtoNet
- A weighted classification loss is used to address the class imbalance in the dataset.
- Regularization for interpretability, which, similarly to [13], includes two terms that, respectively, maximize the similarity between the input image and the class prototypes for positive samples and minimize it for negative samples.
- Regularization for the occurrence map, which includes two terms:
- The first term exploits the fact that an affine transformation of an image does not change the relative location of findings, so it should not affect the occurrence map either.
- The second term regularizes the occurrence area to be as small as possible, so as not to include unnecessary regions.
- Upsampling the occurrence maps to the input image size.
- Normalizing with the maximum value of the upsampled mask.
- Marking with a contour the occurrence values greater than 0.3 times the maximum intensity.
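The occurrence-map post-processing above can be sketched as follows; a hypothetical numpy illustration (the function name and nearest-neighbour upsampling are our simplifications):

```python
import numpy as np

def occurrence_region(occurrence_map, image_hw, factor=0.3):
    """Sketch of XProtoNet's occurrence-map visualization: upsample the
    map to the image size, normalize by its maximum, and mark the region
    whose values exceed `factor` of the peak. The contour of this boolean
    mask is what gets drawn on the radiograph."""
    H, W = image_hw
    h, w = occurrence_map.shape
    rows = np.arange(H) * h // H
    cols = np.arange(W) * w // W
    up = occurrence_map[np.ix_(rows, cols)]
    up = up / up.max()      # normalize to [0, 1] with the map's maximum
    return up > factor      # boolean region; contour it for display
```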
2.3. Neural Prototype Tree (ProtoTree)
- Forwarding the image through the backbone network.
- Creating a 2-dimensional similarity map between the latent patches and the prototype.
- Upsampling the similarity map with bicubic interpolation to the shape of the input image.
- Visualizing the prototype as a rectangular patch at the image location corresponding to the nearest latent patch.
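A minimal sketch of the similarity-map step, assuming the exp(-squared distance) similarity commonly used by prototype models (the function name is illustrative, and the bicubic upsampling is omitted):

```python
import numpy as np

def similarity_map(latent, prototype):
    """For a latent tensor of shape (D, h, w) and a prototype vector of
    shape (D,), compute exp(-squared distance) at every spatial position.
    The resulting (h, w) map would then be bicubically upsampled to the
    input image size for visualization."""
    D, h, w = latent.shape
    diff = latent - prototype[:, None, None]   # broadcast over (h, w)
    sq_dist = (diff ** 2).sum(axis=0)          # (h, w) squared distances
    return np.exp(-sq_dist)                    # similarity in (0, 1]
```

The location of the maximum of this map identifies the latent patch nearest to the prototype.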
2.4. Prototypical Part Shared Network (ProtoPShare)
- Computing the data-dependent similarity for each pair of prototypes, given by the agreement of their similarity scores over all training images. This considers two prototypes similar if they activate alike on the training images, even if they are far apart in the latent space.
- Selecting a percentage of the most similar pairs of prototypes to merge at each step.
- For each pair, removing one prototype together with its weights and reusing the other, aggregating the weights of the two.
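The merge-and-share procedure above can be illustrated with a toy numpy sketch. The L1 disagreement measure and the function name are our simplifications of the data-dependent similarity, not the exact formulation of the paper:

```python
import numpy as np

def merge_most_similar(activations, weights, frac=0.3):
    """Toy sketch of ProtoPShare's merge step. `activations` has shape
    (n_prototypes, n_train): the similarity score each prototype produces
    on every training image. Prototypes with agreeing activation patterns
    are merged: one is dropped and its last-layer weights are added onto
    the kept one."""
    n = activations.shape[0]
    # Data-dependent distance: disagreement of activation patterns.
    dist = {}
    for i in range(n):
        for j in range(i + 1, n):
            dist[(i, j)] = np.abs(activations[i] - activations[j]).sum()
    n_merge = max(1, int(frac * n))
    keep = weights.copy()
    removed = set()
    for (i, j) in sorted(dist, key=dist.get):   # most similar pairs first
        if len(removed) >= n_merge:
            break
        if i in removed or j in removed:
            continue
        keep[i] += keep[j]    # reuse prototype i, aggregate the weights
        removed.add(j)        # prototype j and its weights are removed
    kept_idx = [k for k in range(n) if k not in removed]
    return keep[kept_idx], kept_idx
```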
2.5. ProtoPool
2.6. Patch-Based Intuitive Prototype Network (PIPNet)
- Alignment loss, which optimizes for near-binary encodings, where an image patch corresponds to exactly one prototype.
- Tanh loss, which prevents the trivial solution in which a single prototype node is activated on all image patches of every image in the dataset, by forcing every prototype to be present at least once in a mini-batch.
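The tanh loss can be sketched in a few lines of numpy, assuming `pooled` holds the max-pooled prototype presence scores of a mini-batch; this is a simplified reading of the PIP-Net formulation, with our own function name:

```python
import numpy as np

def tanh_loss(pooled, eps=1e-8):
    """Sketch of PIP-Net's tanh loss. `pooled` has shape
    (batch, n_prototypes). Summing over the batch and squashing with tanh
    gives ~1 if a prototype fires somewhere in the mini-batch and 0 if it
    never fires; the negative log penalizes dead prototypes."""
    presence = np.tanh(pooled.sum(axis=0))
    return float(-np.log(presence + eps).mean())
```

A mini-batch in which some prototype never activates therefore gets a much larger loss than one where every prototype is present.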
3. Application and Advances in Medical Imaging
3.1. Two-Dimensional Image Models
3.2. Three-Dimensional and Multimodal Models
Paper | Modality | Dataset | Classes | Results |
---|---|---|---|---|
Two-dimensional image models | ||||
Singh et al. [25] | X-ray | Chest X-ray, COVID-19 Image | Normal, pneumonia, COVID-19 | Acc = 88.99% |
Singh et al. [26] | X-ray | Chest X-ray, COVID-19 Image | Normal, pneumonia, COVID-19 | Acc = 87.27% |
Singh et al. [27] | CT | COVIDx CT-2 | Normal, pneumonia, COVID-19 | Acc = 99.24% |
Kim et al. [19] | X-ray | NIH chest X-ray | Atelectasis, cardiomegaly, effusion, infiltration, mass, nodule, pneumonia, pneumothorax, consolidation, edema, emphysema, fibrosis, pleural thickening, hernia | Mean AUC = 0.822 |
Mohammadjafari et al. [31] | MRI | OASIS, ADNI | Normal vs. Alzheimer’s disease | Acc: 87.17% (OASIS), 91.02% (ADNI) |
Barnett et al. [33] | Mammography | Internal dataset | Mass margin classification, malignancy prediction | AUC: 0.951 (mass margin), 0.84 (malignancy) |
Carloni et al. [34] | Mammography | CBIS-DDSM | Benign, malignant | Acc = 68.5% |
Amorim et al. [36] | Histology | PatchCamelyon | Benign, malignant | Acc = 98.14% |
Flores-Araiza et al. [38] | Endoscopy | Simulated in vivo dataset | Whewellite, weddellite, anhydrous uric acid, struvite, brushite, cystine | Acc = 88.21% |
Kong et al. [39] | Dermatoscopic images | ISIC-HAM10000 | Actinic keratosis, intraepithelial carcinoma, nevi, basal cell carcinoma, benign keratosis-like lesions, dermatofibroma, melanoma, vascular lesions | F1 = 74.6 |
Santiago et al. [41] | Dermatoscopic images | ISIC-HAM10000 | Actinic keratosis, intraepithelial carcinoma, nevi, basal cell carcinoma, benign keratosis-like lesions, dermatofibroma, melanoma, vascular lesions | Bal Acc = 75.0% (Highest, achieved with DenseNet) |
Cui et al. [42] | X-ray | Chest X-ray | Normal, pneumonia | Acc = 91.4% |
Nauta et al. [14] | Dermoscopic images, X-ray | ISIC, MURA, Hip and ankle fracture internal dataset | Benign, malignant, normal, abnormal, fracture, no fracture | Acc: 94.1% (ISIC), 82.1% (MURA), 94.0% (Hip), 77.3% (Ankle) |
Santos et al. [43] | Retinography | Messidor | Healthy vs. diseased (retinopathy) | AUC = 0.79 |
Wang et al. [45] | Mammography, retinal OCT | Mammography internal dataset, CMMD, NEH OCT | Cancer vs. non-cancer; benign vs. malignant; normal, drusen, and choroidal neovascularization | AUC = 91.49 (Internal), AUC = 89.02 (CMMD), Acc = 91.9 (NEH OCT) |
Xu et al. [46] | X-ray, lung CT | COVIDx CXR-3, COVID-QU-Ex, Lung CT scan | COVID-19, normal, pneumonia | F1: 99.2 (COVIDx CXR-3), 96.8 (COVID-QU-Ex), 98.5 (Lung CT) |
Sinhamahapatra et al. [47] | CT | VerSe’19 dataset | Fracture vs. healthy | F1 = 75.97 |
Pathak et al. [18] | Mammography | CBIS, VinDr, CMMD | Benign vs. malignant | F1 (PIP-Net model): 63 ± 3% (CBIS), 63 ± 3% (VinDr), 70 ± 1% (CMMD) |
Gallée et al. [24] | Thorax CT | LIDC-IDRI | Benign vs. malignant | Acc = 93.0% |
Three-dimensional image models | ||||
Wei et al. [48] | 3D mpMRI: T1, T1CE, T2, FLAIR | BraTS 2020 | High-grade vs. low-grade glioma | Bal Acc = 85.8% |
Vaseli et al. [50] | Echocardiography | Private dataset, TMED-2 | Normal vs. mild vs. severe aortic stenosis | Acc: 80.0% (Private), 79.7% (TMED-2) |
De Santi et al. [52] | 3D MRI | ADNI | Normal vs. Alzheimer’s disease | Bal Acc = 82.02% |
Multimodal models | ||||
Wolf et al. [54] | 3D 18F-FDG PET and Tabular data | ADNI | Normal vs. Alzheimer’s disease | Bal Acc = 60.7% |
Wang et al. [55] | Chest X-ray and reports | MIMIC-CXR | Atelectasis, cardiomegaly, consolidation, edema, enlarged cardiomediastinum, fracture, lung lesion, lung opacity, pleural effusion, pleural other, pneumonia, pneumothorax, support device | Mean AUC = 0.828 |
De Santi et al. [56] | 3D MRI and Ages | ADNI | Normal vs. Alzheimer’s disease | Bal Acc = 83.04% |
4. Evaluation of Prototypes
5. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
AP | activation precision |
AUC | Area Under the Curve |
AUROC | Area Under the Receiver Operating Characteristic |
CNN | Convolutional Neural Network |
CT | computed tomography |
CV | Computer Vision |
DL | Deep Learning |
IDS | incremental deletion score |
MI | Medical Imaging |
ML | Machine Learning |
MR | magnetic resonance |
MRI | magnetic resonance imaging |
OoD | Out-of-Distribution |
PIPNet | Patch-based Intuitive Prototype Network |
PP | part-prototype |
RX | Radiography |
SGD | stochastic gradient descent |
SOTA | State-of-the-Art |
XAI | Explainable Artificial Intelligence |
References
- Salahuddin, Z.; Woodruff, H.C.; Chatterjee, A.; Lambin, P. Transparency of deep neural networks for medical image analysis: A review of interpretability methods. Comput. Biol. Med. 2022, 140, 105111. [Google Scholar] [CrossRef] [PubMed]
- Borys, K.; Schmitt, Y.A.; Nauta, M.; Seifert, C.; Krämer, N.; Friedrich, C.M.; Nensa, F. Explainable AI in medical imaging: An overview for clinical practitioners—Saliency-based XAI approaches. Eur. J. Radiol. 2023, 162, 110787. [Google Scholar] [CrossRef] [PubMed]
- Borys, K.; Schmitt, Y.A.; Nauta, M.; Seifert, C.; Krämer, N.; Friedrich, C.M.; Nensa, F. Explainable AI in medical imaging: An overview for clinical practitioners—Beyond saliency-based XAI approaches. Eur. J. Radiol. 2023, 162, 110786. [Google Scholar] [CrossRef]
- Allgaier, J.; Mulansky, L.; Draelos, R.L.; Pryss, R. How does the model make predictions? A systematic literature review on the explainability power of machine learning in healthcare. Artif. Intell. Med. 2023, 143, 102616. [Google Scholar] [CrossRef]
- Longo, L.; Brcic, M.; Cabitza, F.; Choi, J.; Confalonieri, R.; Ser, J.D.; Guidotti, R.; Hayashi, Y.; Herrera, F.; Holzinger, A.; et al. Explainable Artificial Intelligence (XAI) 2.0: A manifesto of open challenges and interdisciplinary research directions. Inf. Fusion 2024, 106, 102301. [Google Scholar] [CrossRef]
- Li, O.; Liu, H.; Chen, C.; Rudin, C. Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018. [Google Scholar]
- Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215. [Google Scholar] [CrossRef]
- Nauta, M.; Schlötterer, J.; van Keulen, M.; Seifert, C. PIP-Net: Patch-Based Intuitive Prototypes for Interpretable Image Classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 2744–2753. [Google Scholar]
- Barredo Arrieta, A.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; Garcia, S.; Gil-Lopez, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 2020, 58, 82–115. [Google Scholar] [CrossRef]
- Montavon, G.; Samek, W.; Müller, K.R. Methods for interpreting and understanding deep neural networks. Digit. Signal Process. 2018, 73, 1–15. [Google Scholar] [CrossRef]
- Gunning, D.; Vorm, E.; Wang, J.Y.; Turek, M. DARPA’s explainable AI (XAI) program: A retrospective. Appl. AI Lett. 2021, 2, e61. [Google Scholar] [CrossRef]
- Biederman, I. Recognition-by-Components: A Theory of Human Image Understanding. Psychol. Rev. 1987, 94, 115–147. [Google Scholar] [CrossRef]
- Chen, C.; Li, O.; Tao, C.; Barnett, A.J.; Su, J.; Rudin, C. This Looks Like That: Deep Learning for Interpretable Image Recognition. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019. [Google Scholar]
- Nauta, M.; Hegeman, J.H.; Geerdink, J.; Schlötterer, J.; Keulen, M.v.; Seifert, C. Interpreting and Correcting Medical Image Classification with PIP-Net. In Artificial Intelligence, ECAI 2023 International Workshops, Proceedings of the XAI^3, TACTIFUL, XI-ML, SEDAMI, RAAIT, AI4S, HYDRA, AI4AI, Kraków, Poland, 30 September–4 October 2023; Springer: Cham, Switzerland, 2024; pp. 198–215. [Google Scholar]
- Nauta, M.; Seifert, C. The Co-12 Recipe for Evaluating Interpretable Part-Prototype Image Classifiers. In Explainable Artificial Intelligence, Proceedings of the First World Conference, xAI 2023, Lisbon, Portugal, 26–28 July 2023; Longo, L., Ed.; Springer: Cham, Switzerland, 2023; pp. 397–420. [Google Scholar]
- Doshi-Velez, F.; Kim, B. Towards A Rigorous Science of Interpretable Machine Learning. arXiv 2017, arXiv:1702.08608. [Google Scholar]
- Jin, W.; Li, X.; Fatehi, M.; Hamarneh, G. Guidelines and evaluation of clinical explainable AI in medical image analysis. Med. Image Anal. 2023, 84, 102684. [Google Scholar] [CrossRef] [PubMed]
- Pathak, S.; Schlötterer, J.; Veltman, J.; Geerdink, J.; Keulen, M.V.; Seifert, C.; Pathak, S. Prototype-Based Interpretable Breast Cancer Prediction Models: Analysis and Challenges. In Explainable Artificial Intelligence, Proceedings of the Second World Conference, xAI 2024, Valletta, Malta, 17–19 July 2024; Springer: Cham, Switzerland, 2024; pp. 21–42. [Google Scholar] [CrossRef]
- Kim, E.; Kim, S.; Seo, M.; Yoon, S. XProtoNet: Diagnosis in chest radiography with global and local explanations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 15719–15728. [Google Scholar]
- Nauta, M.; van Bree, R.; Seifert, C. Neural Prototype Trees for Interpretable Fine-Grained Image Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 14933–14943. [Google Scholar]
- Rymarczyk, D.; Struski, Ł.; Tabor, J.; Zieliński, B. ProtoPShare: Prototypical Parts Sharing for Similarity Discovery in Interpretable Image Classification. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Singapore, 14–18 August 2021; Volume 11. [Google Scholar] [CrossRef]
- Rymarczyk, D.; Struski, Ł.; Górszczak, M.; Lewandowska, K.; Tabor, J.; Zieliński, B. Interpretable Image Classification with Differentiable Prototypes Assignment. In Computer Vision—ECCV 2022, Proceedings of the 17th European Conference, Tel Aviv, Israel, 23–27 October 2022; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2022; Volume 13672, pp. 351–368. [Google Scholar] [CrossRef]
- van de Beld, J.J.; Pathak, S.; Geerdink, J.; Hegeman, J.H.; Seifert, C. Feature Importance to Explain Multimodal Prediction Models: A Clinical Use Case. In Explainable Artificial Intelligence, Proceedings of the Second World Conference, xAI 2024, Valletta, Malta, 17–19 July 2024; Springer: Cham, Switzerland, 2024; pp. 84–101. [Google Scholar] [CrossRef]
- Gallée, L.; Lisson, C.S.; Lisson, C.G.; Drees, D.; Weig, F.; Vogele, D.; Beer, M.; Götz, M. Evaluating the Explainability of Attributes and Prototypes for a Medical Classification Model. In Explainable Artificial Intelligence, Proceedings of the Second World Conference, xAI 2024, Valletta, Malta, 17–19 July 2024; Springer: Cham, Switzerland, 2024; pp. 43–56. [Google Scholar] [CrossRef]
- Singh, G.; Yow, K.C. These do not look like those: An interpretable deep learning model for image recognition. IEEE Access 2021, 9, 41482–41493. [Google Scholar] [CrossRef]
- Singh, G.; Yow, K.C. An Interpretable Deep Learning Model for COVID-19 Detection with Chest X-Ray Images. IEEE Access 2021, 9, 85198–85208. [Google Scholar] [CrossRef]
- Singh, G.; Yow, K.C. Object or background: An interpretable deep learning model for COVID-19 detection from CT-scan images. Diagnostics 2021, 11, 1732. [Google Scholar] [CrossRef]
- Kermany, D.; Zhang, K.; Goldbaum, M. Large dataset of labeled optical coherence tomography (oct) and chest X-ray images. Mendeley Data 2018, 3. [Google Scholar] [CrossRef]
- Cohen, J.P.; Morrison, P.; Dao, L. COVID-19 image data collection. arXiv 2020, arXiv:2003.11597. [Google Scholar]
- Wang, X.; Peng, Y.; Lu, L.; Lu, Z.; Bagheri, M.; Summers, R.M. Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2097–2106. [Google Scholar]
- Mohammadjafari, S.; Cevik, M.; Thanabalasingam, M.; Basar, A.; Initiative, A.D.N. Using ProtoPNet for Interpretable Alzheimer’s Disease Classification. In Proceedings of the Canadian AI 2021, Canadian Artificial Intelligence Association (CAIAC), Vancouver, BC, Canada, 25–28 May 2021; Available online: https://caiac.pubpub.org/pub/klwhoig4 (accessed on 5 September 2024). [CrossRef]
- Marcus, D.S.; Wang, T.H.; Parker, J.; Csernansky, J.G.; Morris, J.C.; Buckner, R.L. Open Access Series of Imaging Studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J. Cogn. Neurosci. 2007, 19, 1498–1507. [Google Scholar] [CrossRef]
- Barnett, A.J.; Schwartz, F.R.; Tao, C.; Chen, C.; Ren, Y.; Lo, J.Y.; Rudin, C. A case-based interpretable deep learning model for classification of mass lesions in digital mammography. Nat. Mach. Intell. 2021, 3, 1061–1070. [Google Scholar] [CrossRef]
- Carloni, G.; Berti, A.; Iacconi, C.; Pascali, M.A.; Colantonio, S. On the applicability of prototypical part learning in medical images: Breast masses classification using ProtoPNet. In Pattern Recognition, Computer Vision, and Image Processing, Proceedings of the ICPR 2022 International Workshops and Challenges, Montreal, QC, Canada, 21–25 August 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 539–557. [Google Scholar]
- Lee, R.S.; Gimenez, F.; Hoogi, A.; Miyake, K.K.; Gorovoy, M.; Rubin, D.L. A curated mammography data set for use in computer-aided detection and diagnosis research. Sci. Data 2017, 4, 170177. [Google Scholar] [CrossRef]
- Amorim, J.P.; Abreu, P.H.; Santos, J.; Müller, H. Evaluating Post-hoc Interpretability with Intrinsic Interpretability. arXiv 2023, arXiv:2305.03002. [Google Scholar]
- Bejnordi, B.E.; Veta, M.; Van Diest, P.J.; Van Ginneken, B.; Karssemeijer, N.; Litjens, G.; Van Der Laak, J.A.; Hermsen, M.; Manson, Q.F.; Balkenhol, M.; et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. Jama 2017, 318, 2199–2210. [Google Scholar] [CrossRef] [PubMed]
- Flores-Araiza, D.; Lopez-Tiro, F.; El-Beze, J.; Hubert, J.; Gonzalez-Mendoza, M.; Ochoa-Ruiz, G.; Daul, C. Deep prototypical-parts ease morphological kidney stone identification and are competitively robust to photometric perturbations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 295–304. [Google Scholar]
- Kong, L.; Gong, L.; Wang, G.; Liu, S. DP-ProtoNet: An interpretable dual path prototype network for medical image diagnosis. In Proceedings of the 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom/BigDataSE/CSE/EUC/iSCI 2023, Exeter, UK, 1–3 November 2023; pp. 2797–2804. [Google Scholar] [CrossRef]
- Tschandl, P.; Rosendahl, C.; Kittler, H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 2018, 5, 180161. [Google Scholar] [CrossRef] [PubMed]
- Santiago, C.; Correia, M.; Verdelho, M.R.; Bissoto, A.; Barata, C. Global and Local Explanations for Skin Cancer Diagnosis Using Prototypes. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2023 Workshops, Proceedings of the ISIC 2023, Care-AI 2023, MedAGI 2023, DeCaF 2023, Held in Conjunction with MICCAI 2023, Vancouver, BC, Canada, 8–12 October 2023; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2023; Volume 14393, pp. 47–56. [Google Scholar] [CrossRef]
- Cui, J.; Gong, J.; Wang, G.; Li, J.; Liu, X.; Liu, S. A Novel Interpretable Fine-grained Image Classification Model Based on Improved Neural Prototype Tree. In Proceedings of the IEEE International Symposium on Circuits and Systems, Monterey, CA, USA, 21–25 May 2023. [Google Scholar] [CrossRef]
- de A. Santos, I.B.; de Carvalho, A.C.P.L.F. ProtoAL: Interpretable Deep Active Learning with prototypes for medical imaging. arXiv 2024, arXiv:2404.04736. [Google Scholar]
- Gu, Z.; Cheng, J.; Fu, H.; Zhou, K.; Hao, H.; Zhao, Y.; Zhang, T.; Gao, S.; Liu, J. CE-Net: Context Encoder Network for 2D Medical Image Segmentation. IEEE Trans. Med. Imaging 2019, 38, 2281–2292. [Google Scholar] [CrossRef] [PubMed]
- Wang, C.; Chen, Y.; Liu, F.; Elliott, M.; Kwok, C.F.; Pena-Solorzano, C.; Frazer, H.; Mccarthy, D.J.; Carneiro, G. An Interpretable and Accurate Deep-Learning Diagnosis Framework Modeled With Fully and Semi-Supervised Reciprocal Learning. IEEE Trans. Med. Imaging 2024, 43, 392–404. [Google Scholar] [CrossRef]
- Xu, Y.; Meng, Z. Interpretable vision transformer based on prototype parts for COVID-19 detection. IET Image Process. 2024, 18, 1927–1937. [Google Scholar] [CrossRef]
- Sinhamahapatra, P.; Shit, S.; Sekuboyina, A.; Husseini, M.; Schinz, D.; Lenhart, N.; Menze, J.; Kirschke, J.; Roscher, K.; Guennemann, S. Enhancing Interpretability of Vertebrae Fracture Grading using Human-interpretable Prototypes. J. Mach. Learn. Biomed. Imaging 2024, 2024, 977–1002. [Google Scholar] [CrossRef]
- Wei, Y.; Tam, R.; Tang, X. MProtoNet: A Case-Based Interpretable Model for Brain Tumor Classification with 3D Multi-parametric Magnetic Resonance Imaging. In Proceedings of the Medical Imaging with Deep Learning, Nashville, TN, USA, 10–12 July 2023. [Google Scholar]
- Nauta, M.; Trienes, J.; Pathak, S.; Nguyen, E.; Peters, M.; Schmitt, Y.; Schlötterer, J.; van Keulen, M.; Seifert, C. From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI. ACM Comput. Surv. 2023, 55, 1–42. [Google Scholar] [CrossRef]
- Vaseli, H.; Gu, A.N.; Amiri, S.N.A.; Tsang, M.Y.; Fung, A.; Kondori, N.; Saadat, A.; Abolmaesumi, P.; Tsang, T.S. ProtoASNet: Dynamic Prototypes for Inherently Interpretable and Uncertainty-Aware Aortic Stenosis Classification in Echocardiography. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2023, Proceedings of the 26th International Conference, Vancouver, BC, Canada, 8–12 October 2023; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2023; Volume 14225, pp. 368–378. [Google Scholar] [CrossRef]
- Huang, Z.; Long, G.; Wessler, B.; Hughes, M. TMED 2: A Dataset for Semi-Supervised Classification of Echocardiograms. 2022. Available online: https://www.michaelchughes.com/papers/HuangEtAl_TMED2_DataPerf_2022.pdf (accessed on 5 September 2024).
- De Santi, L.A.; Schlötterer, J.; Scheschenja, M.; Wessendorf, J.; Nauta, M.; Positano, V.; Seifert, C. PIPNet3D: Interpretable Detection of Alzheimer in MRI Scans. arXiv 2024, arXiv:2403.18328. [Google Scholar]
- Ma, Y.; Zhao, S.; Wang, W.; Li, Y.; King, I. Multimodality in meta-learning: A comprehensive survey. Know.-Based Syst. 2022, 250, 108976. [Google Scholar] [CrossRef]
- Wolf, T.N.; Pölsterl, S.; Wachinger, C. Don’t PANIC: Prototypical Additive Neural Network for Interpretable Classification of Alzheimer’s Disease. In Proceedings of the Information Processing in Medical Imaging: 28th International Conference, IPMI 2023, San Carlos de Bariloche, Argentina, 18–23 June 2023; Proceedings. Springer: Berlin/Heidelberg, Germany, 2023; pp. 82–94. [Google Scholar] [CrossRef]
- Wang, G.; Li, J.; Tian, C.; Ma, X.; Liu, S. A Novel Multimodal Prototype Network for Interpretable Medical Image Classification. In Proceedings of the Conference Proceedings—IEEE International Conference on Systems, Man and Cybernetics, Honolulu, HI, USA, 1–4 October 2023; pp. 2577–2583. [Google Scholar] [CrossRef]
- De Santi, L.A.; Schlötterer, J.; Nauta, M.; Positano, V.; Seifert, C. Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer’s Disease classification. In Proceedings of the xAI 2024 Late-breaking Work, Demos and Doctoral Consortium co-located with the 2nd World Conference on eXplainable Artificial Intelligence (xAI 2024), Valletta, Malta, 17–19 July 2024; pp. 73–80. Available online: https://ceur-ws.org/Vol-3793/paper_10.pdf (accessed on 5 September 2024).
- Johnson, A.E.W.; Pollard, T.J.; Greenbaum, N.R.; Lungren, M.P.; Deng, C.Y.; Peng, Y.; Lu, Z.; Mark, R.G.; Berkowitz, S.J.; Horng, S. MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs. arXiv 2019, arXiv:1901.07042. [Google Scholar]
- van der Velden, B.H.; Kuijf, H.J.; Gilhuijs, K.G.; Viergever, M.A. Explainable artificial intelligence (XAI) in deep learning-based medical image analysis. Med. Image Anal. 2022, 79, 102470. [Google Scholar] [CrossRef] [PubMed]
- Cabitza, F.; Campagner, A.; Ronzio, L.; Cameli, M.; Mandoli, G.E.; Pastore, M.C.; Sconfienza, L.M.; Folgado, D.; Barandas, M.; Gamboa, H. Rams, hounds and white boxes: Investigating human–AI collaboration protocols in medical diagnosis. Artif. Intell. Med. 2023, 138, 102506. [Google Scholar] [CrossRef]
- Gautam, S.; Höhne, M.M.C.; Hansen, S.; Jenssen, R.; Kampffmeyer, M. This looks More Like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation. Pattern Recogn. 2023, 136, 109172. [Google Scholar] [CrossRef]
- Opłatek, S.; Rymarczyk, D.; Zieliński, B. Revisiting FunnyBirds Evaluation Framework for Prototypical Parts Networks. In Explainable Artificial Intelligence, Proceedings of the Second World Conference, xAI 2024, Valletta, Malta, 17–19 July 2024; Springer: Cham, Switzerland, 2024; pp. 57–68. [Google Scholar] [CrossRef]
- Xu-Darme, R.; Quénot, G.; Chihani, Z.; Rousset, M.C. Sanity checks for patch visualisation in prototype-based image classification. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada, 17–24 June 2023; pp. 3691–3696. [Google Scholar] [CrossRef]
Co-12 Property | Description |
---|---|
Content | |
Correctness: | Since PP models are interpretable by design, the explanations are generated together with the prediction, and the reasoning process is correctly represented by design. However, the faithfulness of the prototype visualization (from the latent representation to the input image patches), originally performed via bicubic upsampling, is not guaranteed by design and should be evaluated. |
Completeness: | The relation between the prototypes and classes is transparently shown, so the output-completeness is fulfilled by design, but the computation performed by the CNN backbone is not taken into consideration. |
Consistency: | PP models should not have random components in their design, but nondeterminism may arise from the backbone’s initialization and random seeds. It might be assessed by comparing explanations from models trained with different initializations or with different shuffling of the training data. |
Continuity: | It should be evaluated whether slightly perturbed inputs lead to the same explanation, given that the model makes the same classification. |
Contrastivity: | The built-in interpretability of PP models provides contrastivity by design: a different classification corresponds to a different reasoning and, hence, to a different explanation. This evaluation might also include a target sensitivity analysis by inspecting where prototypes are detected in the test image. |
Covariate complexity: | The complexity of the features present in the prototypes is assessed against ground truth, such as predefined concepts provided by human judgment (perceived homogeneity) or object-part annotations. |
Presentation | |
Compactness: | The number of prototypes constituting the full classification model (global explanation size), the number detected in each input image (local explanation size), and the redundancy of the information presented across different prototypes should be evaluated. The explanation should be small enough not to overwhelm the user. |
Composition: | It should be assessed how prototypes can best be presented to the user and structured within the reasoning process, e.g., by comparing different explanation formats or by asking users about their preferences regarding the presentation and structure of the explanation. |
Confidence: | Estimate the confidence of the explanation generation method, including measurements such as the prototype similarity scores. |
User | |
Context: | PP models should be evaluated with application-grounded user studies, similarly to evaluations with heatmaps, to understand users’ needs. |
Coherence: | Prototypes are often evaluated based on anecdotal evidence, with automated evaluation with an annotated dataset, or with manual evaluation. User studies might include the assessment of satisfaction, preference, and trust for part-prototypes. |
Controllability: | The ability to directly manipulate the explanation and the model’s reasoning, e.g., enabling users to suppress or modify learned prototypes, possibly with the aid of a graphical user interface. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
De Santi, L.A.; Piparo, F.I.; Bargagna, F.; Santarelli, M.F.; Celi, S.; Positano, V. Part-Prototype Models in Medical Imaging: Applications and Current Challenges. BioMedInformatics 2024, 4, 2149-2172. https://doi.org/10.3390/biomedinformatics4040115