Abstract
Artificial intelligence-based decision-making plays an unavoidable role in recent advances in data-driven technology, but a key challenge lies in the opacity of the underlying decision mechanisms, which are difficult to explain to end users. The well-known RISE method (randomized input sampling for explanations) and its variants are now widely used for explainability on image data through a perturbative approach. However, RISE is computationally heavy, since it requires one forward pass for each of a very large number of generated masks. This paper addresses the issue by intelligently sampling far fewer masks through a guided scheme, instead of relying on a large pool of randomly generated masks. Our proposed approach, guided input sampling-based explanations (GuISE), introduces a method for generating an importance map that illustrates the saliency of each pixel to the model's prediction. Unlike white-box explanation schemes that depend on gradients or internal network states to estimate pixel importance, GuISE operates as a black-box approach, and its strength lies particularly in its masking technique. To validate the approach, we compare it against state-of-the-art importance extraction methods using the automatic deletion and insertion metrics. Extensive experiments on benchmark image datasets demonstrate comparable or superior performance of the proposed GuISE, even surpassing white-box approaches. This highlights the effectiveness of GuISE in achieving explainability of deep neural networks for image-based applications.
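For context, the sketch below illustrates the RISE-style perturbative baseline that GuISE improves upon: random binary masks are upsampled and applied to the input, and each mask is weighted by the black-box model's score for the target class. The abstract does not specify GuISE's guided sampling scheme, so this shows only the generic random-sampling variant; the `model` callable and array shapes are illustrative assumptions.

```python
import numpy as np

def rise_saliency(model, image, num_masks=4000, grid=7, p_keep=0.5):
    """RISE-style saliency (Petsiuk et al., 2018): average of random
    binary masks, each weighted by the black-box score it obtains.

    model : assumed callable mapping a batch (N, H, W, C) in [0, 1]
            to class probabilities (N, num_classes); a black box.
    image : single image of shape (H, W, C).
    """
    H, W, _ = image.shape
    target = int(np.argmax(model(image[None])[0]))   # class to explain

    saliency = np.zeros((H, W))
    for _ in range(num_masks):
        # Sample a coarse binary grid; RISE proper uses bilinear
        # upsampling with random shifts, nearest-neighbour suffices here.
        coarse = (np.random.rand(grid, grid) < p_keep).astype(float)
        cell_h, cell_w = H // grid + 1, W // grid + 1
        mask = np.kron(coarse, np.ones((cell_h, cell_w)))[:H, :W]
        # One forward pass per mask -- the cost GuISE reduces by
        # sampling fewer, guided masks instead of many random ones.
        score = model((image * mask[..., None])[None])[0, target]
        saliency += score * mask

    return saliency / (num_masks * p_keep)  # normalise by expected coverage
```

The deletion and insertion metrics used for evaluation can be sketched in the same hedged way: deletion progressively zeroes out the most salient pixels and integrates the target-class probability as pixels are removed (a sharper drop, i.e. lower AUC, indicates a better explanation); insertion is the mirror image, rewarding a higher AUC.

```python
def deletion_auc(model, image, saliency, target, steps=50):
    """Deletion metric: remove the most salient pixels first and
    measure how quickly the target-class probability falls."""
    H, W, C = image.shape
    order = np.argsort(saliency.ravel())[::-1]       # most salient first
    work = image.reshape(-1, C).copy()
    chunk = max(1, order.size // steps)

    scores = []
    for start in range(0, order.size, chunk):
        scores.append(model(work.reshape(1, H, W, C))[0, target])
        work[order[start:start + chunk]] = 0.0       # delete next chunk
    scores.append(model(work.reshape(1, H, W, C))[0, target])
    return float(np.trapz(scores, dx=1.0 / (len(scores) - 1)))
```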
References
Bayer, J., Münch, D., Arens, M.: A comparison of deep saliency map generators on multispectral data in object detection. In: Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies V, vol. 11869, pp. 61–74. SPIE (2021)
Bayer, J., Münch, D., Arens, M.: Deep saliency map generators for multispectral video classification. In: 2022 26th International Conference on Pattern Recognition (ICPR), pp. 3757–3764. IEEE (2022)
Chattopadhay, A., Sarkar, A., Howlader, P., Balasubramanian, V.N.: Grad-CAM++: generalized gradient-based visual explanations for deep convolutional networks. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 839–847. IEEE (2018)
Chen, T.C.T., Wu, H.C., Chiu, M.C.: A deep neural network with modified random forest incremental interpretation approach for diagnosing diabetes in smart healthcare. Appl. Soft Comput. 152, 111183 (2024)
Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (VOC) challenge. Int. J. Comput. Vision 88, 303–338 (2010)
Hakkoum, H., Abnane, I., Idri, A.: Interpretability in the medical field: a systematic mapping and review study. Appl. Soft Comput. 117, 108391 (2022)
Kaltsas, G.A., Nomikos, P., Kontogeorgos, G., Buchfelder, M., Grossman, A.B.: Diagnosis and management of pituitary carcinomas. J. Clin. Endocrinol. Metab. 90(5), 3089–3099 (2005)
Nguyen, T.T.H., Truong, V.B., Nguyen, V.T.K., Cao, Q.H., Nguyen, Q.K.: Towards trust of explainable AI in thyroid nodule diagnosis. arXiv preprint arXiv:2303.04731 (2023)
Nickparvar, M.: Brain tumor MRI dataset. Data set. Kaggle (2021). https://doi.org/10.34740/KAGGLE/DSV/2645886. Accessed 3 Mar 2021
Petsiuk, V., Das, A., Saenko, K.: RISE: randomized input sampling for explanation of black-box models. arXiv preprint arXiv:1806.07421 (2018)
Petsiuk, V., et al.: Black-box explanation of object detectors via saliency maps. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11443–11452 (2021)
Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)
Roshan, K., Zafar, A.: Utilizing XAI technique to improve autoencoder based model for computer network anomaly detection with Shapley additive explanation (SHAP). arXiv preprint arXiv:2112.08442 (2021)
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vision 115, 211–252 (2015)
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
Simonyan, K., Vedaldi, A., Zisserman, A.: Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034 (2013)
Soomro, S., Niaz, A., Choi, K.N.: Grad++ScoreCAM: enhancing visual explanations of deep convolutional networks using incremented gradient and score-weighted methods. IEEE Access (2024)
Szczepankiewicz, K., et al.: Ground truth based comparison of saliency maps algorithms. Sci. Rep. 13(1), 16887 (2023)
Truong, V.B., Nguyen, T.T.H., Nguyen, V.T.K., Nguyen, Q.K., Cao, Q.H.: Towards better explanations for object detection. arXiv preprint arXiv:2306.02744 (2023)
Wang, H., et al.: Score-CAM: score-weighted visual explanations for convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 24–25 (2020)
Yang, Q., Zhu, X., Fwu, J.K., Ye, Y., You, G., Zhu, Y.: MFPP: morphological fragmental perturbation pyramid for black-box model explanations. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 1376–1383. IEEE (2021)
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8689, pp. 818–833. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10590-1_53
Zhang, J., Bargal, S.A., Lin, Z., Brandt, J., Shen, X., Sclaroff, S.: Top-down neural attention by excitation backprop. Int. J. Comput. Vision 126(10), 1084–1102 (2018)
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., Torralba, A.: Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2921–2929 (2016)
Acknowledgements
This research is supported by the Start-up Research Grant [SRG/2023/002658 to M.D.] from the SERB (DST), Govt. of India, New Delhi.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Bano, A., Das, M. (2025). A Guided Input Sampling-Based Perturbative Approach for Explainable AI in Image-Based Application. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, C.L., Bhattacharya, S., Pal, U. (eds.) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol. 15304. Springer, Cham. https://doi.org/10.1007/978-3-031-78128-5_10
DOI: https://doi.org/10.1007/978-3-031-78128-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78127-8
Online ISBN: 978-3-031-78128-5