Abstract
In clinical practice, tri-modal medical image fusion provides a more comprehensive view of lesions than existing dual-modal techniques, helping physicians evaluate the shape, location, and biological activity of a disease. However, because of the limitations of imaging equipment and considerations for patient safety, the quality of medical images is usually limited, which leads to sub-optimal fusion performance and restricts the depth of the physician's image analysis. There is therefore an urgent need for a technology that can both enhance image resolution and integrate multi-modal information. Although current image processing methods can effectively address image fusion and super-resolution individually, solving both problems simultaneously remains extremely challenging. In this paper, we propose TFS-Diff, a model that simultaneously performs tri-modal medical image fusion and super-resolution. Specifically, TFS-Diff is built on the random iterative denoising process of diffusion model generation. We also develop a simple objective function and a fusion super-resolution loss that effectively evaluate the uncertainty in the fusion and ensure the stability of the optimization process. In addition, a channel attention module is proposed to effectively integrate key information from different modalities for clinical diagnosis, avoiding the information loss caused by multiple image processing steps. Extensive experiments on the public Harvard datasets show that TFS-Diff significantly surpasses existing state-of-the-art methods in both quantitative and visual evaluations. Code is available at https://github.com/XylonXu01/TFS-Diff.
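The abstract describes the method only at a high level, so the following is a minimal sketch of the general idea rather than the TFS-Diff implementation. It shows the standard closed-form forward noising step of a denoising diffusion model and one common way to condition a denoiser on several low-resolution inputs: upsample them and concatenate them with the noisy target along the channel axis. The linear noise schedule, the nearest-neighbour upsampling, the image sizes, and every function name are assumptions made for illustration; the paper's network, loss, and channel attention module are not reproduced here.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Assumed linear noise schedule (a common default, not taken from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

def upsample_nearest(img, scale):
    """Nearest-neighbour upsampling of a (C, H, W) array along H and W."""
    return np.repeat(np.repeat(img, scale, axis=-2), scale, axis=-1)

def build_denoiser_input(x_t, low_res_modalities, scale):
    """Concatenate the noisy target with the upsampled conditions on the channel axis."""
    cond = [upsample_nearest(m, scale) for m in low_res_modalities]
    return np.concatenate([x_t] + cond, axis=0)

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

# Hypothetical data: one high-resolution fused target and three low-resolution
# modalities, all single-channel and scaled to [-1, 1].
x0 = rng.uniform(-1.0, 1.0, size=(1, 64, 64))
low_res = [rng.uniform(-1.0, 1.0, size=(1, 16, 16)) for _ in range(3)]

x_t, eps = forward_noise(x0, t=500, alpha_bar=alpha_bar, rng=rng)
net_in = build_denoiser_input(x_t, low_res, scale=4)
print(net_in.shape)  # 1 noisy channel + 3 condition channels at 64 x 64
```

In a trained model, a denoising network would take `net_in` and the step index and predict `eps`; repeating that prediction from pure noise down to step 0 is the iterative denoising process the abstract refers to.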
Acknowledgments
This research was supported in part by the National Natural Science Foundation of China under Grant 62201149, in part by the Basic and Applied Basic Research of Guangdong Province under Grant 2023A1515140077, in part by the Natural Science Foundation of Guangdong Province under Grant 2024A1515011880, in part by the Guangdong Higher Education Innovation and Strengthening of Universities Project under Grant 2023KTSCX127, and in part by the Foshan Key Areas of Scientific and Technological Research Project under Grant 2120001008558, China.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Xu, Y., Li, X., Jie, Y., Tan, H. (2024). Simultaneous Tri-Modal Medical Image Fusion and Super-Resolution Using Conditional Diffusion Model. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15007. Springer, Cham. https://doi.org/10.1007/978-3-031-72104-5_61
DOI: https://doi.org/10.1007/978-3-031-72104-5_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72103-8
Online ISBN: 978-3-031-72104-5
eBook Packages: Computer Science, Computer Science (R0)