Abstract
Multi-modal medical image fusion (MMIF) has found wide application in disease diagnosis and surgical guidance. Despite the popularity of deep learning (DL)-based fusion methods, these algorithms often fail to deliver satisfactory fusion performance because they struggle to capture local information and long-range dependencies effectively. To address these issues, this paper presents an unsupervised MMIF method that combines a densely-connected high-resolution network (DHRNet) with a hybrid transformer. In this method, local features are first extracted from the source images by the DHRNet. These features are then fed into the fine-grained attention module of the hybrid transformer, which produces global features by exploring their long-range dependencies. The local and global features are fused by the projection attention module of the hybrid transformer. Finally, the fused image is reconstructed from the fused features by the decoder network. The network is trained with an unsupervised loss function combining the edge preservation value, structural similarity, the sum of the correlations of differences, and the structural tensor. Experiments on various multi-modal medical images show that, compared with several traditional and DL-based fusion methods, the presented method generates visually better fused results and achieves better quantitative metric values.
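To make the training objective concrete, below is a minimal PyTorch sketch of a composite unsupervised fusion loss in the spirit of the one described above. The SSIM and SCD (sum of the correlations of differences) terms follow their standard definitions; the gradient term is a simplified stand-in for the paper's edge-preservation and structural-tensor terms, and the weights (w_ssim, w_scd, w_grad) are illustrative assumptions, not the authors' values.

```python
# Hedged sketch of a composite unsupervised fusion loss (not the paper's
# exact implementation). Inputs are single-channel NCHW tensors: the fused
# image and the two source modalities.
import torch
import torch.nn.functional as F

def _corr(x, y, eps=1e-8):
    """Pearson correlation between two batches of images, per sample."""
    x = x - x.mean(dim=(1, 2, 3), keepdim=True)
    y = y - y.mean(dim=(1, 2, 3), keepdim=True)
    num = (x * y).sum(dim=(1, 2, 3))
    den = torch.sqrt((x * x).sum(dim=(1, 2, 3)) * (y * y).sum(dim=(1, 2, 3)))
    return num / (den + eps)

def scd(fused, a, b):
    """Sum of the correlations of differences (Aslantas & Bendes, 2015)."""
    return _corr(fused - a, b) + _corr(fused - b, a)

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2, win=11):
    """Mean SSIM with a uniform window (simplified from the Gaussian form)."""
    mu_x = F.avg_pool2d(x, win, 1, win // 2)
    mu_y = F.avg_pool2d(y, win, 1, win // 2)
    sx = F.avg_pool2d(x * x, win, 1, win // 2) - mu_x ** 2
    sy = F.avg_pool2d(y * y, win, 1, win // 2) - mu_y ** 2
    sxy = F.avg_pool2d(x * y, win, 1, win // 2) - mu_x * mu_y
    s = ((2 * mu_x * mu_y + c1) * (2 * sxy + c2)) / \
        ((mu_x ** 2 + mu_y ** 2 + c1) * (sx + sy + c2))
    return s.mean()

def gradients(img):
    """Sobel gradients of a single-channel batch (edge/structure proxy)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    return F.conv2d(img, kx, padding=1), F.conv2d(img, ky, padding=1)

def fusion_loss(fused, a, b, w_ssim=1.0, w_scd=0.5, w_grad=1.0):
    """Composite loss: maximize SSIM to both sources and SCD, and pull the
    fused gradients toward the per-pixel stronger source gradient (a crude
    stand-in for the edge-preservation and structural-tensor terms)."""
    l_ssim = 2.0 - ssim(fused, a) - ssim(fused, b)
    l_scd = 2.0 - scd(fused, a, b).mean()   # SCD ranges up to 2
    fx, fy = gradients(fused)
    ax, ay = gradients(a)
    bx, by = gradients(b)
    tx = torch.where(ax.abs() >= bx.abs(), ax, bx)
    ty = torch.where(ay.abs() >= by.abs(), ay, by)
    l_grad = F.l1_loss(fx, tx) + F.l1_loss(fy, ty)
    return w_ssim * l_ssim + w_scd * l_scd + w_grad * l_grad
```

Because every term is computed from the two source images alone, the loss requires no ground-truth fused image, which is what makes the training unsupervised.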
Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (NSFC) (Grant No. 61871440). The authors also thank the medical ultrasound lab at Huazhong University of Science and Technology for providing the GPU computation platform.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Zhou, Q., Ye, S., Wen, M. et al. Multi-modal medical image fusion based on densely-connected high-resolution CNN and hybrid transformer. Neural Comput & Applic 34, 21741–21761 (2022). https://doi.org/10.1007/s00521-022-07635-1