
VCAFusion: An infrared and visible image fusion network with visual perception and cross-scale attention

Published: 18 July 2024

Abstract

Infrared and visible image fusion methods aim to combine salient targets and abundant texture details into a single fused image. However, under harsh conditions such as dense smoke, fog, and intense light, external interference can easily leak into the fused image, seriously degrading its quality. To this end, we propose VCAFusion, an infrared and visible image fusion network with visual perception and cross-scale attention modules that integrates critical information from the source images more effectively under harsh conditions. Specifically, since the human eye can still identify key information under adverse conditions, we design a visual perception module (VPM) that guides information integration from the perspective of human visual perception. In addition, we propose a cross-scale attention module (CSAM) based on shifted-window cross-attention, which captures long-range correlations between adjacent-scale features and thus provides more accurate information for image restoration. Experiments on a variety of datasets show that VCAFusion adaptively retains image information and further improves image generation under harsh conditions.
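
For illustration only, the sketch below shows one way cross-attention between adjacent-scale feature maps can be realized in PyTorch. The paper's CSAM uses shifted-window cross-attention; the window partitioning is omitted here for brevity, and every name in the sketch (CrossScaleAttention, dim, num_heads) is an assumption rather than the authors' implementation. The finer scale supplies the queries, and the upsampled adjacent coarser scale supplies the keys and values, so each fine-scale location can attend to context aggregated at the coarser scale.

    # Minimal cross-scale attention sketch (assumed names; NOT the authors' code).
    # Queries come from the fine scale; keys/values from the upsampled coarse scale.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CrossScaleAttention(nn.Module):
        def __init__(self, dim: int, num_heads: int = 4):
            super().__init__()
            self.num_heads = num_heads
            self.scale = (dim // num_heads) ** -0.5
            self.to_q = nn.Linear(dim, dim)       # queries from the fine scale
            self.to_kv = nn.Linear(dim, dim * 2)  # keys/values from the coarse scale
            self.proj = nn.Linear(dim, dim)

        def forward(self, fine, coarse):
            # fine:   (B, C, H, W)  higher-resolution feature map
            # coarse: (B, C, h, w)  adjacent lower-resolution feature map
            B, C, H, W = fine.shape
            # Upsample the coarse map so both scales share one token grid.
            coarse_up = F.interpolate(coarse, size=(H, W), mode="bilinear",
                                      align_corners=False)
            q = self.to_q(fine.flatten(2).transpose(1, 2))           # (B, HW, C)
            k, v = self.to_kv(coarse_up.flatten(2).transpose(1, 2)).chunk(2, dim=-1)

            def heads(t):  # (B, HW, C) -> (B, num_heads, HW, C // num_heads)
                return t.view(B, -1, self.num_heads, C // self.num_heads).transpose(1, 2)

            q, k, v = heads(q), heads(k), heads(v)
            attn = (q @ k.transpose(-2, -1)) * self.scale  # fine tokens attend to coarse
            out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, H * W, C)
            return self.proj(out).transpose(1, 2).reshape(B, C, H, W)

    # Example: 32x32 fine features attend to 16x16 coarse features.
    fine, coarse = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 16, 16)
    print(CrossScaleAttention(dim=64)(fine, coarse).shape)  # torch.Size([1, 64, 32, 32])

Upsampling the coarse map to the fine grid is only one way to align the two token sets; restricting the attention to shifted local windows, as Swin-style fusion methods do, would bound the quadratic cost of the attention matrix on large feature maps.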

Published In

Digital Signal Processing, Volume 151, Issue C (August 2024), 491 pages

Publisher

Academic Press, Inc., United States

Publication History

Published: 18 July 2024

Author Tags

  1. Image fusion
  2. Harsh conditions
  3. Visual perception
  4. Cross-scale attention

Qualifiers

  • Research-article
