Wavelet-Driven Multi-Band Feature Fusion for RGB-T Salient Object Detection
Figure 1. The design concept of our method.
Figure 2. The overall architecture of our proposed method, where UP denotes upsampling.
Figure 3. The detailed architecture of the CCM.
Figure 4. The structure of the FSW.
Figure 5. The datasets used in this experiment: (a) VT821, (b) VT1000, (c) VT5000.
Figure 6. Quantitative comparison of our method with other state-of-the-art methods: (a) PR curves; (b) F-measure curves.
Figure 7. Qualitative comparison of our model with eleven recent state-of-the-art models.
Figure 8. Visualization of the ablation study, where "w/o" denotes the absence of the corresponding module.
Figure 9. Visualization of some typical failure cases of our method.
Abstract
1. Introduction
- We propose a novel RGB-T SOD method consisting of three components: a feature encoder, a channel-wise criss-cross module (CCM), and a feature selection wavelet transformer (FSW).
- We propose a channel-wise criss-cross module (CCM) that performs channel decomposition and parallel computation on features from the RGB and thermal modalities, preventing the information loss caused by direct fusion. The module employs an attention mechanism to adaptively fuse complementary information from both modalities and, through dynamic weight allocation, strengthens the capture of both global contextual information and fine-grained features, yielding more comprehensive and robust feature fusion (a minimal sketch of the underlying attention pattern follows this list).
- We propose a contextual feature selection wavelet transformer (FSW) module that uses the wavelet transform to decompose fused features into high-frequency and low-frequency components. The high-frequency features capture fine-grained edge details for accurate target localization, while the low-frequency features provide background and structural context. This design keeps the model sensitive to object edges while leveraging global context to improve robustness, particularly in complex scenes. By integrating both frequency bands during feature aggregation, the FSW module improves segmentation accuracy and object localization (a frequency-decomposition sketch follows this list).
- Extensive experiments show that our method achieves outstanding performance on three benchmark datasets (VT821, VT1000, and VT5000).
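The authors' exact CCM design is given in Figure 3 and Section 3.2. To make the idea above concrete, the following minimal PyTorch sketch shows a simplified criss-cross attention (in the spirit of CCNet by Huang et al.) applied per channel group of the RGB and thermal features. Everything here is an illustrative assumption rather than the paper's implementation: the class names, the separate row/column softmaxes (the original criss-cross attention normalizes over the row and column jointly), and the final 1×1 fusion convolution.

```python
import torch
import torch.nn as nn


class CrissCrossAttention(nn.Module):
    """Simplified criss-cross attention: every position attends to the
    positions in its own row and its own column (two separate softmaxes)."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.query(x), self.key(x), self.value(x)
        d = q.shape[1]
        # Row attention: each of the b*h rows is a sequence of length w.
        q_r = q.permute(0, 2, 3, 1).reshape(b * h, w, d)
        k_r = k.permute(0, 2, 3, 1).reshape(b * h, w, d)
        v_r = v.permute(0, 2, 3, 1).reshape(b * h, w, c)
        a_r = torch.softmax(q_r @ k_r.transpose(1, 2) / d ** 0.5, dim=-1)
        out_r = (a_r @ v_r).reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Column attention: each of the b*w columns is a sequence of length h.
        q_c = q.permute(0, 3, 2, 1).reshape(b * w, h, d)
        k_c = k.permute(0, 3, 2, 1).reshape(b * w, h, d)
        v_c = v.permute(0, 3, 2, 1).reshape(b * w, h, c)
        a_c = torch.softmax(q_c @ k_c.transpose(1, 2) / d ** 0.5, dim=-1)
        out_c = (a_c @ v_c).reshape(b, w, h, c).permute(0, 3, 2, 1)
        return x + self.gamma * (out_r + out_c)


class ChannelwiseCrissCrossFusion(nn.Module):
    """Hypothetical CCM-style fusion: split each modality's features into
    channel groups, run criss-cross attention on every group in parallel,
    then merge the two refined modalities with a 1x1 convolution."""

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % groups == 0 and (channels // groups) >= 8
        self.groups = groups
        self.cca = nn.ModuleList(
            CrissCrossAttention(channels // groups) for _ in range(groups))
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f_rgb: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
        def refine(f):
            chunks = torch.chunk(f, self.groups, dim=1)
            return torch.cat([m(ch) for m, ch in zip(self.cca, chunks)], dim=1)
        return self.fuse(torch.cat([refine(f_rgb), refine(f_t)], dim=1))
```

For example, `ChannelwiseCrissCrossFusion(64)(rgb_feat, t_feat)` fuses two 64-channel feature maps of equal spatial size; restricting attention to rows and columns keeps the cost far below full spatial self-attention.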
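Likewise, the ablation in Section 4.5 indicates the FSW builds on the dual-tree complex wavelet transform (DT-CWT, per Goyal and Meenpal). The sketch below substitutes a single-level Haar DWT so it stays dependency-free; the band-refinement convolutions and the softmax selection gate are our assumptions about "feature selection", not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def haar_dwt(x: torch.Tensor):
    """Single-level 2D Haar decomposition into a low-frequency band (LL)
    and three high-frequency bands (LH, HL, HH). Assumes even H and W.
    The paper uses the DT-CWT; Haar keeps this sketch self-contained."""
    a = x[..., 0::2, 0::2]  # top-left of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2
    lh = (a - b + c - d) / 2
    hl = (a + b - c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, (lh, hl, hh)


class FeatureSelectionWavelet(nn.Module):
    """Hypothetical FSW sketch: refine the low-frequency (context) and
    high-frequency (edge) bands separately, weight them with a learned
    softmax gate, and return an upsampled residual."""

    def __init__(self, channels: int):
        super().__init__()
        self.low = nn.Conv2d(channels, channels, 3, padding=1)       # context branch
        self.high = nn.Conv2d(3 * channels, channels, 3, padding=1)  # edge branch
        self.select = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, 2, 1),  # one scalar weight per branch
        )

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        ll, (lh, hl, hh) = haar_dwt(f)
        low = self.low(ll)
        high = self.high(torch.cat([lh, hl, hh], dim=1))
        w = torch.softmax(self.select(torch.cat([low, high], dim=1)), dim=1)
        out = w[:, 0:1] * low + w[:, 1:2] * high
        # Restore the input resolution and keep a residual path.
        up = F.interpolate(out, size=f.shape[-2:], mode="bilinear",
                           align_corners=False)
        return f + up
```

The gate lets the network lean on the low-frequency band in cluttered scenes and on the high-frequency band near object boundaries, which is the trade-off the contribution above describes.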
2. Related Work
2.1. RGB Salient Object Detection
2.2. RGB-D Salient Object Detection
2.3. RGB-T Salient Object Detection
3. Method
3.1. Architecture Overview
3.2. Channel-Wise Criss-Cross Module
3.3. Feature Selection Wavelet Transformer
3.4. Loss Function
4. Experiments and Results
4.1. Datasets
4.2. Implementation Details
4.3. Evaluation Metrics
4.4. Comparisons with State-of-the-Art Methods
4.4.1. Quantitative Comparisons
4.4.2. Qualitative Comparison
4.5. Ablation Study
4.6. Failure Cases
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Deng, B.; Liu, D.; Cao, Y.; Liu, H.; Yan, Z.; Chen, H. CFRNet: Cross-Attention-Based Fusion and Refinement Network for Enhanced RGB-T Salient Object Detection. Sensors 2024, 24, 7146.
- Song, K.; Xue, X.; Wen, H.; Ji, Y.; Yan, Y.; Meng, Q. Misaligned Visible-Thermal Object Detection: A Drone-based Benchmark and Baseline. IEEE Trans. Intell. Veh. 2024, 1–12, in press.
- Ramm, R.; de Dios Cruz, P.; Heist, S.; Kühmstedt, P.; Notni, G. Fusion of Multimodal Imaging and 3D Digitization Using Photogrammetry. Sensors 2024, 24, 2290.
- Qureshi, I.; Yan, J.; Abbas, Q.; Shaheed, K.; Riaz, A.B.; Wahid, A.; Khan, M.W.J.; Szczuko, P. Medical image segmentation using deep semantic-based methods: A review of techniques, applications and emerging trends. Inf. Fusion 2023, 90, 316–352.
- Song, K.; Zhao, Y.; Huang, L.; Yan, Y.; Meng, Q. RGB-T image analysis technology and application: A survey. Eng. Appl. Artif. Intell. 2023, 120, 105919.
- Wang, G.; Li, C.; Ma, Y.; Zheng, A.; Tang, J.; Luo, B. RGB-T Saliency Detection Benchmark: Dataset, Baselines, Analysis and a Novel Approach. In Proceedings of the Image and Graphics Technologies and Applications; Wang, Y., Jiang, Z., Peng, Y., Eds.; Springer: Singapore, 2018; pp. 359–369.
- Tu, Z.; Xia, T.; Li, C.; Lu, Y.; Tang, J. M3S-NIR: Multi-modal Multi-scale Noise-Insensitive Ranking for RGB-T Saliency Detection. In Proceedings of the 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), San Jose, CA, USA, 28–30 March 2019; pp. 141–146.
- Huang, L.; Song, K.; Gong, A.; Liu, C.; Yan, Y. RGB-T Saliency Detection via Low-Rank Tensor Learning and Unified Collaborative Ranking. IEEE Signal Process. Lett. 2020, 27, 1585–1589.
- Tu, Z.; Li, Z.; Li, C.; Lang, Y.; Tang, J. Multi-Interactive Dual-Decoder for RGB-Thermal Salient Object Detection. IEEE Trans. Image Process. 2021, 30, 5678–5691.
- Bi, H.; Wu, R.; Liu, Z.; Zhang, J.; Zhang, C.; Xiang, T.Z.; Wang, X. PSNet: Parallel symmetric network for RGB-T salient object detection. Neurocomputing 2022, 511, 410–425.
- Zhang, Q.; Xi, R.; Xiao, T.; Huang, N.; Luo, Y. Enabling modality interactions for RGB-T salient object detection. Comput. Vis. Image Underst. 2022, 222, 103514.
- Cong, R.; Zhang, K.; Zhang, C.; Zheng, F.; Zhao, Y.; Huang, Q.; Kwong, S. Does Thermal Really Always Matter for RGB-T Salient Object Detection? IEEE Trans. Multimed. 2023, 25, 6971–6982.
- Huo, F.; Zhu, X.; Zhang, Q.; Liu, Z.; Yu, W. Real-Time One-Stream Semantic-Guided Refinement Network for RGB-Thermal Salient Object Detection. IEEE Trans. Instrum. Meas. 2022, 71, 1–12.
- Zhou, W.; Guo, Q.; Lei, J.; Yu, L.; Hwang, J.N. ECFFNet: Effective and Consistent Feature Fusion Network for RGB-T Salient Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 1224–1235.
- Song, K.; Wen, H.; Xue, X.; Huang, L.; Ji, Y.; Yan, Y. Modality Registration and Object Search Framework for UAV-Based Unregistered RGB-T Image Salient Object Detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–15.
- Zhou, W.; Sun, F.; Jiang, Q.; Cong, R.; Hwang, J.N. WaveNet: Wavelet Network with Knowledge Distillation for RGB-T Salient Object Detection. IEEE Trans. Image Process. 2023, 32, 3027–3039.
- Yue, H.; Guo, J.; Yin, X.; Zhang, Y.; Zheng, S.; Zhang, Z.; Li, C. Salient object detection in low-light images via functional optimization-inspired feature polishing. Knowl.-Based Syst. 2022, 257, 109938.
- Siris, A.; Jiao, J.; Tam, G.K.; Xie, X.; Lau, R.W. Scene Context-Aware Salient Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 4156–4166.
- Wang, Q.; Liu, Y.; Xiong, Z.; Yuan, Y. Hybrid Feature Aligned Network for Salient Object Detection in Optical Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15.
- Wu, Z.; Li, S.; Chen, C.; Qin, H.; Hao, A. Salient Object Detection via Dynamic Scale Routing. IEEE Trans. Image Process. 2022, 31, 6649–6663.
- Wu, Y.H.; Liu, Y.; Zhang, L.; Cheng, M.M.; Ren, B. EDN: Salient Object Detection via Extremely-Downsampled Network. IEEE Trans. Image Process. 2022, 31, 3125–3136.
- Li, J.; Qiao, S.; Zhao, Z.; Xie, C.; Chen, X.; Xia, C. Rethinking Lightweight Salient Object Detection via Network Depth-Width Tradeoff. IEEE Trans. Image Process. 2023, 32, 5664–5677.
- Jin, W.D.; Xu, J.; Han, Q.; Zhang, Y.; Cheng, M.M. CDNet: Complementary Depth Network for RGB-D Salient Object Detection. IEEE Trans. Image Process. 2021, 30, 3376–3390.
- Chen, Q.; Zhang, Z.; Lu, Y.; Fu, K.; Zhao, Q. 3-D Convolutional Neural Networks for RGB-D Salient Object Detection and Beyond. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 4309–4323.
- Song, M.; Song, W.; Yang, G.; Chen, C. Improving RGB-D Salient Object Detection via Modality-Aware Decoder. IEEE Trans. Image Process. 2022, 31, 6124–6138.
- Sun, F.; Ren, P.; Yin, B.; Wang, F.; Li, H. CATNet: A Cascaded and Aggregated Transformer Network for RGB-D Salient Object Detection. IEEE Trans. Multimed. 2024, 26, 2249–2262.
- Wu, Z.; Allibert, G.; Meriaudeau, F.; Ma, C.; Demonceaux, C. HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness. IEEE Trans. Image Process. 2023, 32, 2160–2173.
- Zhang, Q.; Qin, Q.; Yang, Y.; Jiao, Q.; Han, J. Feature Calibrating and Fusing Network for RGB-D Salient Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 1493–1507.
- Gao, W.; Liao, G.; Ma, S.; Li, G.; Liang, Y.; Lin, W. Unified Information Fusion Network for Multi-Modal RGB-D and RGB-T Salient Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 2091–2106.
- Tu, Z.; Li, Z.; Li, C.; Tang, J. Weakly Alignment-Free RGBT Salient Object Detection with Deep Correlation Network. IEEE Trans. Image Process. 2022, 31, 3752–3764.
- Song, K.; Huang, L.; Gong, A.; Yan, Y. Multiple Graph Affinity Interactive Network and a Variable Illumination Dataset for RGBT Image Salient Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 3104–3118.
- Wang, H.; Song, K.; Huang, L.; Wen, H.; Yan, Y. Thermal images-aware guided early fusion network for cross-illumination RGB-T salient object detection. Eng. Appl. Artif. Intell. 2023, 118, 105640.
- Zhou, H.; Tian, C.; Zhang, Z.; Li, C.; Ding, Y.; Xie, Y.; Li, Z. Position-Aware Relation Learning for RGB-Thermal Salient Object Detection. IEEE Trans. Image Process. 2023, 32, 2593–2607.
- Huang, Z.; Wang, X.; Wei, Y.; Huang, L.; Shi, H.; Liu, W.; Huang, T.S. CCNet: Criss-Cross Attention for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 6896–6908.
- Goyal, A.; Meenpal, T. Patch-Based Dual-Tree Complex Wavelet Transform for Kinship Recognition. IEEE Trans. Image Process. 2021, 30, 191–206.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327.
- Milletari, F.; Navab, N.; Ahmadi, S.-A. V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; pp. 565–571.
- de Boer, P.T.; Kroese, D.P.; Mannor, S.; Rubinstein, R.Y. A Tutorial on the Cross-Entropy Method. Ann. Oper. Res. 2005, 134, 19–67.
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 658–666.
- Tu, Z.; Xia, T.; Li, C.; Wang, X.; Ma, Y.; Tang, J. RGB-T Image Saliency Detection via Collaborative Graph Learning. IEEE Trans. Multimed. 2020, 22, 160–173.
- Tu, Z.; Ma, Y.; Li, Z.; Li, C.; Xu, J.; Liu, Y. RGBT Salient Object Detection: A Large-Scale Dataset and Benchmark. IEEE Trans. Multimed. 2023, 25, 4163–4176.
- Fan, D.P.; Cheng, M.M.; Liu, Y.; Li, T.; Borji, A. Structure-measure: A new way to evaluate foreground maps. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4548–4557.
- Fan, D.P.; Ji, G.P.; Qin, X.; Cheng, M.M. Cognitive vision inspired object segmentation metric and loss function. Sci. Sin. Informationis 2021, 6, 5.
- Achanta, R.; Hemami, S.; Estrada, F.; Susstrunk, S. Frequency-tuned salient region detection. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 1597–1604.
- Perazzi, F.; Krähenbühl, P.; Pritch, Y.; Hornung, A. Saliency filters: Contrast based filtering for salient region detection. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 733–740.
- Piao, Y.; Ji, W.; Li, J.; Zhang, M.; Lu, H. Depth-induced multi-scale recurrent attention network for saliency detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 7254–7263.
- Zhao, J.X.; Liu, J.J.; Fan, D.P.; Cao, Y.; Yang, J.; Cheng, M.M. EGNet: Edge guidance network for salient object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8779–8788.
- Qin, X.; Fan, D.P.; Huang, C.; Diagne, C.; Zhang, Z.; Sant'Anna, A.C.; Suarez, A.; Jagersand, M.; Shao, L. Boundary-aware segmentation network for mobile and web applications. arXiv 2021, arXiv:2101.04704.
- Liu, J.J.; Hou, Q.; Cheng, M.M.; Feng, J.; Jiang, J. A simple pooling-based design for real-time salient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3917–3926.
- Deng, Z.; Hu, X.; Zhu, L.; Xu, X.; Qin, J.; Han, G.; Heng, P.A. R3Net: Recurrent residual refinement network for saliency detection. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; AAAI Press: Menlo Park, CA, USA, 2018; pp. 684–690.
- Zhao, T.; Wu, X. Pyramid feature attention network for saliency detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3085–3094.
- Wu, Z.; Su, L.; Huang, Q. Cascaded partial decoder for fast and accurate salient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3907–3916.
- Liu, N.; Zhang, N.; Han, J. Learning Selective Self-Mutual Attention for RGB-D Saliency Detection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 13753–13762.
- Wen, H.; Song, K.; Huang, L.; Wang, H.; Yan, Y. Cross-modality salient object detection network with universality and anti-interference. Knowl.-Based Syst. 2023, 264, 110322.
- Liu, Z.; Huang, X.; Zhang, G.; Fang, X.; Wang, L.; Tang, B. Scribble-supervised RGB-T salient object detection. In Proceedings of the 2023 IEEE International Conference on Multimedia and Expo (ICME), Brisbane, Australia, 10–14 July 2023; pp. 2369–2374.
- Wang, J.; Song, K.; Bao, Y.; Huang, L.; Yan, Y. CGFNet: Cross-guided fusion network for RGB-T salient object detection. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 2949–2961.
- Xu, C.; Li, Q.; Zhou, Q.; Jiang, X.; Yu, D.; Zhou, Y. Asymmetric cross-modal activation network for RGB-T salient object detection. Knowl.-Based Syst. 2022, 258, 110047.
- Pang, Y.; Zhao, X.; Zhang, L.; Lu, H. CAVER: Cross-Modal View-Mixed Transformer for Bi-Modal Salient Object Detection. IEEE Trans. Image Process. 2023, 32, 892–904.
- Zhou, W.; Zhu, Y.; Lei, J.; Yang, R.; Yu, L. LSNet: Lightweight spatial boosting network for detecting salient objects in RGB-thermal images. IEEE Trans. Image Process. 2023, 32, 1329–1340.
Methods | VT821: Sm ↑ | MAE ↓ | Fm ↑ | Em ↑ | VT1000: Sm ↑ | MAE ↓ | Fm ↑ | Em ↑ | VT5000: Sm ↑ | MAE ↓ | Fm ↑ | Em ↑ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
M3S-NIR | 0.723 | 0.140 | 0.738 | 0.837 | 0.726 | 0.145 | 0.735 | 0.828 | 0.652 | 0.168 | 0.596 | 0.760 |
MTMR | 0.725 | 0.108 | 0.690 | 0.812 | 0.706 | 0.119 | 0.715 | 0.836 | 0.680 | 0.114 | 0.613 | 0.792 |
SGDL | 0.765 | 0.085 | 0.735 | 0.840 | 0.787 | 0.090 | 0.770 | 0.859 | 0.751 | 0.089 | 0.695 | 0.829 |
S2MA | 0.829 | 0.081 | 0.779 | 0.855 | 0.921 | 0.029 | 0.913 | 0.952 | 0.855 | 0.055 | 0.812 | 0.895 |
PFA | 0.761 | 0.096 | 0.711 | 0.854 | 0.813 | 0.078 | 0.805 | 0.888 | 0.748 | 0.099 | 0.719 | 0.857 |
DMRA | 0.666 | 0.216 | 0.661 | 0.766 | 0.784 | 0.124 | 0.798 | 0.863 | 0.659 | 0.184 | 0.601 | 0.758 |
LSNet | 0.879 | 0.033 | 0.845 | 0.921 | 0.926 | 0.023 | 0.922 | 0.963 | 0.877 | 0.037 | 0.850 | 0.924 |
BASNet | 0.823 | 0.067 | 0.763 | 0.858 | 0.909 | 0.030 | 0.901 | 0.944 | 0.839 | 0.054 | 0.791 | 0.884 |
ADF | 0.810 | 0.077 | 0.752 | 0.839 | 0.910 | 0.034 | 0.908 | 0.950 | 0.864 | 0.048 | 0.837 | 0.911 |
CPD | 0.818 | 0.079 | 0.758 | 0.862 | 0.907 | 0.031 | 0.897 | 0.947 | 0.855 | 0.046 | 0.818 | 0.905 |
DCNet | 0.877 | 0.033 | 0.851 | 0.920 | 0.923 | 0.021 | 0.919 | 0.961 | 0.872 | 0.035 | 0.853 | 0.925 |
EGNet | 0.830 | 0.063 | 0.756 | 0.857 | 0.910 | 0.033 | 0.898 | 0.945 | 0.853 | 0.050 | 0.808 | 0.893 |
MIDD | 0.871 | 0.045 | 0.851 | 0.918 | 0.907 | 0.029 | 0.906 | 0.952 | 0.856 | 0.046 | 0.839 | 0.913 |
PoolNet | 0.788 | 0.082 | 0.707 | 0.842 | 0.849 | 0.063 | 0.826 | 0.904 | 0.788 | 0.080 | 0.727 | 0.852 |
R3Net | 0.782 | 0.081 | 0.711 | 0.819 | 0.886 | 0.037 | 0.876 | 0.939 | 0.812 | 0.059 | 0.753 | 0.863 |
MGAI | 0.891 | 0.031 | 0.873 | 0.935 | 0.929 | 0.021 | 0.926 | 0.966 | 0.883 | 0.034 | 0.862 | 0.931 |
CGFNet | 0.880 | 0.038 | 0.866 | 0.920 | 0.923 | 0.023 | 0.923 | 0.959 | 0.883 | 0.035 | 0.869 | 0.927 |
SSOD | 0.895 | 0.027 | 0.878 | 0.942 | 0.925 | 0.020 | 0.922 | 0.964 | 0.877 | 0.033 | 0.859 | 0.933 |
ACMANet | 0.883 | 0.035 | 0.851 | 0.926 | 0.927 | 0.021 | 0.923 | 0.964 | 0.887 | 0.033 | 0.871 | 0.936 |
TNet | 0.899 | 0.030 | 0.888 | 0.938 | 0.929 | 0.021 | 0.930 | 0.966 | 0.895 | 0.033 | 0.881 | 0.937 |
GRNet | 0.893 | 0.031 | 0.866 | 0.933 | 0.931 | 0.018 | 0.927 | 0.966 | 0.888 | 0.034 | 0.870 | 0.931 |
CAVER | 0.898 | 0.026 | 0.877 | 0.934 | 0.938 | 0.016 | 0.939 | 0.973 | 0.900 | 0.028 | 0.882 | 0.944 |
OURS | 0.910 | 0.025 | 0.892 | 0.943 | 0.942 | 0.015 | 0.946 | 0.979 | 0.917 | 0.024 | 0.909 | 0.958 |
Settings | VT821: Sm ↑ | MAE ↓ | Fm ↑ | Em ↑ | VT1000: Sm ↑ | MAE ↓ | Fm ↑ | Em ↑ | VT5000: Sm ↑ | MAE ↓ | Fm ↑ | Em ↑ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
OnlyRGB | 0.885 | 0.034 | 0.845 | 0.912 | 0.939 | 0.016 | 0.939 | 0.974 | 0.903 | 0.027 | 0.890 | 0.947 |
OnlyT | 0.846 | 0.043 | 0.814 | 0.907 | 0.908 | 0.027 | 0.903 | 0.956 | 0.871 | 0.038 | 0.844 | 0.927 |
OURS | 0.910 | 0.025 | 0.892 | 0.943 | 0.942 | 0.015 | 0.946 | 0.979 | 0.917 | 0.024 | 0.909 | 0.958 |
Setting Type | Configuration | VT821: Sm ↑ | MAE ↓ | Fm ↑ | VT1000: Sm ↑ | MAE ↓ | Fm ↑ | VT5000: Sm ↑ | MAE ↓ | Fm ↑ |
---|---|---|---|---|---|---|---|---|---|---|
Module | w/o CCM | 0.883 | 0.030 | 0.863 | 0.925 | 0.024 | 0.917 | 0.886 | 0.034 | 0.876 |
 | w/o FSW | 0.875 | 0.034 | 0.859 | 0.922 | 0.025 | 0.913 | 0.879 | 0.037 | 0.868 |
 | w/o DT-CWT | 0.905 | 0.028 | 0.885 | 0.935 | 0.018 | 0.945 | 0.910 | 0.027 | 0.897 |
 | OURS | 0.910 | 0.025 | 0.892 | 0.942 | 0.015 | 0.946 | 0.917 | 0.024 | 0.909 |
Loss | w/o BCE | 0.907 | 0.027 | 0.889 | 0.941 | 0.016 | 0.944 | 0.911 | 0.027 | 0.906 |
 | w/o IoU | 0.908 | 0.026 | 0.891 | 0.939 | 0.016 | 0.942 | 0.910 | 0.026 | 0.899 |
 | BCE+IoU | 0.910 | 0.025 | 0.892 | 0.942 | 0.015 | 0.946 | 0.917 | 0.024 | 0.909 |
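The loss ablation above indicates the final objective combines binary cross-entropy (BCE) with an IoU loss. As a reference point only, here is a minimal PyTorch sketch of such a combined objective, assuming raw logits, a binary ground-truth mask of shape (B, 1, H, W), and equal weighting of the two terms (the paper's exact formulation is defined in Section 3.4):

```python
import torch
import torch.nn.functional as F


def bce_iou_loss(logits: torch.Tensor, target: torch.Tensor,
                 eps: float = 1e-6) -> torch.Tensor:
    """Combined BCE + IoU loss over a predicted saliency map.

    `logits` are raw network outputs; `target` is the binary ground-truth
    mask. Equal weighting of the two terms is our assumption.
    """
    bce = F.binary_cross_entropy_with_logits(logits, target)
    pred = torch.sigmoid(logits)
    # Soft IoU computed per image over the spatial dimensions.
    inter = (pred * target).sum(dim=(2, 3))
    union = (pred + target - pred * target).sum(dim=(2, 3))
    iou = 1.0 - (inter + eps) / (union + eps)
    return bce + iou.mean()
```

BCE supervises each pixel independently, while the IoU term scores the predicted region as a whole, which is why removing either one degrades the results in the table above.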
Citation: Zhao, J.; Wen, X.; He, Y.; Yang, X.; Song, K. Wavelet-Driven Multi-Band Feature Fusion for RGB-T Salient Object Detection. Sensors 2024, 24, 8159. https://doi.org/10.3390/s24248159