
Dual cross-enhancement network for highly accurate dichotomous image segmentation

Published: 18 November 2024

Abstract

Existing image segmentation tasks mainly focus on segmenting objects with specific characteristics, such as salient, camouflaged, and meticulous objects. However, research on highly accurate Dichotomous Image Segmentation (DIS), which combines these tasks, has only recently begun and still faces problems such as insufficient information interaction between layers and incomplete integration of high-level semantic information with low-level detail features. In this paper, a new dual cross-enhancement network (DCENet) for highly accurate DIS is proposed, which mainly consists of two new modules: a cross-scaling guidance (CSG) module and a semantic cross-transplantation (SCT) module. Specifically, the CSG module adopts an adjacent-layer cross-scaling guidance scheme that allows the multi-scale features extracted from adjacent layers to interact efficiently, while the SCT module uses dual-branch features to complement each other. In this transplantation, the high-level semantic information of the low-resolution branch guides the low-level detail features of the high-resolution branch, so that the features of the two resolution branches are effectively fused. Finally, experimental results on the challenging DIS5K benchmark dataset show that the proposed network outperforms 9 state-of-the-art (SOTA) networks on 5 widely used evaluation metrics. In addition, ablation experiments demonstrate the effectiveness of the cross-scaling guidance and semantic cross-transplantation modules.
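As an illustration of the adjacent-layer cross-scaling guidance idea described above, the following PyTorch-style sketch rescales each of two adjacent encoder features to the other's resolution and fuses them, mimicking the "zoom in and zoom out" behavior. This is a minimal sketch under our own assumptions: the class name, channel arguments, and fusion convolutions are hypothetical and do not reproduce the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScalingGuidanceSketch(nn.Module):
    # Illustrative only: adjacent-layer features guide each other after
    # being rescaled ("zoomed") to the other layer's resolution.
    def __init__(self, ch_low, ch_high, ch_out):
        super().__init__()
        self.fuse_low = nn.Sequential(            # refines the high-resolution (detail) feature
            nn.Conv2d(ch_low + ch_high, ch_out, 3, padding=1),
            nn.BatchNorm2d(ch_out), nn.ReLU(inplace=True))
        self.fuse_high = nn.Sequential(           # refines the low-resolution (semantic) feature
            nn.Conv2d(ch_low + ch_high, ch_out, 3, padding=1),
            nn.BatchNorm2d(ch_out), nn.ReLU(inplace=True))

    def forward(self, f_low, f_high):
        # f_low: low-level, high-resolution feature
        # f_high: adjacent high-level, low-resolution feature
        low_down = F.interpolate(f_low, size=f_high.shape[-2:],
                                 mode='bilinear', align_corners=False)   # "zoom out"
        high_up = F.interpolate(f_high, size=f_low.shape[-2:],
                                mode='bilinear', align_corners=False)    # "zoom in"
        g_low = self.fuse_low(torch.cat([f_low, high_up], dim=1))        # details guided by semantics
        g_high = self.fuse_high(torch.cat([low_down, f_high], dim=1))    # semantics guided by details
        return g_low, g_high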

Highlights

We design a novel model for dichotomous image segmentation (DIS) with two main modules: cross-scaling guidance (CSG) and semantic cross-transplantation (SCT). Extensive experiments on the 4 challenging tasks of the DIS5K public benchmark show the effectiveness of these modules.
To guide multi-scale features between adjacent layers, we introduce the cross-scaling guidance (CSG) module, which imitates how humans zoom in and out when inspecting a high-resolution image, so that adjacent layers provide mutual scaling guidance.
To effectively fuse features from different resolutions, we propose the semantic cross-transplantation (SCT) module, which lets the dual-branch features complement each other and transplants rich semantic information from the low-resolution branch into the high-resolution branch; a minimal sketch of this idea follows this list.
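The semantic cross-transplantation idea in the last highlight can likewise be sketched as follows: high-level semantics from the low-resolution branch are upsampled into a spatial gate that modulates the detail features of the high-resolution branch. This is a hedged sketch assuming a simple gating design; the class name, the 1x1 gating convolution, and the residual refinement are hypothetical rather than the authors' actual SCT module.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticCrossTransplantationSketch(nn.Module):
    # Illustrative only: semantics of the low-resolution branch are
    # "transplanted" onto the details of the high-resolution branch.
    def __init__(self, ch_sem, ch_det):
        super().__init__()
        self.to_gate = nn.Sequential(nn.Conv2d(ch_sem, ch_det, 1),
                                     nn.Sigmoid())              # semantics -> spatial gate
        self.refine = nn.Sequential(
            nn.Conv2d(ch_det, ch_det, 3, padding=1),
            nn.BatchNorm2d(ch_det), nn.ReLU(inplace=True))

    def forward(self, f_detail, f_semantic):
        # f_detail: low-level detail feature (high-resolution branch)
        # f_semantic: high-level semantic feature (low-resolution branch)
        sem_up = F.interpolate(f_semantic, size=f_detail.shape[-2:],
                               mode='bilinear', align_corners=False)
        gate = self.to_gate(sem_up)             # where the semantics indicate the object
        fused = f_detail * gate + f_detail      # emphasize guided details, keep a residual path
        return self.refine(fused)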



Information

Published In

Computer Vision and Image Understanding, Volume 248, Issue C, November 2024, 456 pages

Publisher

Elsevier Science Inc., United States

Publication History

Published: 18 November 2024

Author Tags

1. 41A05
2. 41A10
3. 65D05
4. 65D17
5. Deep Learning
6. Dichotomous Image Segmentation
7. Cross-scaling Guidance
8. Semantic Cross-Transplantation

Qualifiers

• Research-article