
BG-Net: boundary-guidance network for object consistency maintaining in semantic segmentation

Published: 15 February 2023

Abstract

Semantic segmentation suffers from boundary shift and shape deformation because overall guidance information is neglected. Motivated by the observation that object boundaries provide a stronger representation of the overall information of target objects, we propose a simple yet effective network, the boundary-guidance network (BG-Net), for maintaining object consistency in semantic segmentation. In our work, we explore the pixels both near and far from the boundary, characterizing each pixel by its pixel-boundary association. The boundary feature is integrated into the segmentation feature to mitigate the boundary shift and shape deformation problems. We explicitly supervise the angle and distance of each pixel with respect to its nearest object boundary, so that the association can be learned through geometric modelling. Meanwhile, a low-level feature emphasized up-sampling (LFEU) module is designed to supplement the detail representation in high-level features without direct interference. Finally, we evaluate our method on the Cityscapes and CamVid datasets, and the experimental results demonstrate the superiority of BG-Net.
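To make the supervised geometric targets more concrete, the sketch below shows one plausible way to derive them from a binary boundary mask: for every pixel, the Euclidean distance to the nearest boundary pixel and the angle of the vector pointing toward it. The function name `boundary_targets`, the use of SciPy's distance transform, and the angle convention are illustrative assumptions, not the authors' exact formulation.

```python
# Hypothetical sketch of distance/angle supervision targets built from a
# binary boundary mask (assumed formulation, not the paper's exact recipe).
import numpy as np
from scipy.ndimage import distance_transform_edt

def boundary_targets(boundary_mask: np.ndarray):
    """boundary_mask: (H, W) bool array, True on object-boundary pixels.

    Returns, per pixel, the Euclidean distance to the nearest boundary pixel
    and the angle (radians) of the vector pointing toward that pixel.
    """
    # distance_transform_edt measures distance to the nearest zero element,
    # so invert the mask; return_indices gives that element's coordinates.
    dist, (near_r, near_c) = distance_transform_edt(
        ~boundary_mask, return_indices=True
    )
    rows, cols = np.indices(boundary_mask.shape)
    dy = near_r - rows          # vector components toward the nearest boundary
    dx = near_c - cols
    angle = np.arctan2(dy, dx)  # direction of that vector, in (-pi, pi]
    return dist.astype(np.float32), angle.astype(np.float32)

# Example: a 5x5 image with a single vertical boundary in the middle column.
mask = np.zeros((5, 5), dtype=bool)
mask[:, 2] = True
dist, angle = boundary_targets(mask)
print(dist[0])  # [2. 1. 0. 1. 2.]
```

A per-pixel regression loss on these two maps could then serve as the explicit angle-and-distance supervision described in the abstract.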


Published In

The Visual Computer: International Journal of Computer Graphics, Volume 40, Issue 1
Jan 2024, 439 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Published: 15 February 2023
Accepted: 05 January 2023

Author Tags

1. Semantic segmentation
2. Boundary-guidance
3. Low-level feature
4. Object consistency maintaining

Qualifiers

• Research-article
