
BG-Net: boundary-guidance network for object consistency maintaining in semantic segmentation

Published: 15 February 2023

Abstract

Semantic segmentation suffers from boundary shift and shape deformation because overall guidance information is neglected. Motivated by the observation that object boundaries provide a stronger representation of the overall information of target objects, we propose a simple yet effective network, the boundary-guidance network (BG-Net), for maintaining object consistency in semantic segmentation. In our work, we explore the pixels both near and far from the boundary, characterizing each pixel by its pixel-boundary association. The boundary feature is integrated into the segmentation feature to mitigate the boundary shift and shape deformation problems. We explicitly supervise the angle and distance of each pixel with respect to its nearest object boundary, so that the association can be learned through geometric modelling. Meanwhile, a low-level feature emphasized up-sampling (LFEU) module is designed to supplement the detail representation in high-level features without direct interference. Finally, we evaluate our method on the Cityscapes and CamVid datasets, and the experimental results demonstrate the superiority of BG-Net.
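To make the supervised geometric targets more concrete, the sketch below shows one plausible way to derive them from a binary boundary mask: for every pixel, the Euclidean distance to the nearest boundary pixel and the angle of the vector pointing toward it. The function name `boundary_targets`, the use of SciPy's distance transform, and the angle convention are illustrative assumptions, not the authors' exact formulation.

```python
# Hypothetical sketch of distance/angle supervision targets built from a
# binary boundary mask (assumed formulation, not the paper's exact recipe).
import numpy as np
from scipy.ndimage import distance_transform_edt

def boundary_targets(boundary_mask: np.ndarray):
    """boundary_mask: (H, W) bool array, True on object-boundary pixels.

    Returns, per pixel, the Euclidean distance to the nearest boundary pixel
    and the angle (radians) of the vector pointing toward that pixel.
    """
    # distance_transform_edt measures distance to the nearest zero element,
    # so invert the mask; return_indices gives that element's coordinates.
    dist, (near_r, near_c) = distance_transform_edt(
        ~boundary_mask, return_indices=True
    )
    rows, cols = np.indices(boundary_mask.shape)
    dy = near_r - rows          # vector components toward the nearest boundary
    dx = near_c - cols
    angle = np.arctan2(dy, dx)  # direction of that vector, in (-pi, pi]
    return dist.astype(np.float32), angle.astype(np.float32)

# Example: a 5x5 image with a single vertical boundary in the middle column.
mask = np.zeros((5, 5), dtype=bool)
mask[:, 2] = True
dist, angle = boundary_targets(mask)
print(dist[0])  # [2. 1. 0. 1. 2.]
```

A per-pixel regression loss on these two maps could then serve as the explicit angle-and-distance supervision described in the abstract.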


Published In

The Visual Computer: International Journal of Computer Graphics, Volume 40, Issue 1
Jan 2024, 439 pages

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Published: 15 February 2023
Accepted: 05 January 2023

Author Tags

1. Semantic segmentation
2. Boundary-guidance
3. Low-level feature
4. Object consistency maintaining

Qualifiers

• Research-article
