Credible Dual-Expert Learning for Weakly Supervised Semantic Segmentation

Bingfeng Zhang¹,²,
Jimin Xiao³ (ORCID: orcid.org/0000-0002-9416-2486),
Yunchao Wei⁴ &
Yao Zhao⁴

Abstract

Weakly supervised semantic segmentation, which aims to segment objects without dense pixel annotations, has seen great progress. Most approaches concentrate on generating high-quality pseudo labels, which are then fed into a standard segmentation model as supervision. However, this solution has one major limitation: noise in the pseudo labels is inevitable, and a standard segmentation model cannot correct it. In this paper, we propose a credible dual-expert learning (CDL) framework to mitigate the noise of pseudo labels. Specifically, we first observe that model predictions optimized with different loss functions have different credible regions; thus, it is possible to make self-corrections using multiple predictions. Based on this observation, we design a dual-expert structure to mine credible predictions, which are then processed by our noise correction module to update pseudo labels online. Meanwhile, to handle the case where both experts produce non-credible predictions for the same region, we design a relationship transfer module to provide feature relationships, enabling our noise correction module to transfer predictions from the credible regions to such non-credible regions. Building on these designs, we propose a base CDL network and an extended CDL network to satisfy different requirements. Extensive experiments show that directly replacing the conventional fully supervised segmentation model with our model boosts the performance of various weakly supervised semantic segmentation pipelines, achieving new state-of-the-art performance on both PASCAL VOC 2012 and MS COCO by a clear margin. Code will be available at: https://github.com/zbf1991/CDL.
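
To make the idea of mining credible predictions from two experts more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a pixel counts as credible when both experts predict the same class with confidence above a threshold, and that credible pixels overwrite the pseudo label while all other pixels keep it. The function name `update_pseudo_labels`, the agreement rule, and the `threshold` value are all hypothetical; the abstract does not specify them, and the relationship transfer module is not modeled here.

```python
from typing import List, Sequence


def argmax(probs: Sequence[float]) -> int:
    """Return the index of the largest probability."""
    return max(range(len(probs)), key=lambda c: probs[c])


def update_pseudo_labels(
    pred_a: List[Sequence[float]],
    pred_b: List[Sequence[float]],
    pseudo: List[int],
    threshold: float = 0.9,  # hypothetical confidence cut-off
) -> List[int]:
    """Overwrite pseudo labels only where both experts agree confidently.

    pred_a, pred_b: per-pixel class probabilities from the two experts.
    pseudo: current (possibly noisy) pseudo label for each pixel.
    """
    updated = []
    for p_a, p_b, old in zip(pred_a, pred_b, pseudo):
        c_a, c_b = argmax(p_a), argmax(p_b)
        # Assumed credibility rule: same class, both confidences high enough.
        credible = (
            c_a == c_b and p_a[c_a] >= threshold and p_b[c_b] >= threshold
        )
        updated.append(c_a if credible else old)
    return updated


# Three pixels, two classes; the experts disagree on the middle pixel.
pred_a = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]]
pred_b = [[0.97, 0.03], [0.30, 0.70], [0.05, 0.95]]
pseudo = [1, 1, 0]

print(update_pseudo_labels(pred_a, pred_b, pseudo))  # [0, 1, 1]
```

In this toy input the first and last pixels are corrected because both experts agree confidently, while the middle pixel keeps its original pseudo label because the experts disagree.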

Data Availability Statement

We use two datasets: PASCAL VOC 2012 (Everingham et al., 2010) with SBD (Hariharan et al., 2011), and MS COCO-2014 (Lin et al., 2014). All of them are publicly available. The PASCAL VOC 2012 dataset analyzed during the current study is available in the PASCAL repository (Everingham et al., 2010): http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html. The SBD dataset (Hariharan et al., 2011) is available at: http://home.bharathh.info/pubs/codes/SBD/download.html. The MS COCO-2014 dataset analyzed during the current study is available in the COCO repository (Lin et al., 2014): https://cocodataset.org/.

References

Ahn, J., & Kwak, S. (2018). Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4981–4990).
Ahn, J., Cho, S., & Kwak, S. (2019). Weakly supervised learning of instance segmentation with inter-pixel relations. In Proceedings of the IEEE conference on computer vision and pattern recognition.
Asgari Taghanaki, S., Abhishek, K., Cohen, J. P., Cohen-Adad, J., & Hamarneh, G. (2021). Deep semantic segmentation of natural and medical images: A review. Artificial Intelligence Review, 54(1), 137–178.
Bearman, A., Russakovsky, O., Ferrari, V., & Fei-Fei, L. (2016). What’s the point: Semantic segmentation with point supervision. In Proceedings of the European conference on computer vision (pp. 549–565).
Chang, Y. T., Wang, Q., Hung, W. C., Piramuthu, R., Tsai, Y. H., & Yang, M. H. (2020). Weakly-supervised semantic segmentation via sub-category exploration. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 8991–9000).
Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2014). Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062.
Chen, L. C., Papandreou, G., Schroff, F., & Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587.
Chen, L., Wu, W., Fu, C., Han, X., & Zhang, Y. (2020). Weakly supervised semantic segmentation with boundary exploration. In Proceedings of the European conference on computer vision (pp. 347–362).
Chen, L. C., Zhu, Y., Papandreou, G., Schroff, F., & Adam, H. (2018). Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European conference on computer vision (pp. 801–818).
Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2017). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834–848.
Dai, J., He, K., & Sun, J. (2015). Boxsup: Exploiting bounding boxes to supervise convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1635–1643).
Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 248–255).
Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2010). The pascal visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2), 303–338.
Fan, J., Zhang, Z., & Tan, T. (2020). Employing multi-estimations for weakly-supervised semantic segmentation. In Proceedings of the European conference on computer vision.
Fan, J., Zhang, Z., Song, C., & Tan, T. (2020). Learning integral objects with intra-class discriminator for weakly-supervised semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4283–4292).
Gao, S. H., Cheng, M. M., Zhao, K., Zhang, X. Y., Yang, M. H., & Torr, P. (2021). Res2net: A new multi-scale backbone architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2019.2938758
Hariharan, B., Arbeláez, P., Bourdev, L., Maji, S., & Malik, J. (2011). Semantic contours from inverse detectors. In Proceedings of the IEEE international conference on computer vision (pp. 991–998).
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770–778).
Huang, Z., Wang, X., Wang, J., Liu, W., & Wang, J. (2018). Weakly-supervised semantic segmentation network with deep seeded region growing. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7014–7023).
Jadon, S. (2020). A survey of loss functions for semantic segmentation. In 2020 IEEE conference on computational intelligence in bioinformatics and computational biology (CIBCB) (pp. 1–7).
Khoreva, A., Benenson, R., Hosang, J., Hein, M., & Schiele, B. (2017). Simple does it: Weakly supervised instance and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 876–885).
Krähenbühl, P., & Koltun, V. (2013). Parameter learning and convergent inference for dense random fields. In International conference on machine learning (pp. 513–521).
Kulharia, V., Chandra, S., Agrawal, A., Torr, P., & Tyagi, A. (2020). Box2seg: Attention weighted loss and discriminative feature learning for weakly supervised segmentation. In Proceedings of the European conference on computer vision (pp. 290–308).
Kweon, H., Yoon, S. H., Kim, H., Park, D., & Yoon, K. J. (2021). Unlocking the potential of ordinary classifier: Class-specific adversarial erasing framework for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 6994–7003).
Lee, J., Choi, J., Mok, J., & Yoon, S. (2021). Reducing information bottleneck for weakly supervised semantic segmentation. Advances in Neural Information Processing Systems, 34, 27408–27421.
Lee, J., Kim, E., & Yoon, S. (2021). Anti-adversarially manipulated attributions for weakly and semi-supervised semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4071–4080).
Lee, J., Kim, E., Lee, S., Lee, J., & Yoon, S. (2019). Ficklenet: Weakly and semi-supervised semantic image segmentation using stochastic inference. arXiv preprint arXiv:1902.10421.
Lee, S., Lee, M., Lee, J., & Shim, H. (2021). Railroad is not a train: Saliency as pseudo-pixel supervision for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5495–5505).
Lee, J., Yi, J., Shin, C., & Yoon, S. (2021). Bbam: Bounding box attribution map for weakly supervised semantic and instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 2643–2652).
Li, Y., Kuang, Z., Liu, L., Chen, Y., & Zhang, W. (2021). Pseudo-mask matters in weakly-supervised semantic segmentation. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 6964–6973).
Li, X., Zhong, Z., Wu, J., Yang, Y., Lin, Z., & Liu, H. (2019). Expectation-maximization attention networks for semantic segmentation. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 9167–9176).
Lin, D., Dai, J., Jia, J., He, K., & Sun, J. (2016). Scribblesup: Scribble-supervised convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3159–3167).
Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017). Feature pyramid networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2117–2125).
Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft coco: Common objects in context. In Proceedings of the European conference on computer vision (pp. 740–755).
Liu, Y., Wu, Y. H., Wen, P., Shi, Y., Qiu, Y., & Cheng, M. M. (2020). Leveraging instance-, image- and dataset-level information for weakly supervised instance segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2020.3023152
Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431–3440).
Luo, W., Yang, M., & Zheng, W. (2021). Weakly-supervised semantic segmentation with saliency and incremental supervision updating. Pattern Recognition, 115, 107858.
Milletari, F. (2018). Hough voting strategies for segmentation, detection and tracking (Ph.D. Thesis, Technische Universität München).
Milletari, F., Navab, N., & Ahmadi, S. A. (2016). V-net: Fully convolutional neural networks for volumetric medical image segmentation. In 2016 4th international conference on 3D vision (3DV) (pp. 565–571).
Nakashima, K. (2017). Deeplab with pytorch. https://github.com/kazuto1011/deeplab-pytorch.
Oh, Y., Kim, B., & Ham, B. (2021). Background-aware pooling and noise-aware loss for weakly-supervised semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 6913–6922).
Pan, J., Zhu, P., Zhang, K., Cao, B., Wang, Y., Zhang, D., Han, J., & Hu, Q. (2022). Learning self-supervised low-rank network for single-stage weakly and semi-supervised semantic segmentation. International Journal of Computer Vision, 130(5), 1181–1195.
Pu, M., Huang, Y., Guan, Q., & Zou, Q. (2018). Graphnet: Learning image pseudo annotations for weakly-supervised semantic segmentation. In Proceedings of the 26th ACM international conference on multimedia (pp. 483–491).
Ru, L., Du, B., Zhan, Y., & Wu, C. (2022). Weakly-supervised semantic segmentation with visual words learning and hybrid pooling. International Journal of Computer Vision, 130(4), 1127–1144.
Shimoda, W., & Yanai, K. (2019). Self-supervised difference detection for weakly-supervised semantic segmentation. In Proceedings of the IEEE international conference on computer vision (pp. 5208–5217).
Song, C., Huang, Y., Ouyang, W., & Wang, L. (2019). Box-driven class-wise region masking and filling rate guided loss for weakly supervised semantic segmentation. arXiv preprint arXiv:1904.11693.
Su, Y., Sun, R., Lin, G., & Wu, Q. (2021). Context decoupling augmentation for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 7004–7014).
Sun, K., Shi, H., Zhang, Z., & Huang, Y. (2021). Ecs-net: Improving weakly supervised semantic segmentation by using connections between class activation maps. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 7283–7292).
Sun, G., Wang, W., Dai, J., & Van Gool, L. (2020). Mining cross-image semantics for weakly supervised semantic segmentation. In Proceedings of the European conference on computer vision (pp. 347–365).
Tang, M., Perazzi, F., Djelouah, A., Ben Ayed, I., Schroers, C., & Boykov, Y. (2018). On regularized losses for weakly-supervised cnn segmentation. In Proceedings of the European conference on computer vision (pp. 507–522).
Wang, B., Qi, G., Tang, S., Zhang, T., Wei, Y., Li, L., & Zhang, Y. (2019). Boundary perception guidance: A scribble-supervised semantic segmentation approach. In International joint conference on artificial intelligence.
Wang, Y., Zhang, J., Kan, M., Shan, S., & Chen, X. (2020). Self-supervised equivariant attention mechanism for weakly supervised semantic segmentation. arXiv preprint arXiv:2004.04581.
Wang, X., Liu, S., Ma, H., & Yang, M. H. (2020). Weakly-supervised semantic segmentation by iterative affinity learning. International Journal of Computer Vision, 128(6), 1736–1749.
Wei, Y., Feng, J., Liang, X., Cheng, M. M., Zhao, Y., & Yan, S. (2017). Object region mining with adversarial erasing: A simple classification to semantic segmentation approach. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1568–1576).
Wei, Y., Xiao, H., Shi, H., Jie, Z., Feng, J., & Huang, T. S. (2018). Revisiting dilated convolution: A simple approach for weakly-and semi-supervised semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7268–7277).
Wu, T., Huang, J., Gao, G., Wei, X., Wei, X., Luo, X., & Liu, C. H. (2021). Embedded discriminative attention mechanism for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 16765–16774).
Wu, Z., Shen, C., & Van Den Hengel, A. (2019). Wider or deeper: Revisiting the resnet model for visual recognition. Pattern Recognition, 90, 119–133.
Xiao, T., Liu, Y., Zhou, B., Jiang, Y., & Sun, J. (2018). Unified perceptual parsing for scene understanding. In Proceedings of the European conference on computer vision.
Xu, L., Ouyang, W., Bennamoun, M., Boussaid, F., Sohel, F., & Xu, D. (2021). Leveraging auxiliary tasks with affinity learning for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 6984–6993).
Yao, Y., Chen, T., Xie, G. S., Zhang, C., Shen, F., Wu, Q., Tang, Z., & Zhang, J. (2021). Non-salient region object mining for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 2623–2632).
Zhang, F., Gu, C., Zhang, C., & Dai, Y. (2021). Complementary patch for weakly supervised semantic segmentation. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 7242–7251).
Zhang, T., Lin, G., Liu, W., Cai, J., & Kot, A. (2020). Splitting vs. merging: Mining object regions with discrepancy and intersection loss for weakly supervised semantic segmentation. In Proceedings of the European conference on computer vision.
Zhang, B., Xiao, J., & Zhao, Y. (2021). Dynamic feature regularized loss for weakly supervised semantic segmentation. arXiv preprint arXiv:2108.01296.
Zhang, B., Xiao, J., Jiao, J., Wei, Y., & Zhao, Y. (2021). Affinity attention graph neural network for weakly supervised semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(11), 8082–8096.
Zhang, B., Xiao, J., Wei, Y., Huang, K., Luo, S., & Zhao, Y. (2022). End-to-end weakly supervised semantic segmentation with reliable region mining. Pattern Recognition, 128, 108663.
Zhang, D., Zhang, H., Tang, J., Hua, X., & Sun, Q. (2020). Causal intervention for weakly-supervised semantic segmentation. arXiv preprint arXiv:2009.12547.
Zhang, B., Xiao, J., Wei, Y., Sun, M., & Huang, K. (2020). Reliability does matter: An end-to-end weakly supervised semantic segmentation approach. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 12765–12772.
Zhao, H., Shi, J., Qi, X., Wang, X., & Jia, J. (2017). Pyramid scene parsing network. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2881–2890).
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., & Torralba, A. (2016). Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2921–2929).

Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2022YFE0200300), the National Natural Science Foundation of China (No. 61972323), and the Independent Innovation Research Project of China University of Petroleum (East China) (No. 22CX06060A).

Author information

Authors and Affiliations

China University of Petroleum (East China), Qingdao, China
Bingfeng Zhang
University of Liverpool, Liverpool, UK
Bingfeng Zhang
School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou, China
Jimin Xiao
Institute of Information Science, Beijing Jiaotong University, Beijing, China
Yunchao Wei & Yao Zhao

Corresponding author

Correspondence to Jimin Xiao.

Additional information

Communicated by Karteek Alahari.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, B., Xiao, J., Wei, Y. et al. Credible Dual-Expert Learning for Weakly Supervised Semantic Segmentation. Int J Comput Vis 131, 1892–1908 (2023). https://doi.org/10.1007/s11263-023-01796-9

Received: 12 July 2022
Accepted: 07 April 2023
Published: 26 April 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s11263-023-01796-9
