Synergy between Semantic Segmentation and Image Denoising via Alternate Boosting

Published: 06 February 2023

Abstract

The performance of semantic image segmentation may deteriorate when the input image is noisy, in which case denoising the image before segmentation may help. Both image denoising and semantic segmentation have advanced significantly with the development of deep learning. In this work, we study the synergy between these two tasks using a holistic deep model. We observe that denoising not only helps counter the drop in segmentation accuracy caused by noisy input, but that pixel-wise semantic information also boosts denoising performance. We therefore propose a boosting network that performs denoising and segmentation alternately. The proposed network is composed of multiple segmentation and denoising blocks (SDBs), each of which estimates a semantic map and then uses the map to regularize denoising. Experimental results show that the denoised image quality improves substantially, that the segmentation accuracy approaches that obtained on clean images, and that both segmentation and denoising improve as the number of SDBs increases. On the Cityscapes dataset, with additive white Gaussian noise at level 50, using three SDBs raises the denoising quality to 34.42 dB in PSNR and the segmentation accuracy to 66.5 in mIoU.
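The quantities named in the abstract (additive white Gaussian noise, PSNR, mIoU) and the alternating control flow can be illustrated with a minimal NumPy sketch. This is not the authors' network or evaluation code. The function names `add_awgn`, `psnr`, `mean_iou`, and `alternate_boost` are hypothetical, and the sketch makes three assumptions: "noise level 50" is read as a standard deviation of 50 on a 0 to 255 intensity scale, each block is fed the previous block's denoised estimate, and mIoU is computed per image rather than accumulated over a dataset.

```python
import numpy as np


def add_awgn(image, sigma, rng):
    """Add zero-mean white Gaussian noise; sigma is on the same scale as image."""
    return image + rng.normal(0.0, sigma, size=image.shape)


def psnr(clean, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a clean image and an estimate."""
    diff = np.asarray(clean, dtype=np.float64) - np.asarray(estimate, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over the classes present in pred or target."""
    pred, target = np.asarray(pred), np.asarray(target)
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


def alternate_boost(noisy, blocks):
    """Run a chain of (segment_fn, denoise_fn) pairs, one pair per block.

    Each block first estimates a semantic map from the current image estimate,
    then denoises that estimate using the map. How the blocks are wired in the
    paper is not stated in the abstract; this chaining is an assumption.
    """
    estimate, semantic_map = noisy, None
    for segment_fn, denoise_fn in blocks:
        semantic_map = segment_fn(estimate)
        estimate = denoise_fn(estimate, semantic_map)
    return estimate, semantic_map


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.full((64, 64), 128.0)  # flat synthetic image, illustration only
    noisy = add_awgn(clean, sigma=50.0, rng=rng)
    print(f"PSNR of noisy input: {psnr(clean, noisy):.2f} dB")
```

Under these assumptions, unclipped noise with standard deviation 50 has an expected mean squared error of about 2500, so the noisy input sits near 10 log10(255^2 / 2500) ≈ 14.15 dB, which gives a rough baseline for the reported 34.42 dB. `mean_iou` returns a fraction between 0 and 1; multiplying by 100 puts it on the scale of the reported 66.5.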

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 2
March 2023
540 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3572860
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 February 2023
Online AM: 14 July 2022
Accepted: 07 July 2022
Revised: 02 June 2022
Received: 13 December 2021
Published in TOMM Volume 19, Issue 2

Author Tags

  1. Alternate boosting
  2. deep learning
  3. image denoising
  4. semantic segmentation

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • Natural Science Foundation of China
  • Fundamental Research Funds for the Central Universities

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 124
  • Downloads (last 6 weeks): 7
Reflects downloads up to 09 Jan 2025

Citations

Cited By

  • (2025) Pocket convolution Mamba for brain tumor segmentation. The Journal of Supercomputing 81:1. DOI: 10.1007/s11227-024-06732-3. Online publication date: 1-Jan-2025.
  • (2024) Bridging the Domain Gap in Scene Flow Estimation via Hierarchical Smoothness Refinement. ACM Transactions on Multimedia Computing, Communications, and Applications 20:8 (1-21). DOI: 10.1145/3661823. Online publication date: 12-Jun-2024.
  • (2024) Efficient Brain Tumor Segmentation with Lightweight Separable Spatial Convolutional Network. ACM Transactions on Multimedia Computing, Communications, and Applications 20:7 (1-19). DOI: 10.1145/3653715. Online publication date: 16-May-2024.
  • (2024) A Novel Framework for Joint Learning of City Region Partition and Representation. ACM Transactions on Multimedia Computing, Communications, and Applications 20:7 (1-23). DOI: 10.1145/3652857. Online publication date: 16-May-2024.
  • (2024) Multi-Content Interaction Network for Few-Shot Segmentation. ACM Transactions on Multimedia Computing, Communications, and Applications 20:6 (1-20). DOI: 10.1145/3643850. Online publication date: 8-Mar-2024.
  • (2024) Pedestrian Attribute Recognition via Spatio-temporal Relationship Learning for Visual Surveillance. ACM Transactions on Multimedia Computing, Communications, and Applications 20:6 (1-15). DOI: 10.1145/3632624. Online publication date: 8-Mar-2024.
  • (2024) Joint EM Image Denoising and Segmentation with Instance-Aware Interaction. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024 (403-413). DOI: 10.1007/978-3-031-72104-5_39. Online publication date: 7-Oct-2024.
  • (2023) 3V3D: Three-View Contextual Cross-slice Difference Three-dimensional Medical Image Segmentation Adversarial Network. ACM Transactions on Multimedia Computing, Communications, and Applications 19:6 (1-28). DOI: 10.1145/3592614. Online publication date: 12-Jul-2023.
