Semantic Image Segmentation with Feature Fusion Based on Laplacian Pyramid

Neural Processing Letters

Abstract

Semantic image segmentation is a key task in computer vision. Its biggest challenge is that feature extraction discards information from the image. The context information is weakened at the same time, which leads to coarse segmentation results. This paper therefore proposes a novel semantic segmentation model that uses an encoder–decoder as its basic structure. The model consists of two parallel branches. The first branch is a Laplacian Pyramid network, which stores residual features that are fed into the second branch to restore the lost information. The second branch is the backbone segmentation network, realized as an encoder–decoder structure. In the encoder stage, we use ResNet-50 to extract features and replace traditional convolution with deformable convolution, which makes object contours clearer after segmentation. In the decoder stage, we introduce a spatial attention module with filtering and centralization to obtain a consistent spatial attention matrix, which captures the correlation between pixels and acquires the context information. We also design a weighted-sum module for the feature maps, which performs an element-wise weighted sum between each decoder feature and the corresponding residual feature from the Laplacian Pyramid. This module further recovers boundary and detail information. The proposed model produces dense feature predictions and improves segmentation accuracy. Experimental results show that the model reaches 91.2% average accuracy and 65.0% mIoU on the CamVid dataset, which supports the rationality and effectiveness of the proposed model.
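The two ideas the abstract relies on, Laplacian Pyramid residuals and an element-wise weighted sum, can be illustrated with a minimal NumPy sketch. It is not the paper's network. The 2x2 average pooling, the nearest-neighbour upsampling, the function names, and the scalar weight `alpha` are all assumptions chosen for brevity; the abstract does not specify the filters or how the weights are obtained.

```python
import numpy as np


def downsample(x):
    """Halve resolution with 2x2 average pooling (assumed stand-in for blur + subsample)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Double resolution with nearest-neighbour repetition (assumed interpolation)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(img, levels):
    """Return (residuals, coarsest): each residual is the detail lost by one downsampling step.

    Height and width of `img` must be divisible by 2**levels.
    """
    residuals = []
    current = img
    for _ in range(levels):
        smaller = downsample(current)
        residuals.append(current - upsample(smaller))  # detail lost at this scale
        current = smaller
    return residuals, current


def reconstruct(residuals, coarsest):
    """Invert the pyramid: upsample, then add back the stored residual at each scale."""
    current = coarsest
    for res in reversed(residuals):
        current = upsample(current) + res
    return current


def weighted_fuse(decoder_feat, residual, alpha=0.5):
    """Element-wise weighted sum of a decoder feature and a same-shape residual.

    `alpha` is a hypothetical scalar weight used only for illustration.
    """
    return alpha * decoder_feat + (1.0 - alpha) * residual
```

Because every residual stores exactly what downsampling removed, `reconstruct` recovers the input up to floating-point error. This is the property that makes the residuals useful for restoring boundary and detail information in a decoder.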

Author information

Corresponding author

Correspondence to Yongsheng Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, Y. Semantic Image Segmentation with Feature Fusion Based on Laplacian Pyramid. Neural Process Lett 54, 4153–4170 (2022). https://doi.org/10.1007/s11063-022-10801-0

  • DOI: https://doi.org/10.1007/s11063-022-10801-0
