Smoke semantic segmentation with multi-scale residual paths and weighted middle surveillances

Multimedia Tools and Applications

Abstract

Visual smoke segmentation is widely used for fire detection, simulation, human evacuation and pollution monitoring. However, accurately separating smoke from a single image is challenging because smoke has a highly complicated appearance, including blurriness, semi-transparency and varying shapes. We design several multi-resolution convolutional paths that generate multi-scale feature maps to obtain scale invariance. To capture feature representations of blurriness and semi-transparency, we present a Multi-scale Residual Module (MRM) that significantly increases the width and depth of the residual paths. Combining learnable deconvolution with bilinear interpolation generates multi-scale features that help to progressively produce up-sampled and middle predictions, which serve as supervision during training. By designing a monotonically decreasing function of the independent pre-training error of each prediction, we automatically regulate the relative importance of the predictions to accelerate training. Experimental results show that our method achieves the best results among state-of-the-art methods on three virtual smoke data sets: 79.03%, 78.30% and 78.65%. In addition, the proposed multi-prediction loss makes the training of our network converge faster and more stably.
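
The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual decreasing function, so the exponential form below, the `sharpness` parameter, and the function names `prediction_weights` and `multi_prediction_loss` are assumptions made only for illustration and should not be read as the authors' formulation. The sketch maps each prediction's pre-training error to a relative weight (lower error, higher weight) and then combines the per-prediction losses into one weighted training loss.

```python
import math


def prediction_weights(pretrain_errors, sharpness=1.0):
    """Turn per-prediction pre-training errors into relative weights.

    Assumed form (not taken from the paper): w_i is proportional to
    exp(-sharpness * e_i), normalised so that the weights sum to 1.
    This is monotonically decreasing in e_i for sharpness > 0.
    """
    if not pretrain_errors:
        raise ValueError("at least one pre-training error is required")
    scores = [math.exp(-sharpness * e) for e in pretrain_errors]
    total = sum(scores)
    return [s / total for s in scores]


def multi_prediction_loss(per_prediction_losses, weights):
    """Weighted sum of the losses of all up-sampled and middle predictions."""
    if len(per_prediction_losses) != len(weights):
        raise ValueError("one weight is required per prediction loss")
    return sum(w * l for w, l in zip(weights, per_prediction_losses))


if __name__ == "__main__":
    # Hypothetical pre-training errors of three predictions.
    errors = [0.40, 0.30, 0.20]
    weights = prediction_weights(errors)
    # Hypothetical per-prediction losses at some training step.
    losses = [0.9, 0.7, 0.5]
    print("weights:", weights)
    print("combined loss:", multi_prediction_loss(losses, weights))
```

With this choice, a prediction that was less accurate in independent pre-training contributes less to the combined loss. Any other monotonically decreasing function, for example an inverse or a linear ramp, would fit the description in the abstract equally well.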

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (62272308) and the Capacity Construction Project of Shanghai Local Colleges (23010504100).

Author information

Corresponding author

Correspondence to Feiniu Yuan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yuan, F., Zhang, L. & Xia, X. Smoke semantic segmentation with multi-scale residual paths and weighted middle surveillances. Multimed Tools Appl 83, 47199–47224 (2024). https://doi.org/10.1007/s11042-023-17260-2

  • DOI: https://doi.org/10.1007/s11042-023-17260-2
