
SEB-Net: Revisiting Deep Encoder-Decoder Networks for Scene Understanding

Published: 20 August 2020

Abstract

As a research area of computer vision and deep learning, scene understanding has attracted considerable attention in recent years. A major challenge is achieving high segmentation accuracy while containing the computational cost and time of training and inference; most current algorithms sacrifice one of these metrics for the other depending on the target device. To address this trade-off, this paper proposes a novel deep neural network architecture called Segmentation Efficient Blocks Network (SEB-Net) that seeks the best possible balance between accuracy, computational cost, and real-time inference speed. The model is composed of an encoder path and a decoder path in a symmetric structure. The encoder path consists of 16 convolution layers identical to those of a VGG-19 model, and the decoder path is built from what we call E-Blocks (Efficient Blocks), inspired by the bottleneck module of the widely popular ENet architecture with slight modifications. One advantage of this design is that max-unpooling in the decoder path feeds the expansion and projection convolutions in the E-Blocks, allowing fewer learnable parameters and efficient computation (10.1 frames per second (fps) for a 480x320 input, 11x fewer parameters than DeconvNet, and 52.4 GFLOPs for a 640x360 input on a Tesla K40 GPU). Experimental results on two outdoor scene datasets, the Cambridge-driving Labeled Video Database (CamVid) and Cityscapes, indicate that SEB-Net achieves higher performance than Fully Convolutional Networks (FCN), SegNet, DeepLabV, and Dilation8 in most cases. Moreover, SEB-Net outperforms efficient architectures such as ENet and LinkNet by 16.1 and 11.6 points, respectively, in instance-level Intersection over Union (iIoU). SEB-Net also shows better performance when further evaluated on SUN RGB-D, an indoor scene dataset.
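The parameter savings attributed to the E-Blocks follow the usual bottleneck arithmetic. The sketch below compares a plain full-width 3x3 decoder convolution against a projection/expansion bottleneck of the kind the abstract describes; the channel width (256) and reduction factor (4) are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch: why a bottleneck-style block (1x1 projection -> reduced-width
# 3x3 conv -> 1x1 expansion) carries far fewer learnable parameters than a
# plain full-width 3x3 convolution in a decoder stage.

def conv_params(in_ch, out_ch, k):
    """Weight count of a k x k convolution (bias terms omitted)."""
    return in_ch * out_ch * k * k

def plain_decoder_layer(channels=256, k=3):
    # A standard decoder stage: one full-width 3x3 convolution.
    return conv_params(channels, channels, k)       # 256*256*9 = 589,824

def e_block(channels=256, reduce=4, k=3):
    # Bottleneck: project down by `reduce`, convolve at reduced width,
    # expand back to the original channel count.
    mid = channels // reduce
    return (conv_params(channels, mid, 1)           # projection:  16,384
            + conv_params(mid, mid, k)              # main conv:   36,864
            + conv_params(mid, channels, 1))        # expansion:   16,384

plain = plain_decoder_layer()
bottleneck = e_block()
print(plain, bottleneck, round(plain / bottleneck, 1))
```

Under these assumed widths the bottleneck uses roughly 8.5x fewer weights per stage, which is the kind of saving that makes real-time decoders feasible; the paper's overall 11x figure versus DeconvNet also reflects replacing learned deconvolutions with parameter-free max-unpooling.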

References

[1]
A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Adv. Neural Inf. Process. Syst., pp. 1--9, 2012.
[2]
K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv preprint arXiv:1409.1556, 2014.
[3]
X. Jiang, Y. Wang, W. Liu, S. Li, and J. Liu, "CapsNet, CNN, FCN: Comparative performance evaluation for image classification," Int. J. Mach. Learn. Comput., vol. 9, no. 6, pp. 840--848, 2019.
[4]
O. Russakovsky et al., "ImageNet Large Scale Visual Recognition Challenge," Int. J. Comput. Vis., vol. 115, pp. 211--252, 2015.
[5]
C. Szegedy et al., "Going deeper with convolutions," 2015 IEEE Conf. Comput. Vis. Pattern Recognit., pp. 1--9, 2015.
[6]
U. Iqbal, A. Milan, and J. Gall, "PoseTrack: Joint Multi-Person Pose Estimation and Tracking," 2017 IEEE Conf. Comput. Vis. Pattern Recognit., 2017.
[7]
V. Badrinarayanan, A. Kendall, and R. Cipolla, "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 12, pp. 2481--2495, 2017.
[8]
J. Long, E. Shelhamer, and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," 2015 IEEE Conf. Comput. Vis. Pattern Recognit., pp. 3431--3440, 2015.
[9]
H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid scene parsing network," 2017 IEEE Conf. Comput. Vis. Pattern Recognit., pp. 6230--6239, 2017.
[10]
H. Noh, S. Hong, and B. Han, "Learning deconvolution network for semantic segmentation," Proc. IEEE Int. Conf. Comput. Vis., pp. 1520--1528, 2015.
[11]
J. Li, Y. Wu, J. Zhao, L. Guan, C. Ye, and T. Yang, "Pedestrian detection with dilated convolution, region proposal network and boosted decision trees," Proc. Int. Jt. Conf. Neural Networks, pp. 4052--4057, 2017.
[12]
S. Jegou, M. Drozdzal, D. Vazquez, A. Romero, and Y. Bengio, "The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation," IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, pp. 1175--1183, 2017.
[14]
A. Krizhevsky, "Learning Multiple Layers of Features from Tiny Images," Tech. Rep., Univ. of Toronto, 2009.
[15]
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278--2324, 1998.
[16]
W. Sun and R. Wang, "Fully Convolutional Networks for Semantic Segmentation of Very High Resolution Remotely Sensed Images Combined with DSM," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 3, pp. 474--478, 2018.
[17]
L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs," IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 4, pp. 834--848, 2018.
[18]
K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[19]
A. Chaurasia and E. Culurciello, "LinkNet: Exploiting encoder representations for efficient semantic segmentation," 2017 IEEE Vis. Commun. Image Process. (VCIP), pp. 1--4, 2017.
[20]
A. Paszke, A. Chaurasia, S. Kim, and E. Culurciello, "ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation," pp. 1--10, 2016, [Online]. Available: http://arxiv.org/abs/1606.02147.
[21]
P. Sturgess, K. Alahari, L. Ladický, and P. H. S. Torr, "Combining appearance and structure from motion features for road scene understanding," Br. Mach. Vis. Conf. BMVC 2009-Proc., pp. 1--11, 2009.
[22]
A. Kendall, V. Badrinarayanan, and R. Cipolla, "Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding," 2019.
[23]
O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," Lect. Notes Comput. Sci., vol. 9351, pp. 234--241, 2015.
[24]
G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely Connected Convolutional Networks," 2017 IEEE Conf. Comput. Vis. Pattern Recognit., pp. 2261--2269, 2017.
[26]
A. F. Agarap, "Deep Learning using Rectified Linear Units (ReLU)," no. 1, pp. 2--8, 2018, [Online]. Available: http://arxiv.org/abs/1803.08375.
[27]
K. He, X. Zhang, S. Ren, and J. Sun, "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification," Proc. IEEE Int. Conf. Comput. Vis., pp. 1026--1034, 2015.
[28]
A. Paszke et al., "Automatic differentiation in PyTorch," no. Nips, pp. 1--4, 2017.
[29]
J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-fei, "ImageNet: A Large-Scale Hierarchical Image Database," 2009 IEEE Conf. Comput. Vis. Pattern Recognit., pp. 248--255, 2009.
[30]
G. J. Brostow, J. Fauqueur, and R. Cipolla, "Semantic object classes in video: A high-definition ground truth database," Pattern Recognit. Lett., vol. 30, no. 2, pp. 88--97, 2009.
[31]
M. Cordts et al., "The Cityscapes Dataset for Semantic Urban Scene Understanding," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 3213--3223, 2016.
[32]
A. Handa, V. Patraucean, V. Badrinarayanan, and S. Stent, "SceneNet: Understanding Real World Indoor Scenes With Synthetic Data," arXiv preprint, 2015.
[33]
N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, "Indoor segmentation and support inference from RGBD images," Lect. Notes Comput. Sci., vol. 7576, pp. 746--760, 2012.
[34]
A. Janoch et al., "A Category-level 3-D Database: Putting the Kinect to Work," ICCV 2011 Work. Consum. Depth Cameras Comput. Vis., 2011.
[35]
J. Xiao, A. Owens, and A. Torralba, "SUN3D: A database of big spaces reconstructed using SfM and object labels," Proc. IEEE Int. Conf. Comput. Vis., pp. 1625--1632, 2013.


    Published In

    cover image ACM Other conferences
    ICCAI '20: Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence
    April 2020
    563 pages
    ISBN:9781450377089
    DOI:10.1145/3404555
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    In-Cooperation

    • University of Tsukuba

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Semantic segmentation
    2. decoder networks
    3. e-blocks
    4. encoder networks
    5. scene understanding

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Article Metrics

    • 0 Total Citations
    • 40 Total Downloads
    • Downloads (Last 12 months): 3
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 31 Dec 2024
