Abstract
Recent deep learning models achieve impressive results on 3D scene analysis tasks by operating directly on unstructured point clouds. Considerable progress has been made in object classification and semantic segmentation, whereas instance segmentation remains less explored. In this work, we present 3D-BEVIS (3D bird’s-eye-view instance segmentation), a deep learning framework for joint semantic and instance segmentation on 3D point clouds. Following previous proposal-free instance segmentation approaches, our model learns a feature embedding and groups the points in the resulting feature space into semantic instances. Current point-based methods process local sub-parts of a full scene independently and then merge them with a heuristic step. However, instance segmentation by clustering over a full scene requires globally consistent features. We therefore propose to combine local point geometry with global context information using an intermediate bird’s-eye-view representation.
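To make the bird’s-eye-view idea concrete, the following is a minimal sketch of how a point cloud could be projected onto a ground-plane grid. It is not the authors' implementation: the function name `bird_eye_view`, the `cell_size` parameter, the choice of z as the up axis, and the rule of keeping the highest point per cell are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z); z is assumed to point up
Cell = Tuple[int, int]              # (column, row) index in the ground-plane grid


def bird_eye_view(points: List[Point], cell_size: float = 0.05) -> Dict[Cell, int]:
    """Map each occupied ground-plane cell to the index of its highest point.

    Hypothetical helper: the abstract does not specify how the
    bird's-eye view is built, so this is one plausible discretization.
    """
    if not points:
        return {}

    # Shift the grid origin to the minimum x/y so cell indices start at 0.
    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)

    grid: Dict[Cell, int] = {}
    for idx, (x, y, z) in enumerate(points):
        cell = (int((x - min_x) // cell_size), int((y - min_y) // cell_size))
        best = grid.get(cell)
        # Keep only the highest point in each cell (assumed projection rule).
        if best is None or z > points[best][2]:
            grid[cell] = idx
    return grid


if __name__ == "__main__":
    cloud = [(0.2, 0.3, 0.0), (0.4, 0.1, 1.5), (2.5, 0.1, 0.7)]
    print(bird_eye_view(cloud, cell_size=1.0))
```

Because the returned mapping stores point indices rather than raw heights, features predicted on the 2D grid can be carried back to the originating 3D points, which is the kind of link between global context and local point geometry that the abstract describes.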
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Elich, C., Engelmann, F., Kontogianni, T., Leibe, B. (2019). 3D Bird’s-Eye-View Instance Segmentation. In: Fink, G., Frintrop, S., Jiang, X. (eds) Pattern Recognition. DAGM GCPR 2019. Lecture Notes in Computer Science, vol 11824. Springer, Cham. https://doi.org/10.1007/978-3-030-33676-9_4
DOI: https://doi.org/10.1007/978-3-030-33676-9_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33675-2
Online ISBN: 978-3-030-33676-9
eBook Packages: Computer Science (R0)