Abstract
In recent years, increasingly deep networks have been proposed for computer vision tasks. Residual Networks (ResNets) are one of the latest advances in deep learning and have led to remarkable results in several image recognition and detection tasks. In this work, we modify two variants of the original ResNets, namely Wide Residual Networks (WRNs) and Residual of Residual Networks (RoRs), to work on 3D data and investigate, for the first time to our knowledge, their performance in the task of 3D object classification. We use a dataset of volumetric representations of 3D models so as to fully exploit the underlying 3D information, and we present evidence that ‘3D ResNets’ are a valuable tool for classifying objects in 3D data as well.
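As a rough illustration of the idea summarised above, the following minimal sketch shows a volumetric (voxel) input passing through a residual block built from 3D convolutions. It is not the authors' implementation: the function names, the single-channel layout, the pre-activation ordering (ReLU before each convolution) and the omission of batch normalisation are all assumptions made to keep the example short and dependency-free.

```python
from itertools import product


def zeros(n):
    """Return an n x n x n grid of zeros as nested lists."""
    return [[[0.0] * n for _ in range(n)] for _ in range(n)]


def voxelize(points, n):
    """Binary occupancy grid from points assumed to lie in [0, 1]^3 (hypothetical helper)."""
    grid = zeros(n)
    for x, y, z in points:
        i, j, k = (min(int(c * n), n - 1) for c in (x, y, z))
        grid[i][j][k] = 1.0
    return grid


def conv3d_same(vol, kernel):
    """Single-channel 3D cross-correlation with zero padding; kernel is cubic with odd size."""
    n, k = len(vol), len(kernel)
    r = k // 2
    out = zeros(n)
    for i, j, l in product(range(n), repeat=3):
        acc = 0.0
        for a, b, c in product(range(k), repeat=3):
            p, q, s = i + a - r, j + b - r, l + c - r
            if 0 <= p < n and 0 <= q < n and 0 <= s < n:
                acc += vol[p][q][s] * kernel[a][b][c]
        out[i][j][l] = acc
    return out


def relu(vol):
    """Element-wise max(0, v)."""
    return [[[max(0.0, v) for v in row] for row in plane] for plane in vol]


def add(a, b):
    """Element-wise sum of two grids of equal shape."""
    return [[[x + y for x, y in zip(ra, rb)]
             for ra, rb in zip(pa, pb)]
            for pa, pb in zip(a, b)]


def residual_block_3d(vol, k1, k2):
    """y = x + F(x), with F = conv(relu(conv(relu(x)))); batch norm omitted (assumption)."""
    f = conv3d_same(relu(vol), k1)
    f = conv3d_same(relu(f), k2)
    return add(vol, f)  # identity shortcut


# Usage: a 4 x 4 x 4 occupancy grid with two occupied voxels.
grid = voxelize([(0.1, 0.1, 0.1), (0.9, 0.9, 0.9)], 4)
zero_kernel = zeros(3)
block_output = residual_block_3d(grid, zero_kernel, zero_kernel)
```

With all-zero kernels the residual branch contributes nothing, so the block reduces to the identity mapping. This is the property that makes very deep residual stacks easier to optimise, and it carries over unchanged from 2D to 3D convolutions.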
Acknowledgements
The research leading to these results has received funding from the European Union H2020 Horizon Programme (2014–2020) under grant agreement 665066, project DigiArt (The Internet Of Historical Things And Building New 3D Cultural Worlds).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ioannidou, A., Chatzilari, E., Nikolopoulos, S., Kompatsiaris, I. (2019). 3D ResNets for 3D Object Classification. In: Kompatsiaris, I., Huet, B., Mezaris, V., Gurrin, C., Cheng, WH., Vrochidis, S. (eds) MultiMedia Modeling. MMM 2019. Lecture Notes in Computer Science, vol 11295. Springer, Cham. https://doi.org/10.1007/978-3-030-05710-7_41
DOI: https://doi.org/10.1007/978-3-030-05710-7_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05709-1
Online ISBN: 978-3-030-05710-7
eBook Packages: Computer Science, Computer Science (R0)