Real-time 3D scene reconstruction with dynamically moving object using a single depth camera

Feixiang Lu ORCID: orcid.org/0000-0003-3952-4402¹,
Bin Zhou¹,
Yu Zhang¹ &
…
Qinping Zhao¹

Abstract

Online 3D reconstruction of real-world scenes has attracted increasing interest from both academia and industry, especially as consumer-level depth cameras have become widely available. Most recent online reconstruction systems take live depth data from a moving Kinect camera and incrementally fuse them into a single high-quality 3D model in real time. Although most real-world scenes have a static environment, everyday objects in a scene often move dynamically, and they are non-trivial to reconstruct, especially when the camera is also moving. To solve this problem, we propose a real-time approach, based on a single depth camera, for simultaneous reconstruction of a dynamic object and the static environment, and provide solutions for its key issues. In particular, we first introduce a robust optimization scheme that takes advantage of raycasted maps to segment the moving object and the background from the live depth map. The corresponding depth data are then fused into their respective volumes. These volumes are raycasted to extract views of the implicit surface, which can be used as a consistent reference frame for the next iteration of segmentation and tracking. To handle fast motion of the dynamic object and the handheld camera in the fusion stage, we propose a sequential 6D pose prediction method that greatly increases registration robustness and avoids the registration failures that occur in conventional methods. Experimental results show that our approach can reconstruct the moving object as well as the static environment with rich details, and that it outperforms conventional methods in multiple aspects.
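
The abstract does not spell out how the sequential 6D pose prediction is computed. As a rough illustration of the general idea only, the sketch below extrapolates the next pose from the two most recent ones under a constant-velocity assumption, which is one common way to seed frame-to-model registration when motion is fast. The constant-velocity model, the 4x4 homogeneous-matrix representation, and all function names (`predict_pose`, `rigid_inverse`, and so on) are assumptions made for this example, not details taken from the paper.

```python
import math

Matrix = list[list[float]]  # 4x4 homogeneous rigid transform, row-major


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(pose: Matrix) -> Matrix:
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    rot_t = [[pose[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * pose[k][3] for k in range(3))
             for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def predict_pose(prev_pose: Matrix, curr_pose: Matrix) -> Matrix:
    """Extrapolate the next pose, assuming the last inter-frame motion repeats.

    Hypothetical helper: a constant-velocity model, not the paper's method.
    """
    # Motion that carried the previous pose onto the current one.
    delta = matmul(curr_pose, rigid_inverse(prev_pose))
    # Apply the same motion once more to obtain the initial guess.
    return matmul(delta, curr_pose)


def translation(x: float, y: float, z: float) -> Matrix:
    """Pure translation, used here only to build example poses."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_z(angle: float) -> Matrix:
    """Rotation about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


# Example: an object that moved 0.1 m along x and turned 0.1 rad about z
# between the last two frames is predicted to repeat that motion.
guess_translation = predict_pose(translation(0.0, 0.0, 0.0),
                                 translation(0.1, 0.0, 0.0))
guess_rotation = predict_pose(rotation_z(0.0), rotation_z(0.1))
```

In a pipeline of the kind the abstract describes, a prediction like this would presumably be maintained separately for the camera and for the moving object, with each guess then refined by registration against the corresponding raycasted reference map.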

Acknowledgements

This study was funded by the National Natural Science Foundation of China (Grant Nos. 61502023 and U1736217).

Author information

Authors and Affiliations

State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
Feixiang Lu, Bin Zhou, Yu Zhang & Qinping Zhao

Corresponding author

Correspondence to Feixiang Lu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Lu, F., Zhou, B., Zhang, Y. et al. Real-time 3D scene reconstruction with dynamically moving object using a single depth camera. Vis Comput 34, 753–763 (2018). https://doi.org/10.1007/s00371-018-1540-8

Published: 08 May 2018
Issue Date: June 2018
DOI: https://doi.org/10.1007/s00371-018-1540-8
