Abstract
Reconstructing soft tissues from stereo endoscope videos is an essential prerequisite for many medical applications. Previous methods struggle to produce high-quality geometry and appearance due to their inadequate representations of 3D scenes. To address this issue, we propose a novel neural-field-based method, called EndoSurf, which effectively learns to represent a deforming surface from an RGBD sequence. In EndoSurf, we model surface dynamics, shape, and texture with three neural fields. First, 3D points are transformed from the observed space to the canonical space using the deformation field. The signed distance function (SDF) field and radiance field then predict their SDFs and colors, respectively, with which RGBD images can be synthesized via differentiable volume rendering. We constrain the learned shape by tailoring multiple regularization strategies and disentangling geometry and appearance. Experiments on public endoscope datasets demonstrate that EndoSurf significantly outperforms existing solutions, particularly in reconstructing high-fidelity shapes. Code is available at https://github.com/Ruyi-Zha/endosurf.git.
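The pipeline the abstract describes (sample points along a ray, deform them from observed to canonical space, query an SDF field and a radiance field, then composite via differentiable volume rendering) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny random MLPs, the network sizes, and the simplified sigmoid-based SDF-to-density conversion are all illustrative assumptions standing in for the trained networks and the NeuS-style rendering used in such methods.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(in_dim, out_dim, hidden=32):
    """Tiny random MLP, a stand-in for a trained neural field."""
    w1 = rng.normal(0, 0.5, (in_dim, hidden))
    w2 = rng.normal(0, 0.5, (hidden, out_dim))
    return lambda x: np.tanh(x @ w1) @ w2

# Three neural fields, mirroring the pipeline at a high level.
deform_field = mlp(4, 3)  # (x, y, z, t) -> offset to canonical space
sdf_field    = mlp(3, 1)  # canonical point -> signed distance
color_field  = mlp(6, 3)  # canonical point + view direction -> RGB

def render_ray(origin, direction, t, n_samples=64):
    """Volume-render one ray: sample points, deform them to the
    canonical space, query SDF and color, composite with alpha weights."""
    depths = np.linspace(0.1, 2.0, n_samples)
    pts = origin + depths[:, None] * direction              # (N, 3) observed space
    time_col = np.full((n_samples, 1), t)
    canon = pts + deform_field(np.concatenate([pts, time_col], axis=1))
    sdf = sdf_field(canon)[:, 0]                            # (N,)
    dirs = np.tile(direction, (n_samples, 1))
    rgb = 1 / (1 + np.exp(-color_field(np.concatenate([canon, dirs], axis=1))))
    # Simplified SDF-to-opacity conversion for illustration only
    # (real methods use a principled logistic-CDF formulation).
    alpha = 1 / (1 + np.exp(10 * sdf))
    trans = np.cumprod(np.concatenate([[1.0], 1 - alpha[:-1]]))
    weights = trans * alpha                                 # (N,)
    color = weights @ rgb                                   # (3,) rendered pixel
    depth = weights @ depths                                # rendered depth
    return color, depth

color, depth = render_ray(np.zeros(3), np.array([0.0, 0.0, 1.0]), t=0.5)
```

Because both the rendered color and depth are differentiable in the field parameters, the networks can be supervised directly with the observed RGBD images, which is what makes an RGBD sequence sufficient training signal.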
R. Zha and X. Cheng contributed equally.
Acknowledgments
This research is funded in part via an ARC Discovery project research grant (DP220100800).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zha, R., Cheng, X., Li, H., Harandi, M., Ge, Z. (2023). EndoSurf: Neural Surface Reconstruction of Deformable Tissues with Stereo Endoscope Videos. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_2
Print ISBN: 978-3-031-43995-7
Online ISBN: 978-3-031-43996-4