Abstract
Suturing technical skill scores are strong predictors of patient functional recovery following robot-assisted radical prostatectomy (RARP), but manual assessment of these skills is a time- and resource-intensive process. Automating suturing skill scoring through computer vision methods can significantly reduce the burden on healthcare professionals and enhance both the quality and quantity of educational feedback. Although automated skill assessment in simulated virtual reality (VR) environments has been promising, applying vision methods to live (‘real’) surgical videos has been challenging for two reasons: 1) the lack of kinematic data from the da Vinci® surgical system, a key source of information for determining the movement and trajectory of robotic manipulators and suturing needles, and 2) the scarcity of training data, owing to the labor-intensive task of segmenting and scoring individual stitches from live videos. To address these challenges, we developed a self-supervised pre-training paradigm in which sim-to-real generalizable representations are learned without requiring any live kinematic annotations. Our model, termed LiveMAE, is based on a masked autoencoder (MAE). We augment live stitches with VR images during pre-training and require LiveMAE to reconstruct images from both domains while also predicting the corresponding kinematics. This process learns a visual-to-kinematic mapping that locates the positions and orientations of surgical manipulators and needles, deriving “kinematics” from live videos without requiring supervision. With an additional skill-specific fine-tuning step, LiveMAE surpasses supervised learning approaches across six technical skill assessments, achieving 0.56–0.84 AUC (0.70–0.91 AUPRC), with particular improvements of 35.78% in AUC for wrist rotation skills and 8.7% for needle driving skills. Mean-squared error on held-out VR kinematics was as low as 0.045 per element of the instrument poses. Our contributions provide the foundation for delivering personalized feedback to surgeons training in VR and performing live prostatectomy procedures.
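The pre-training objective described in the abstract (masked-patch reconstruction on both VR and live frames, plus kinematics regression supervised only on VR frames, where ground-truth poses exist) can be summarized in code. The following is a minimal sketch under stated assumptions, not the authors' released implementation: the class and function names, the 14-element pose dimensionality, the mean-pooled pose head, and the loss weight `lam` are all illustrative choices, and the MAE decoder's mask-token bookkeeping is omitted for brevity.

```python
# Minimal sketch (not the authors' code) of a LiveMAE-style objective:
# reconstruct masked image patches from both domains, and regress
# instrument/needle kinematics only where ground truth exists (VR).
import torch
import torch.nn as nn
import torch.nn.functional as F


class LiveMAESketch(nn.Module):
    def __init__(self, encoder: nn.Module, decoder: nn.Module,
                 embed_dim: int = 768, pose_dim: int = 14):
        super().__init__()
        self.encoder = encoder   # ViT-style encoder over visible patches
        self.decoder = decoder   # lightweight MAE decoder (mask tokens omitted)
        self.kin_head = nn.Linear(embed_dim, pose_dim)  # visual -> "kinematics"

    def forward(self, visible_tokens: torch.Tensor):
        latent = self.encoder(visible_tokens)      # (B, N, D) token features
        recon = self.decoder(latent)               # predicted patch pixels
        kin = self.kin_head(latent.mean(dim=1))    # pooled features -> pose vector
        return recon, kin


def pretrain_loss(recon, target_patches, kin_pred, kin_gt, has_kin, lam=1.0):
    """MAE reconstruction loss on every frame; pose MSE only on frames
    flagged by `has_kin` (VR frames with recorded kinematics)."""
    rec_loss = F.mse_loss(recon, target_patches)
    kin_loss = (F.mse_loss(kin_pred[has_kin], kin_gt[has_kin])
                if has_kin.any() else recon.new_zeros(()))
    return rec_loss + lam * kin_loss


if __name__ == "__main__":
    # Stand-in encoder/decoder so the sketch runs end to end.
    model = LiveMAESketch(encoder=nn.Identity(), decoder=nn.Linear(768, 768))
    tokens = torch.randn(4, 49, 768)    # visible-patch tokens for 4 frames
    target = torch.randn(4, 49, 768)    # masked-patch reconstruction targets
    kin_gt = torch.randn(4, 14)         # dummy VR ground-truth poses
    has_kin = torch.tensor([True, True, False, False])  # live frames lack poses
    recon, kin = model(tokens)
    loss = pretrain_loss(recon, target, kin, kin_gt, has_kin)
    loss.backward()
```

For the skill-specific fine-tuning stage, the pretrained encoder would presumably be paired with a lightweight binary classification head per skill (consistent with the per-skill AUC/AUPRC figures reported above); that head is not shown here.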
Acknowledgements
This study is supported in part by the National Cancer Institute under Award Number 1R01CA251579-01A1.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Trinh, L. et al. (2023). Self-supervised Sim-to-Real Kinematics Reconstruction for Video-Based Assessment of Intraoperative Suturing Skills. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14228. Springer, Cham. https://doi.org/10.1007/978-3-031-43996-4_68
DOI: https://doi.org/10.1007/978-3-031-43996-4_68
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43995-7
Online ISBN: 978-3-031-43996-4
eBook Packages: Computer Science, Computer Science (R0)