Abstract
Existing training methods for medical image foundation models focus primarily on tasks such as image restoration, overlooking the inherent anatomical knowledge of the human body. The discrepancy between a foundation model's training task and its downstream tasks often necessitates fine-tuning for each specific application, and when the downstream training set is too small, fine-tuning can cause catastrophic forgetting in the foundation model. To address these issues, we propose a novel unsupervised training method for medical image foundation models. Our approach incorporates an anatomical embedding task that trains the model to generate an anatomically meaningful embedding for each voxel. To speed up training and accommodate large-scale models, we employ momentum contrast learning, which we further adapt to the anatomical embedding task. To improve the model's performance on specific targets, we introduce a region contrastive loss that uses a small set of segmentation labels (e.g., five samples) to identify the regions of interest during training. In our experiments, we pre-train the foundation model on 4,000 unlabeled abdominal CT scans, with few-shot segmentation of 13 abdominal organs as the downstream task. The results show significant improvements on the downstream segmentation task, particularly when segmentation annotations are scarce, compared with methods without pre-training and with similar foundation models. The trained models and the downstream training code are open-sourced at https://github.com/DlutMedimgGroup/Anatomy-Embedding-Foundation-Model.
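The abstract describes two contrastive components: a MoCo-style voxel-level embedding objective driven by a momentum encoder, and a region contrastive loss guided by a handful of segmentation labels. As a rough, non-authoritative sketch of how such losses are commonly written, here is a minimal PyTorch version. The function names, the temperature `tau`, the momentum coefficient `m`, and the assumption that corresponding voxels from two augmented views have already been sampled into `(N, C)` tensors are all illustrative choices, not the authors' implementation (their actual code is in the linked repository).

```python
import torch
import torch.nn.functional as F

def voxel_infonce_loss(q_feat: torch.Tensor, k_feat: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Voxel-wise InfoNCE between query-encoder and momentum-encoder features.

    q_feat, k_feat: (N, C) L2-normalized embeddings of N corresponding voxels
    sampled from two augmented views of the same CT volume. Row i of q_feat
    and row i of k_feat come from the same anatomical location (the positive
    pair); every other row serves as a negative.
    """
    logits = q_feat @ k_feat.t() / tau                    # (N, N) similarity matrix
    labels = torch.arange(q_feat.size(0), device=q_feat.device)
    return F.cross_entropy(logits, labels)                # diagonal entries are positives

@torch.no_grad()
def momentum_update(encoder_q: torch.nn.Module, encoder_k: torch.nn.Module, m: float = 0.999) -> None:
    """MoCo-style exponential moving average update of the key (momentum) encoder."""
    for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        p_k.data.mul_(m).add_(p_q.data, alpha=1.0 - m)

def region_contrastive_loss(feat: torch.Tensor, region: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Hypothetical region contrastive loss using a few labeled scans:
    voxel embeddings of the same organ are pulled together, those of
    different organs pushed apart.

    feat:   (N, C) L2-normalized voxel embeddings
    region: (N,) integer organ label for each sampled voxel
    """
    sim = feat @ feat.t() / tau                           # (N, N) pairwise similarities
    self_mask = torch.eye(feat.size(0), dtype=torch.bool, device=feat.device)
    pos = (region.unsqueeze(0) == region.unsqueeze(1)) & ~self_mask
    log_prob = F.log_softmax(sim.masked_fill(self_mask, -1e9), dim=1)
    n_pos = pos.sum(dim=1).clamp(min=1)                   # avoid division by zero
    return -(log_prob * pos.float()).sum(dim=1).div(n_pos).mean()
```

In a training loop, one would encode two augmented crops with the query and momentum encoders, sample matched voxel features, combine `voxel_infonce_loss` with `region_contrastive_loss` on the few labeled volumes, and call `momentum_update` after each optimizer step.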
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Acknowledgments
This work was supported in part by the National Key Research and Development Program (Nos. 2020YFB1711500, 2020YFB1711501, and 2020YFB1711503), the General Program of the National Natural Science Foundation of China (Nos. 81971693 and 61971445), the Dalian Key Laboratory of Digital Medicine for Critical Diseases, the Fundamental Research Funds for the Central Universities (Nos. DUT22YG229 and DUT22YG205), the Liaoning Key Lab of IC & BME System, and the Dalian Engineering Research Center for Artificial Intelligence in Medical Imaging.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhuang, M., Xu, R., Zhang, Q., Liu, A., Fan, X., Wang, H. (2025). Anatomical Embedding-Based Training Method for Medical Image Segmentation Foundation Models. In: Deng, Z., et al. (eds.) Foundation Models for General Medical AI. MedAGI 2024. Lecture Notes in Computer Science, vol. 15184. Springer, Cham. https://doi.org/10.1007/978-3-031-73471-7_15
DOI: https://doi.org/10.1007/978-3-031-73471-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73470-0
Online ISBN: 978-3-031-73471-7
eBook Packages: Computer Science, Computer Science (R0)