Abstract
End-to-end medical image segmentation is of great value for computer-aided diagnosis, yet the field is dominated by task-specific models that usually suffer from poor generalization. Following the recent breakthroughs of the segment anything model (SAM) in universal image segmentation, extensive efforts have been made to adapt SAM to medical imaging, but these efforts still encounter two major issues: 1) severe performance degradation and limited generalization without proper adaptation, and 2) semi-automatic segmentation that relies on accurate manual prompts for interaction. In this work, we propose SAMUS, a universal model tailored for ultrasound image segmentation, and further enable it to work in an end-to-end manner, denoted as AutoSAMUS. Specifically, in SAMUS, a parallel CNN branch is introduced to supplement local information through cross-branch attention, and a feature adapter and a position adapter are jointly used to adapt SAM from the natural to the ultrasound domain while reducing training complexity. AutoSAMUS is realized by replacing the manual prompt encoder of SAMUS with an auto prompt generator (APG) that automatically generates prompt embeddings. A comprehensive ultrasound dataset, comprising about 30k images and 69k masks and covering six object categories, is collected for verification. Extensive comparison experiments demonstrate the superiority of SAMUS and AutoSAMUS over state-of-the-art task-specific and SAM-based foundation models. We believe the auto-prompted SAM-based model has the potential to become a new paradigm for end-to-end medical image segmentation and deserves more exploration. Code and data are available at https://github.com/xianlin7/SAMUS.
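The two mechanisms named in the abstract, cross-branch attention and adapters, can be illustrated with a minimal NumPy sketch. This is not the released SAMUS code (see the repository above for that). It assumes that tokens from the ViT branch act as queries over CNN-branch features, and that an adapter is a bottleneck projection with a residual connection whose up-projection starts at zero. All function names, shapes, and initializations are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_branch_attention(vit_tokens, cnn_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (assumed form): ViT tokens query CNN tokens."""
    q = vit_tokens @ w_q                       # (n_vit, d)
    k = cnn_tokens @ w_k                       # (n_cnn, d)
    v = cnn_tokens @ w_v                       # (n_cnn, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (n_vit, n_cnn)
    attn = softmax(scores, axis=-1)            # each row sums to 1
    # Residual connection: local CNN information is added to the ViT tokens.
    return vit_tokens + attn @ v, attn

def bottleneck_adapter(x, w_down, w_up):
    """Generic adapter (assumed form): down-project, ReLU, up-project, residual."""
    return x + np.maximum(x @ w_down, 0.0) @ w_up

# Hypothetical sizes: 16 ViT tokens, 64 CNN tokens, width 8, bottleneck 2.
rng = np.random.default_rng(0)
d, n_vit, n_cnn, r = 8, 16, 64, 2
vit_tokens = rng.standard_normal((n_vit, d))
cnn_tokens = rng.standard_normal((n_cnn, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

fused, attn = cross_branch_attention(vit_tokens, cnn_tokens, w_q, w_k, w_v)

w_down = rng.standard_normal((d, r)) * 0.1
w_up = np.zeros((r, d))  # zero-initialized, so the adapter starts as the identity
adapted = bottleneck_adapter(fused, w_down, w_up)
```

With a zero-initialized up-projection, the adapter initially leaves its input unchanged, so only the small bottleneck weights need to be trained while the pretrained backbone stays frozen. That is one common way adapters keep training cost low, and it is a design choice of this sketch rather than a detail stated in the abstract.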
Acknowledgement
This work was supported in part by the National Natural Science Foundation of China under Grant 62271220 and Grant 62202179, in part by the Natural Science Foundation of Hubei Province of China under Grant 2022CFB585, and in part by the Fundamental Research Funds for the Central Universities, HUST: 2024JYCXJJ032. The computation is supported by the HPC Platform of HUST.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, X., Xiang, Y., Yu, L., Yan, Z. (2024). Beyond Adapting SAM: Towards End-to-End Ultrasound Image Segmentation via Auto Prompting. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15008. Springer, Cham. https://doi.org/10.1007/978-3-031-72111-3_3
DOI: https://doi.org/10.1007/978-3-031-72111-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72110-6
Online ISBN: 978-3-031-72111-3
eBook Packages: Computer Science (R0)