Swin-HAUnet: A Swin-Hierarchical Attention Unet For Enhanced Medical Image Segmentation

Jiarong Chen, Xuyang Zhang, Rongwen Li & Peng Zhou

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15044)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Medical image segmentation plays a pivotal role in computer-aided diagnosis and treatment planning. Traditional segmentation approaches often struggle to balance global and local context, either capturing overall anatomical structures or focusing on minute details, but not both. This paper introduces the Swin-Hierarchical Attention Unet (Swin-HAUnet), which resolves this trade-off by integrating global contextual information with local feature enhancement. The proposed architecture is a hybrid: a transformer-based encoder processes wide-ranging contextual information, and an attention-enhanced decoder refines the segmentation of fine, intricate anatomical features. We performed experiments on two publicly available datasets, the Synapse multi-organ segmentation CT dataset and the UW-Madison dataset. Swin-HAUnet achieves a Dice similarity coefficient of 79.91% on the Synapse dataset, an increase of 1.26% over the baseline model. These results underscore the model’s effectiveness in complex segmentation tasks and the importance of attention mechanisms in medical image analysis.
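
For readers unfamiliar with the metric reported above, the following is a minimal sketch of the Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), computed per label on flattened label masks and then averaged over labels. It is not the authors' evaluation code. The function names `dice_coefficient` and `mean_dice`, the toy masks, and the convention of returning 1.0 when a label is absent from both masks are assumptions made for illustration only.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for one label over flattened masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # |A|: pixels predicted as `label`; |B|: pixels annotated as `label`
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    # |A ∩ B|: pixels where prediction and annotation both equal `label`
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denom = pred_count + target_count
    if denom == 0:
        # Assumed convention: a label absent from both masks counts as a perfect match.
        return 1.0
    return 2.0 * overlap / denom


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Average the per-label Dice over the given foreground labels."""
    return sum(dice_coefficient(pred, target, lab) for lab in labels) / len(labels)


# Toy flattened masks with background 0 and two hypothetical organ labels 1 and 2.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]

dice_1 = dice_coefficient(pred, target, 1)  # 2*1 / (2 + 1)
dice_2 = dice_coefficient(pred, target, 2)  # 2*2 / (2 + 3)
avg = mean_dice(pred, target, [1, 2])
```

In this toy case label 1 scores 2/3 and label 2 scores 0.8, so the mean over the two labels is about 0.733. Multi-organ benchmarks typically report such a mean as a percentage.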

Jiarong Chen, Xuyang Zhang, and Rongwen Li are co-first authors.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (grant 62176001) and the Natural Science Project of the Anhui Provincial Education Department (grant 2023AH030004).

Author information

Authors and Affiliations

Anhui Provincial International Joint Research Center for Advanced Technology in Medical Imaging, School of Computer Science and Technology, Anhui University, Hefei, China
Jiarong Chen, Xuyang Zhang, Rongwen Li & Peng Zhou

Corresponding author

Correspondence to Peng Zhou.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Zhouchen Lin
Nankai University, Tianjin, China
Ming-Ming Cheng
Chinese Academy of Sciences, Beijing, China
Ran He
Xinjiang University, Urumqi, Xinjiang, China
Kurban Ubul
Xinjiang University, Urumqi, China
Wushouer Silamu
Peking University, Beijing, China
Hongbin Zha
Tsinghua University, Beijing, China
Jie Zhou
Chinese Academy of Sciences, Beijing, China
Cheng-Lin Liu

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 37 KB)

About this paper

Cite this paper

Chen, J., Zhang, X., Li, R., Zhou, P. (2025). Swin-HAUnet: A Swin-Hierarchical Attention Unet For Enhanced Medical Image Segmentation. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2024. Lecture Notes in Computer Science, vol 15044. Springer, Singapore. https://doi.org/10.1007/978-981-97-8496-7_26

DOI: https://doi.org/10.1007/978-981-97-8496-7_26
Published: 03 November 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-8495-0
Online ISBN: 978-981-97-8496-7
eBook Packages: Computer Science, Computer Science (R0)
