
DAMFormer: Enhancing Polyp Segmentation Through Dual Attention Mechanism

  • Conference paper
Neural Information Processing (ICONIP 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14450)


Abstract

Polyp segmentation remains a challenging problem because polyps have no well-defined shape, color, or size. Traditional deep learning models based on convolutional neural networks (CNNs) struggle to generalize to unseen datasets. The Transformer architecture, by contrast, has shown promising potential for medical imaging problems by capturing long-range dependencies through self-attention. This paper introduces DAMFormer, a Transformer-based model that achieves high accuracy while remaining lightweight. DAMFormer uses a Transformer encoder to extract richer global information. The encoder outputs are fed into a ConvBlock and an Enhanced Dual Attention Module to capture high-frequency and low-frequency information, and these outputs are then processed by an Effective Feature Fusion module that efficiently combines global and local features. Our experiments use five standard benchmark datasets: Kvasir, CVC-ClinicDB, CVC-ColonDB, CVC-T, and ETIS-Larib.
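To make the dual-attention idea concrete, the following is a minimal sketch of a dual attention block in the spirit of the abstract: one branch re-weights channels ("what" to emphasize) and one re-weights spatial positions ("where"), and the two branches are summed. This is an illustration only, not the paper's Enhanced Dual Attention Module: the function names are hypothetical, the gates here are parameter-free pooling followed by a sigmoid, and the real module would use learned weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W). Global average pool per channel gives one
    # scalar gate per channel, squashed to (0, 1) by a sigmoid.
    gate = sigmoid(feat.mean(axis=(1, 2)))        # shape (C,)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    # Average across channels gives one gate per spatial position.
    gate = sigmoid(feat.mean(axis=0))             # shape (H, W)
    return feat * gate[None, :, :]

def dual_attention(feat):
    # Combine the channel-wise and spatial branches by summation,
    # so both kinds of emphasis contribute to the output.
    return channel_attention(feat) + spatial_attention(feat)

feat = np.random.randn(4, 8, 8)   # a toy (C, H, W) feature map
out = dual_attention(feat)
print(out.shape)                  # -> (4, 8, 8)
```

The output keeps the input's shape, so a block like this can be dropped between encoder and fusion stages without changing tensor dimensions.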

H.T. Quang and M. Nguyen—These authors contributed equally to this work.



Author information

Correspondence to Huy Trinh Quang.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Quang, H.T. et al. (2024). DAMFormer: Enhancing Polyp Segmentation Through Dual Attention Mechanism. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14450. Springer, Singapore. https://doi.org/10.1007/978-981-99-8070-3_8


  • DOI: https://doi.org/10.1007/978-981-99-8070-3_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8069-7

  • Online ISBN: 978-981-99-8070-3

  • eBook Packages: Computer Science (R0)
