brief-report

Tiny polyp detection from endoscopic video frames using vision transformers

Published: 04 April 2024

Abstract

Deep learning techniques can help doctors diagnose gastrointestinal polyps effectively. Currently, polyp detection on video frame sequences that contain a large amount of spurious noise suffers from low recall and low mean average precision. The mean average precision is also low when the polyp target in the video frame shows large scale variability. We therefore propose TPolyp, a tiny polyp detection method for endoscopic video frames based on Vision Transformers. The proposed method uses a cross-stage Swin Transformer as a multi-scale feature extractor to obtain deep feature representations of data samples, improves the bidirectional sampling feature pyramid, and integrates prediction heads with multiple channel self-attention mechanisms. Compared with convolutional neural networks, this approach focuses more on the feature information relevant to tiny object detection and retains relatively deeper semantic information. It also improves feature expression and discriminability without increasing computational complexity. Experimental results show that TPolyp improves detection accuracy by 7%, recall by 7.3%, and average accuracy by 7.5% compared to the YOLOv5 model, and detects tiny objects better in scenarios with blurry artifacts.
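
The abstract reports results in terms of recall and precision-style metrics. As a minimal sketch of how such detection metrics are commonly computed (this is not the paper's evaluation code), the Python below matches predicted boxes to ground-truth boxes by intersection over union (IoU) and derives precision and recall. The corner box format (x1, y1, x2, y2), the 0.5 IoU threshold, the greedy matching order, and the function names `iou` and `precision_recall` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    iou_thr: float = 0.5,            # assumed threshold, not taken from the paper
) -> Tuple[float, float]:
    """Greedy matching: predictions are visited in descending score order and
    each one claims the best still-unmatched ground-truth box at or above
    the IoU threshold."""
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            v = iou(box, gt)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions that matched no polyp
    fn = len(truths) - tp  # polyps that no prediction matched
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall
```

In this sketch a missed tiny polyp counts as a false negative and lowers recall, while a spurious detection caused by noise or blur counts as a false positive and lowers precision. Mean average precision would additionally sweep the confidence threshold and average the precision over recall levels.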

Published In

Pattern Analysis & Applications, Volume 27, Issue 2
Jun 2024
615 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 04 April 2024
Accepted: 18 February 2024
Received: 14 April 2023

Author Tags

  1. Polyp detection
  2. Endoscopic video analysis
  3. Tiny object detection
  4. Vision transformers
  5. Gastrointestinal diseases

Qualifiers

  • Brief-report
