DOI: 10.1145/3638837.3638879
research-article

Table Detection Method Based on Faster-RCNN and Window Attention

Published: 07 March 2024

Abstract

As an important carrier of information, tables offer high data-storage density, conciseness, and intuitiveness, and are widely used in office work and daily life. Owing to the complexity of table structures and the diversity of presentation formats, the automated processing of large numbers of image-based tables has long been a challenge in document recognition. This paper addresses the table detection task and proposes a detection algorithm that uses an improved window self-attention network to extract features from table images. Built on a two-stage object detection framework, it introduces local feature extraction blocks and inverted residual feed-forward network blocks, and designs a feature pyramid network over the backbone, improving the model's ability to learn the spatial layout of documents and thereby its detection performance. The effectiveness of the proposed method is verified through comparative experiments on publicly available datasets.
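
The abstract names three architectural ingredients: window self-attention for feature extraction, local feature extraction blocks, and inverted residual feed-forward network blocks inside the backbone of a two-stage detector. The paper's exact layer layout is not given here, so the following PyTorch sketch is only a rough illustration of how such a block could be assembled; the depthwise-convolution local branch, all dimensions, and all layer names are assumptions, not the authors' implementation.

# A minimal sketch (not the authors' code) of a block combining window
# self-attention, a depthwise-convolution "local feature extraction" step,
# and an inverted-residual feed-forward network. Sizes are illustrative.
import torch
import torch.nn as nn

class WindowAttentionBlock(nn.Module):
    def __init__(self, dim=96, window=7, heads=3, expand=4):
        super().__init__()
        self.window, self.heads = window, heads
        # Local feature extraction: 3x3 depthwise convolution over the feature map.
        self.local = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        self.norm1 = nn.LayerNorm(dim)
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        self.norm2 = nn.LayerNorm(dim)
        # Inverted-residual feed-forward: expand channels, depthwise conv, project back.
        self.ffn = nn.Sequential(
            nn.Conv2d(dim, dim * expand, 1),
            nn.Conv2d(dim * expand, dim * expand, 3, padding=1, groups=dim * expand),
            nn.GELU(),
            nn.Conv2d(dim * expand, dim, 1),
        )

    def forward(self, x):                  # x: (B, C, H, W); H, W divisible by window
        B, C, H, W = x.shape
        x = x + self.local(x)              # inject local context before attention
        w = self.window
        # Partition the map into non-overlapping w x w windows and attend within each.
        t = x.view(B, C, H // w, w, W // w, w).permute(0, 2, 4, 3, 5, 1)
        t = t.reshape(-1, w * w, C)        # (num_windows*B, tokens, C)
        h = self.norm1(t)
        q, k, v = self.qkv(h).chunk(3, dim=-1)
        split = lambda z: z.reshape(z.shape[0], -1, self.heads, C // self.heads).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)
        attn = (q @ k.transpose(-2, -1)) / (C // self.heads) ** 0.5
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(t.shape[0], -1, C)
        t = self.norm2(t + self.proj(out)) # residual around window attention
        # Restore the (B, C, H, W) layout and apply the inverted-residual FFN.
        t = t.view(B, H // w, W // w, w, w, C).permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)
        return t + self.ffn(t)             # residual around the feed-forward path

feat = torch.randn(1, 96, 56, 56)          # a backbone feature map
print(WindowAttentionBlock()(feat).shape)  # torch.Size([1, 96, 56, 56])

In a detector of the kind described, stacks of such blocks would form the feature-extraction stages whose multi-scale outputs feed a feature pyramid network and then the two-stage detection head.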

Published In

ICNCC '23: Proceedings of the 2023 12th International Conference on Networks, Communication and Computing
December 2023
310 pages
ISBN: 9798400709265
DOI: 10.1145/3638837
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Inverted residual feed-forward network
  2. Self-attention
  3. Table detection

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICNCC 2023
