DOI: 10.1145/3488933.3489008
Research article

Research on Text Recognition Method of Financial Documents

Published: 25 February 2022

Abstract

To address the problems of overlapping text boxes and missing text content when the Connectionist Text Proposal Network (CTPN) model is used to recognize financial documents, this paper improves the CTPN network structure with Inception modules and optimizes the feature extraction process. A confidence loss function is also introduced into the classification stage to optimize the selection of candidate anchor positions. Comparative experiments against the original CTPN model show that the improved network yields better text detection. For recognition, a DenseNet + Bi-directional Long Short-Term Memory (BLSTM) + CTC model is designed and implemented. The DenseNet backbone connects feature maps directly, enabling feature reuse and reducing both the number of parameters and the computation of each network layer; the BLSTM incorporates contextual information to assist prediction; and the CTC loss function handles text recognition when the input and output sequences are not aligned. Experiments show that the proposed model improves recognition performance over a traditional recognition system and the CRNN model.
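The CTC formulation mentioned above lets the recognizer emit one label per input frame (including a reserved blank symbol) and recovers the output text afterwards by collapsing repeated labels and dropping blanks, so no frame-level alignment between input image columns and output characters is required. A minimal sketch of that decoding step in plain Python — the alphabet and blank index here are illustrative, not taken from the paper:

```python
BLANK = 0  # index reserved for the CTC blank symbol (illustrative choice)

def ctc_greedy_decode(frame_labels):
    """Collapse a per-frame label sequence into an output label sequence:
    merge consecutive repeats, then drop blanks."""
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return decoded

# Per-frame argmax labels for a 9-frame input; 1='C', 2='A', 3='T' in a
# hypothetical alphabet. Repeats and blanks collapse to the 3-label output.
frames = [1, 1, 0, 2, 2, 2, 0, 3, 3]
print(ctc_greedy_decode(frames))  # [1, 2, 3] -> "CAT"
```

Note that a blank between two identical labels keeps them distinct (e.g. `[1, 0, 1]` decodes to `[1, 1]`), which is how CTC represents doubled characters such as "ll".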



Published In

AIPR '21: Proceedings of the 2021 4th International Conference on Artificial Intelligence and Pattern Recognition
September 2021, 715 pages
ISBN: 9781450384087
DOI: 10.1145/3488933

Publisher

Association for Computing Machinery, New York, NY, United States
Author Tags

1. Optical character recognition
2. Convolutional neural network
3. Deep learning
4. Text recognition

Conference

AIPR 2021
