DOI: 10.1145/3488933.3489008
Research article

Research on Text Recognition Method of Financial Documents

Published: 25 February 2022

Abstract

To address the problems of overlapping text boxes and missing text content when the Connectionist Text Proposal Network (CTPN) model is used to recognize financial documents, this paper improves the CTPN network structure with Inception modules and optimizes the feature extraction process. A confidence loss function is also introduced into the classification stage to optimize the selection of candidate anchor positions. Comparative experiments against the original CTPN model show that the improved network yields better text detection. For recognition, a DenseNet + Bi-directional Long Short-Term Memory (BLSTM) + CTC model is designed and implemented. The DenseNet backbone connects feature maps directly, enabling feature reuse and reducing both the number of parameters and the computation of each network layer; the BLSTM incorporates contextual information to assist prediction; and the CTC loss function handles text recognition when the input and output sequences are not aligned. Experiments show that the proposed model improves recognition performance over a traditional recognition system and the CRNN model.
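The CTC formulation mentioned above lets the recognizer emit one label per input frame (including a reserved blank symbol) and recovers the output text afterwards by collapsing repeated labels and dropping blanks, so no frame-level alignment between input image columns and output characters is required. A minimal sketch of that decoding step in plain Python — the alphabet and blank index here are illustrative, not taken from the paper:

```python
BLANK = 0  # index reserved for the CTC blank symbol (illustrative choice)

def ctc_greedy_decode(frame_labels):
    """Collapse a per-frame label sequence into an output label sequence:
    merge consecutive repeats, then drop blanks."""
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return decoded

# Per-frame argmax labels for a 9-frame input; 1='C', 2='A', 3='T' in a
# hypothetical alphabet. Repeats and blanks collapse to the 3-label output.
frames = [1, 1, 0, 2, 2, 2, 0, 3, 3]
print(ctc_greedy_decode(frames))  # [1, 2, 3] -> "CAT"
```

Note that a blank between two identical labels keeps them distinct (e.g. `[1, 0, 1]` decodes to `[1, 1]`), which is how CTC represents doubled characters such as "ll".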



Published In

AIPR '21: Proceedings of the 2021 4th International Conference on Artificial Intelligence and Pattern Recognition
September 2021, 715 pages
ISBN: 9781450384087
DOI: 10.1145/3488933

Publisher

Association for Computing Machinery, New York, NY, United States
Author Tags

1. Optical character recognition
2. Convolutional neural network
3. Deep learning
4. Text recognition

Conference

AIPR 2021
