DOI: 10.1145/3395027.3419603
short-paper

HTR-Flor++: A Handwritten Text Recognition System Based on a Pipeline of Optical and Language Models

Published: 29 September 2020

Abstract

Offline Handwritten Text Recognition (HTR) is a challenging computer vision task in which images are the only source of information. Several approaches to optical models have been developed, such as Hidden Markov Models (HMM) and recurrent Bidirectional/Multidimensional layers. The current state of the art combines deep learning techniques in Convolutional Recurrent Neural Networks (CRNN), whose recurrent layers still suffer from the vanishing gradient problem when processing very long texts. Moreover, high-performance models generally have millions of trainable parameters and a high computational cost. Recently, however, a new optical model architecture, Gated-CNN, has demonstrated improvements that complement CRNN modeling. In this work, we present a new small architecture for HTR (based on Gated-CNN) integrated with two language model steps, at the character and word levels, respectively. We evaluated 9 state-of-the-art approaches and validated the results using the public IAM dataset. The proposed model surpasses the results obtained by different approaches in the literature, reaching a Character Error Rate (CER) of 2.7% and a Word Error Rate (WER) of 5.6%, an improvement of 13% over the best results on the IAM dataset.
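
As a reading aid for the CER and WER figures quoted in the abstract, the following is a minimal sketch of how these metrics are commonly computed: the Levenshtein edit distance between the recognized text and the ground truth, divided by the ground-truth length, counted in characters for CER and in whitespace-separated words for WER. The function names `edit_distance` and `error_rate` and the sample strings are illustrative assumptions, and the paper's own evaluation protocol (for example, how lines are aggregated or how punctuation is tokenized) may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="char"):
    """Total edit distance divided by total reference length.

    unit="char" gives a CER-style score, unit="word" a WER-style score.
    Pooling over all lines is an assumption of this sketch.
    """
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        r = list(ref) if unit == "char" else ref.split()
        h = list(hyp) if unit == "char" else hyp.split()
        total_edits += edit_distance(r, h)
        total_len += len(r)
    return total_edits / total_len if total_len else 0.0


# Hypothetical ground truth and recognizer output (not from the IAM dataset).
refs = ["the quick brown fox"]
hyps = ["the quick brwn fox"]
cer = error_rate(refs, hyps, unit="char")
wer = error_rate(refs, hyps, unit="word")
print(f"CER={cer:.4f} WER={wer:.4f}")
```

In this toy case one missing character is a small fraction of the characters but invalidates a whole word, which is why WER is typically higher than CER for the same output.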

      Information & Contributors

      Information

      Published In

      DocEng '20: Proceedings of the ACM Symposium on Document Engineering 2020
      September 2020
      130 pages
      ISBN:9781450380003
      DOI:10.1145/3395027

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Deep Neural Networks
      2. Gated-CNN
      3. Language Models
      4. Offline Handwritten Text Recognition
      5. Optical Character Recognition

      Qualifiers

      • Short-paper
      • Research
      • Refereed limited

      Funding Sources

      • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
      • Conselho Nacional de Desenvolvimento Científico e Tecnológico

      Conference

      DocEng '20: ACM Symposium on Document Engineering 2020
      September 29 - October 1, 2020
      Virtual Event, CA, USA

      Acceptance Rates

      Overall Acceptance Rate 194 of 564 submissions, 34%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 28
      • Downloads (Last 6 weeks): 2
      Reflects downloads up to 11 Jan 2025

      Citations

      Cited By

      • (2024) Recognizing text lines in handwritten archival document images using octave convolutional and attention recurrent neural networks. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-19717-4. Online publication date: 9-Jul-2024
      • (2024) Improving Automatic Text Recognition with Language Models in the PyLaia Open-Source Library. Document Analysis and Recognition - ICDAR 2024, 387-404. DOI: 10.1007/978-3-031-70549-6_23. Online publication date: 30-Aug-2024
      • (2024) BRESSAY: A Brazilian Portuguese Dataset for Offline Handwritten Text Recognition. Document Analysis and Recognition - ICDAR 2024, 315-333. DOI: 10.1007/978-3-031-70536-6_19. Online publication date: 30-Aug-2024
      • (2024) The Learnable Typewriter: A Generative Approach to Text Analysis. Document Analysis and Recognition - ICDAR 2024, 297-314. DOI: 10.1007/978-3-031-70536-6_18. Online publication date: 3-Sep-2024
      • (2023) An end-to-end pipeline for historical censuses processing. International Journal on Document Analysis and Recognition (IJDAR) 26:4, 419-432. DOI: 10.1007/s10032-023-00428-9. Online publication date: 17-Mar-2023
      • (2022) Refocus attention span networks for handwriting line recognition. International Journal on Document Analysis and Recognition (IJDAR). DOI: 10.1007/s10032-022-00422-7. Online publication date: 25-Dec-2022
      • (2022) Active Transfer Learning for Handwriting Recognition. Frontiers in Handwriting Recognition, 245-258. DOI: 10.1007/978-3-031-21648-0_17. Online publication date: 4-Dec-2022
      • (2022) Case Study of Few-Shot Learning in Text Recognition Models. Web Information Systems Engineering – WISE 2021, 394-401. DOI: 10.1007/978-3-030-91560-5_29. Online publication date: 1-Jan-2022
      • (2021) HTR for Greek Historical Handwritten Documents. Journal of Imaging 7:12, 260. DOI: 10.3390/jimaging7120260. Online publication date: 2-Dec-2021
      • (2021) Boosting Offline Handwritten Text Recognition in Historical Documents With Few Labeled Lines. IEEE Access 9, 76674-76688. DOI: 10.1109/ACCESS.2021.3082689. Online publication date: 2021
