Abstract
Optical Character Recognition (OCR) for Arabic text, both printed and handwritten, has been widely studied over the last two decades. Some commercial solutions have emerged with good recognition rates for printed text on white or uniform backgrounds, or for handwritten text with a limited vocabulary. In addition to being naturally cursive, Arabic script poses further challenges because its calligraphy gives rise to a wide variety of fonts and styles. In this work, recent advances in recurrent neural networks are explored for the recognition of Arabic text in identity documents captured in the wild. Unconstrained captures bring additional difficulties, as the text must first be localized before it can be recognized. Various pre-processing steps are introduced to overcome the difficulties related to the Arabic text itself and to the capture conditions. The presented approach outperforms existing solutions when evaluated on a private dataset and on the recent MIDV-2020 dataset.
References
Bulatovich, B.K., et al.: MIDV-2020: A comprehensive benchmark dataset for identity document analysis. In: 46.2 (2022), pp. 252–270
Al-Hashim, A.G., Mahmoud, S.A.: Benchmark database and GUI environment for printed Arabic text recognition research. In: WSEAS Trans. Inf. Sci. Appl. 7.4 (2010), pp. 587–597
Jaiem, F.K., et al.: Database for Arabic printed text recognition research. In: ICIAP. 2013, pp. 251–259
Maalej, R., Tagougui, N., Kherallah, M.: Online Arabic handwriting recognition with dropout applied in deep recurrent neural networks. In: 12th IAPR DAS. IEEE. 2016, pp. 417–421
El-Mahallawy, M.S.M.: A large scale HMM-based omni font-written OCR system for cursive scripts. Ph.D. thesis. Cairo University, 2008
Mahmoud, S.A., et al.: KHATT: an open Arabic offline handwritten text database. In: Pattern Recognition 47.3 (2014), pp. 1096–1112
Ngoc, M., Fabrizio, J., Géraud, T.: Saliency-based detection of identity documents captured by smartphones. In: 13th DAS. 2018, pp. 387–392
Ramdan, J., Omar, K., Faidzul, M.: A novel method to detect segmentation points of Arabic words using peaks and neural network. In: IJASEIT 7.2 (2017), pp. 625–631
de Sá Soares, A., das Neves Junior, R.B., Bezerra, B.L.D.: BID Dataset: a challenge dataset for document processing tasks. In: 18th Conference on Graphics, Patterns and Images. SBC. 2020, pp. 143–146
Shi, B., Bai, X., Yao, C.: An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. In: CoRR abs/1507.05717 (2015). arXiv: 1507.05717
Shivakumara, P., et al.: CNN-RNN based method for license plate recognition. In: CAAI Transactions on Intelligence Technology 3.3 (2018), pp. 169–175
Shu, Y., Xu, Y.: End-to-End Captcha Recognition Using Deep CNN-RNN Network. In: IEEE 3rd IMCEC. IEEE. 2019, pp. 54–58
Slimane, F., et al.: A new Arabic printed text image database and evaluation protocols. In: 2009 10th International Conference on Document Analysis and Recognition. IEEE. 2009, pp. 946–950
Suvarnam, B., Ch, V.S.: Combination of CNN-GRU model to recognize characters of a license plate number without segmentation. In: 5th ICACCS. IEEE. 2019, pp. 317–322
Yousef, M., Hussain, K.F., Mohammed, U.S.: Accurate, Data-Efficient, unconstrained text recognition with convolutional neural networks. In: CoRR abs/1812.11894 (2018). arXiv: 1812.11894
Yousfi, S.: Embedded Arabic text detection and recognition in videos. PhD thesis. Université de Lyon, 2016
Zeki, A., Zakaria, M., Liong, C.: The Use of Area-Voronoi diagram in separating Arabic text connected components. In: 3rd ACEA. 2007, pp. 251–288
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ghanmi, N., Belhakimi, A., Awal, AM. (2024). CNN-BLSTM Model for Arabic Text Recognition in Unconstrained Captured Identity Documents. In: Foresti, G.L., Fusiello, A., Hancock, E. (eds) Image Analysis and Processing - ICIAP 2023 Workshops. ICIAP 2023. Lecture Notes in Computer Science, vol 14365. Springer, Cham. https://doi.org/10.1007/978-3-031-51023-6_10
DOI: https://doi.org/10.1007/978-3-031-51023-6_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51022-9
Online ISBN: 978-3-031-51023-6
eBook Packages: Computer Science (R0)