Abstract
Optical character recognition of English text, whether printed or handwritten, is one of the most important research topics. Although excellent results have been achieved for English text, comparable research on Arabic text is lacking, owing to the nature of the Arabic alphabet and the multiple forms that the same letter can take. Arabic handwritten character recognition (AHCR) systems face several issues and challenges, from finding a suitable public Arabic handwritten text dataset, through segmentation and feature extraction, to recognition and classification. This paper has two objectives. First, a large and complex Arabic handwritten character dataset (HMBD) is presented for the training, testing, and validation phases, and its collection, preparation, cleaning, and preprocessing are discussed. Second, we introduce a deep learning (DL) system with two convolutional neural network (CNN) architectures (named HMB1 and HMB2) that applies optimization, regularization, and dropout techniques. This system can serve as a baseline for future research on handwritten Arabic text. Several performance metrics were calculated, including accuracy, recall, precision, and F1. Sixteen experiments were run on the described system using HMBD and two other datasets, CMATER and AIA9k. The experimental results were recorded and compared to study the effects of weight initializers, optimizers, data augmentation, and regularization on overfitting and accuracy. The He Uniform weight initializer and the AdaDelta optimizer yielded the highest accuracies, and data augmentation improved accuracy. HMB1 reported a testing accuracy of 98.4% with 865,840 records using augmentation on HMBD. The CMATER and AIA9k datasets were used to validate generalization; with data augmentation applied, the best testing accuracies were 100% and 99.0%, respectively.
A cross-over validation between the described architectures and a previous state-of-the-art architecture and dataset was performed in two phases. First, the previous control architecture could not generalize to the dataset presented in the current study. Second, the architectures described in this study generalized to the control dataset with higher accuracies (97.3% and 96.8% for HMB1 and HMB2, respectively) than the accuracy reported in the selected control study.
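The four metrics named in the abstract (accuracy, recall, precision, and F1) can be illustrated with a minimal sketch for a multi-class character classifier. The abstract does not state how per-class scores were averaged, so the macro-averaging used here is an assumption, and the function name `classification_metrics` and the toy letter labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging (an unweighted mean over classes) is an assumption;
    the source does not specify the averaging scheme.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with three hypothetical letter classes.
y_true = ["alef", "alef", "baa", "baa", "taa", "taa"]
y_pred = ["alef", "baa", "baa", "baa", "taa", "alef"]
print(classification_metrics(y_true, y_pred))
```

Macro-averaging weights every letter class equally, which may be a reasonable choice when classes are roughly balanced; a weighted average would be an alternative if class sizes differ substantially.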
Abbreviations
- AdaDelta: An adaptive learning rate method
- Adam: A method for stochastic optimization
- AHCR: Arabic handwritten character recognition
- AHCR-DLS: Arabic handwritten character recognition deep learning system
- CNN: Convolutional neural network
- DL: Deep learning
- ReLU: Rectified linear unit
- SGD: Stochastic gradient descent
- UN: United Nations
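To make the AdaDelta entry above more concrete, the following is a minimal sketch of the standard AdaDelta update rule (running averages of squared gradients and squared updates, with no global learning rate) applied to a single scalar parameter. The function names, the quadratic objective, and the hyperparameter values `rho=0.95` and `eps=1e-6` are illustrative assumptions and are not settings reported by the paper.

```python
import math


def adadelta_step(x, grad, state, rho=0.95, eps=1e-6):
    """Apply one AdaDelta update to the scalar parameter x.

    state holds two running averages: "eg2" (squared gradients)
    and "edx2" (squared parameter updates).
    """
    # Accumulate the decaying average of squared gradients.
    state["eg2"] = rho * state["eg2"] + (1 - rho) * grad ** 2
    # Scale the gradient by the ratio of the two RMS values.
    dx = -math.sqrt(state["edx2"] + eps) / math.sqrt(state["eg2"] + eps) * grad
    # Accumulate the decaying average of squared updates.
    state["edx2"] = rho * state["edx2"] + (1 - rho) * dx ** 2
    return x + dx


def minimize_quadratic(x0=1.0, steps=50):
    """Run AdaDelta on the toy objective f(x) = x**2 (hypothetical example)."""
    state = {"eg2": 0.0, "edx2": 0.0}
    x = x0
    for _ in range(steps):
        grad = 2.0 * x  # derivative of x**2
        x = adadelta_step(x, grad, state)
    return x


print(minimize_quadratic())
```

Because the step size is derived from the ratio of the two running averages, the early updates are small and grow gradually, so the toy run above only moves part of the way toward the minimum in 50 steps.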
Acknowledgements
We would like to express our gratitude and appreciation to Prof. Dr. Magdy H. Balaha, who provided guidance and assistance in this research work, and to the Mansoura University volunteers who cooperated in the dataset construction.
Cite this article
Balaha, H.M., Ali, H.A., Saraya, M. et al. A new Arabic handwritten character recognition deep learning system (AHCR-DLS). Neural Comput & Applic 33, 6325–6367 (2021). https://doi.org/10.1007/s00521-020-05397-2
DOI: https://doi.org/10.1007/s00521-020-05397-2