Analysis of Recent Deep Learning Techniques for Arabic Handwritten-Text OCR and Post-OCR Correction
List of Figures
- Figure 1: Key OCR steps.
- Figure 2: Binarization exemplified.
- Figure 3: Different levels of segmentation, from words to isolated characters.
- Figure 4: Example of tilting.
- Figure 5: Elucidation of how sub-words are formed in Arabic.
- Figure 6: Richness and variation in Arabic fonts.
- Figure 7: Simplified CNN with two Conv layers.
- Figure 8: The general framework of GANs.
- Figure 9: Taxonomy of the analyzed works.
Abstract
1. Introduction
1.1. Significance: Why and How Is This Work Different?
1.2. Motivation and Why This Is an Important and Interesting Work
1.3. Contribution
- For the first time, a thorough review of 83 papers is presented, mainly from 2020 to 2023, to build a holistic review of Arabic OCR using deep learning and Arabic text correction. Moreover, this review elucidates the challenges to overcome for Arabic OCR. In addition, metrics for evaluating the accuracy of OCR systems are presented. Overall, this work presents the challenges, techniques, recent work analysis, and future directions for the use of deep learning for Arabic OCR and text correction.
- Unprecedented in scope, a critical analysis and deep discussion are presented on 25 recent works, from 2020 to 2023, that apply deep learning for Arabic OCR and Arabic text correction. The findings determine the current research frontier, trends, and limitations. To the knowledge of the author, no prior work has filled this gap in the recent time frame defined above.
- Current challenges and future trends are extrapolated from the gaps that need to be filled in the works analyzed in this study and the possibilities of leveraging applications in other OCR systems for use in Arabic OCR, including datasets, models, and applications.
1.4. Survey Methodology
1.4.1. Scope and Overview
1.4.2. Research Questions
- RQ0: What is OCR? What is deep learning? What are the steps and challenges in Arabic OCR? How is OCR accuracy evaluated?
- RQ1: What are the deep-learning techniques used in Arabic OCR during the period 2020–2023? Which are the notable works in the literature?
- RQ2: What are the deep-learning techniques used in Arabic text correction for the period 2020–2023? Which are the notable works in the literature?
- RQ3: What were the limitations and challenges involved in the use of deep-learning techniques in Arabic OCR and text correction for the period 2020–2023?
- RQ4: What are the findings, recommendations, and future trends for deep-learning techniques to be used in Arabic OCR and text correction?
1.4.3. Sources
1.4.4. Keywords
1.4.5. Inclusion–Exclusion
- The work must be written in English;
- The work must be published as a research article, whether in journals or conferences, or as a thesis. Preprints are included;
- In the analysis part, the work must apply deep-learning techniques;
- In the analysis part, the work must include Arabic OCR or text correction;
- The work should not be a review;
- The work must be from the period from 2020 to 2023;
- The work must not be duplicated.
1.4.6. Screening and Selection
1.5. Structure
2. OCR Steps [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36]
- Image acquisition: Obtain the image captured from an external source through a device such as a camera or a scanner;
- Pre-processing: Techniques that are applied to the acquired images in order to enhance their quality. For example, removing the noise or the skew, thinning, and binarization, such as in Figure 2. In addition, sometimes, lossy filtering is performed as a preprocessing step to create a graph representation of characters by discarding non-isomorphic pairs based on simple graph properties such as the number of edges and degree distribution before moving to fine-grained filtering using deep neural networks;
- Segmentation: The splitting of the image into the characters, words, or lines constituting it. Basic techniques include component analysis and projection profiles. Moreover, histograms can be utilized to study segmentation. An example is shown in Figure 3. In addition, thinning and segmentation are performed after making the character rotation invariant;
- Feature extraction: The extraction of features that help in learning the representation. The features can be geometrical, such as points and loops, or statistical, such as moments. The most closely related features that are representative and not redundant should be chosen after the extraction;
- Classification: Used to predict the mapping between the character and its class. Different classification techniques can be used and, recently, deep-learning-based techniques have used models that are based on deep neural networks;
- Post-processing: Further corrections of the output of the OCR to reduce the number of errors. Different techniques can be used for post-processing. Some are based on devising a dictionary to check spellings; others are based on utilizing probabilistic models, such as Markov chains and N-grams. With the advancement in natural language processing, in the case of word misclassification, using dictionaries to identify alternatives can provide multiple options. To determine the word that is most likely to be accurate, the context at the sentence level of the text needs to be explored, which requires the techniques of NLP and semantic tagging. Currently, text-embedding models and techniques, such as BERT, and their variations can be utilized for this step.
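The dictionary-lookup stage of post-processing can be sketched with the standard library's difflib, which ranks candidate words by string similarity and, as described above, can return multiple options for a misrecognized word. The vocabulary and similarity cutoff below are illustrative assumptions, not taken from any surveyed work:

```python
import difflib

def candidates(word, dictionary, cutoff=0.75):
    """Dictionary entries ranked by similarity to the OCR output word."""
    return difflib.get_close_matches(word, dictionary, n=3, cutoff=cutoff)

def correct(word, dictionary):
    """Best candidate if one exists, otherwise keep the word unchanged."""
    found = candidates(word, dictionary)
    return found[0] if found else word

vocabulary = ["recognition", "segmentation", "classification"]
print(correct("recognitiom", vocabulary))  # -> "recognition"
```

Choosing among the returned candidates by sentence-level context is exactly where the probabilistic and embedding-based techniques mentioned above (N-grams, BERT) come in.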
3. Arabic OCR Challenges
3.1. OCR Challenges [37,38,39]
- Scene complexity: The segregation of text from non-text can be difficult in cases in which the rate of noise is high or when there are strokes similar to text, but which are part of the background and are not texts themselves;
- Conditions of uneven lighting: Due to the presence of shadows, pictures taken of any document can have the problem of uneven light distribution, in which part of the image has more light while the other is quite dark. Thus, scanned images are preferred, since they eliminate such unevenness;
- Skewness (rotation): The text lines are skewed and deviate from the intended orientation. This can be a consequence of manually taking pictures;
- Blurring and degradation: The image can be of low quality. A remedy for this problem is to sharpen the characters;
- Aspect ratios: Different aspect ratios can require different search procedures to detect the text;
- Tilting (perspective distortion): The page itself may not appear in a correct, parallel perspective. Photographs of texts taken by non-experts without a scanner can account for such non-parallelism, making the task difficult. A simple solution is to use scanners or to keep the camera parallel to the page when photographing it. An example is shown in Figure 4;
- Warping: This problem is related to the text itself. The page may be perfectly parallel at the time the picture is taken; however, the text itself is written in a warped manner, which is not usually the case in ordinary writing;
- Fonts: Printed fonts and styles vary from those that are handwritten. The same writer can write the same passage in the same space in a different way each time. On the other hand, printed text is easier because the differences tend to be less frequent from one print to another;
- Multilingual and multi-font environments: Different languages feature different complexities. Some languages use cursive scripts with connected characters, as in the Arabic language. Moreover, recent individual pages can contain multiple languages, while old writings can have multiple fonts within the same page if they are written by different writers.
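The skewness challenge above is classically handled with the projection profiles mentioned in Section 2: shear the foreground pixels by candidate angles and keep the angle whose row profile is most concentrated. A minimal sketch, assuming (illustratively) that the page is given as a list of foreground pixel coordinates and that the skew lies within ±10 degrees:

```python
import math

def profile_energy(pixels, angle_deg):
    """Sum of squared row-profile counts after shearing by angle_deg.

    Higher energy means the foreground pixels pile into fewer rows,
    i.e. the shear has undone the skew."""
    t = math.tan(math.radians(angle_deg))
    rows = {}
    for x, y in pixels:
        r = round(y + x * t)  # sheared row index
        rows[r] = rows.get(r, 0) + 1
    return sum(c * c for c in rows.values())

def estimate_skew(pixels, angles=range(-10, 11)):
    """Angle (degrees) whose shear best aligns the text into rows."""
    return max(angles, key=lambda a: profile_energy(pixels, a))
```

For a text line rising at about 5 degrees, the estimator returns roughly -5, the correction to apply before segmentation.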
3.2. Arabic OCR Challenges [19,20,21]
- Arabic is written from right to left, while most models are designed for the opposite direction;
- Arabic consists of a minimum of 28 characters, and each character has a minimum of two and up to four shapes, depending on the position. Fortunately, no upper- or lower-case characters are used in Arabic;
- The position determines the shape of a character. Therefore, a character can have different shapes depending on its position: alone, at the start, in the middle, or at the end. For example, the letter Meem has different shapes according to its position;
- Ijam: Dots can be written above or below characters, and they differentiate between one character and another, such as Baa and Tha;
- Tashkeel: Marks such as Tanween and Harakat can change the context of a piece of writing. Examples applied to the letter Raa are illustrated in Table 1;
- Hamza, in various locations, such as above or below Alif, above Waw, after or above Yaa, and in the middle or end of a word;
- Loops, such as those in Saad and Taa, and Meem and Ain, where characters are difficult to distinguish due to the similarity of their skeletons;
- Some characters, such as Raa, Dal, and Waw, do not connect to the following character and thus split a word into sub-words. Figure 5 presents an example with different kinds of separation;
- The total number of shapes is at least 81. Some letters share the same shape but differ in terms of dots;
- Without dots, there are at least 52 letter shapes;
- The structural characteristics of the characters, given their high similarity in shape, can be perplexing. Moreover, a slight modification can turn one character into another;
- Arabic is cursively written. Therefore, connections are formed when two characters are connected;
- The presence of other diacritics, such as Shadda, and punctuation, such as full stops, can be confused with the dots that are used in Arabic;
- Most models suffer from overfitting, since generalization is rarely addressed, partly because of the lack of annotated datasets and partly because complex models are trained on small or unbalanced sets;
- The available datasets have the issue of distorted and unclear samples;
- The majority of datasets suffer from class imbalance, in which samples are unevenly distributed, which can degrade the model’s performance by causing overfitting to the over-represented samples;
- The writing style can vary from one author to another. Figure 6 shows different types of writing style.
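Several of the points above, particularly the positional shapes and the role of dots (ijam), are directly visible in Unicode: a single letter such as Meem (U+0645) has dedicated isolated, final, initial, and medial presentation forms, while Baa (U+0628, named Beh in Unicode) and Tha (U+062B, Theh) differ only in dots. A quick check with Python's standard unicodedata module:

```python
import unicodedata

# Meem (U+0645) and its four contextual presentation forms (U+FEE1-U+FEE4)
for ch in "\u0645\uFEE1\uFEE2\uFEE3\uFEE4":
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# Beh (Baa) and Theh (Tha) share a skeleton and differ only in their dots
print(unicodedata.name("\u0628"), "vs", unicodedata.name("\u062B"))
```

Rendering engines normally select the contextual form automatically, which is precisely why OCR systems must recognize up to four visual shapes per letter.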
4. Deep-Learning Techniques
4.1. Convolutional Neural Network (CNN) [43]
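The defining operation of a CNN is the convolution, a dot product slid over the image, typically followed by a non-linearity such as ReLU. A dependency-free sketch of one valid convolution, using an illustrative vertical-edge kernel and toy image:

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (no padding, stride 1) of a single-channel image."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(ow)]
            for y in range(oh)]

def relu(fmap):
    """Element-wise rectified linear activation."""
    return [[max(0, v) for v in row] for row in fmap]

# Vertical-edge kernel applied to a 4x4 image with a dark/bright split
image = [[0, 0, 9, 9]] * 4
kernel = [[-1, 1], [-1, 1]]
feature_map = relu(conv2d(image, kernel))  # peaks where the edge lies
```

Stacking such layers, as in the two-Conv-layer architecture of Figure 7, lets later filters respond to compositions of the edges detected by earlier ones.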
4.2. Long Short-Term Memory (LSTM) [44]
4.3. Connectionist Temporal Classification (CTC) [44]
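CTC aligns frame-wise predictions to a transcript by introducing a blank symbol; best-path (greedy) decoding then takes the argmax label per frame, merges adjacent repeats, and drops blanks. A minimal sketch, with the blank placed at index 0 as an illustrative convention:

```python
BLANK = 0  # index reserved for the CTC blank symbol (illustrative choice)

def ctc_greedy_decode(frame_probs, alphabet):
    """Argmax label per frame, merge adjacent repeats, remove blanks."""
    path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    out, prev = [], BLANK
    for label in path:
        if label != prev and label != BLANK:
            out.append(alphabet[label - 1])  # alphabet excludes the blank
        prev = label
    return "".join(out)

# Frames predicting: a, a, blank, b, b  -> "ab"
probs = [
    [0.10, 0.80, 0.10],  # 'a'
    [0.10, 0.70, 0.20],  # 'a' (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.20, 0.70],  # 'b'
    [0.20, 0.10, 0.70],  # 'b' (repeat, merged)
]
print(ctc_greedy_decode(probs, "ab"))  # -> "ab"
```

The blank is what lets CTC distinguish a genuinely doubled letter (a, blank, a -> "aa") from one letter spread over several frames, which matters for cursive scripts where character widths vary widely.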
4.4. Transformer [45]
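The core of the Transformer is scaled dot-product attention, softmax(QK^T / sqrt(d))V [45]. A dependency-free sketch over lists of row vectors:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V."""
    d = len(K[0])  # key dimension
    output = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)  # one attention distribution per query
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])
    return output
```

With identical keys the weights are uniform, so each output row is simply the mean of the value rows; distinct keys are what let each query attend selectively to relevant positions.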
4.5. Generative Adversarial Networks (GANs) [46]
5. Evaluation Metrics
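The two standard OCR accuracy metrics, the character error rate (CER) and word error rate (WER), are Levenshtein edit distances normalized by the reference length, computed over characters and over whitespace-split words, respectively. A minimal sketch:

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def cer(reference, hypothesis):
    """Character error rate: character edits / reference length."""
    return levenshtein(reference, hypothesis) / len(reference)

def wer(reference, hypothesis):
    """Word error rate: word edits / number of reference words."""
    return levenshtein(reference.split(), hypothesis.split()) / len(reference.split())
```

Both rates can exceed 1.0 when the hypothesis is much longer than the reference, which is why accuracy figures reported as 1 - CER should be read with care.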
6. Analysis of Recent Works
6.1. OCR Deep-Learning Works
6.1.1. CNN-LSTM-CTC-Based Works [47,48,49,50,51,52,53]
6.1.2. Transformer-Based Works [55,56]
6.1.3. GANs-Based Works [57,58,59,60]
6.2. Post-OCR Correction Works [61,62,63,64,65,66,67,68,69,70,71]
7. Discussion
7.1. OCR Deep-Learning Works
7.1.1. CNN-LSTM-CTC-Based Works [47,48,49,50,51,52,53]
- CNN-LSTM-CTC works are tested on test sets that have no fair split in proportion to the training sets or are tested on sets that are not sufficiently challenging. Therefore, high accuracies could be a sign of overfitting. This issue can be fixed by training with an appropriate proportion between the training and testing splits. Moreover, this indicates the need for more challenging datasets;
- CNN-LSTM-CTC-based works can take a long time to train, especially if a BiLSTM is employed. Moreover, they achieve low word-recognition rates in challenging settings;
- CNN-LSTM-CTC is currently the de facto architecture because it handles spatial relations, long textual dependencies, and text alignment.
7.1.2. Transformer-Based Works [55,56]
- Transformer-based works are limited by computational resources and take significantly longer to train. Moreover, they require more annotated training data than other architectures;
- Transformer architecture has the potential to provide robust results. Nevertheless, latency is produced during the search;
- Transformer-based works have a higher number of parameters than other architectures;
- Transformer-based works would be preferable if more models with compatible pretrained weights were developed, avoiding the random initialization of weights, which is suggested to be responsible for lowering the recognition rate.
7.1.3. GANs-Based Works [57,58,59,60]
- GAN-based works are sensitive to balanced/unbalanced datasets. This is because GANs are usually employed to augment data before they are classified using another classifier. The use of GANs is a method to combat the limitations of Arabic datasets. However, the overall performance becomes dependent on the application of another model for classification. In essence, this adds layers of complexity to the model and does not guarantee high accuracy on challenging sets;
- GANs suffer when generating letters with dots, as they can be considered noise;
- The accuracy of GAN-based works is mostly reported inappropriately.
- CNN-LSTM-CTC is the currently preferred architecture for Arabic OCR. This is because the combination handles spatial relations, long text dependencies, and their alignments, and it does not require as much data as Transformers and GANs;
- Transformer is the most promising architecture for future works when more annotated and domain-specific data are made available;
- GANs are usually used to augment datasets, but they are sensitive to the dataset’s state of balance. In addition, GANs must be combined with a classifier, which contributes to enhancing the accuracy on one hand and the model’s complexity on the other;
- There is a crucial need for more public handwritten Arabic OCR datasets in terms of quantity, quality, scope, and font diversity in order for OCR research to progress.
7.2. Post-OCR-Correction Works [61,62,63,64,65,66,67,68,69,70,71]
- Uneven distribution and a lack of error types are common issues in text-correction datasets;
- Text-correction models suffer when faced with challenging examples, such as the presence of dialect words and the correction of punctuation errors;
- The problem of the ineffective testing of models, which exists in OCR works, extends to text-correction works as well;
- More complex models, such as DBNs, take longer to train;
- There is a need for more text-correction datasets that are context-aware;
- The effectiveness of applying embedding techniques, such as FastText and BERT, to correct text has been shown; however, this approach is still in its infancy. It has potential and should be explored further in Arabic.
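The context-aware candidate ranking that embedding-based correctors refine with FastText and BERT can be illustrated in its simplest probabilistic form with bigram counts; the toy corpus and candidate words below are illustrative assumptions:

```python
from collections import Counter

def bigram_counts(corpus):
    """Count adjacent word pairs in a tokenized corpus."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        counts.update(zip(words, words[1:]))
    return counts

def rank_by_context(prev_word, candidates, counts):
    """Pick the candidate most often seen after prev_word in the corpus."""
    return max(candidates, key=lambda c: counts[(prev_word, c)])

corpus = [
    "deep learning improves text recognition",
    "deep learning models need data",
    "handwritten text recognition is hard",
]
counts = bigram_counts(corpus)
# OCR proposed "leaming"; dictionary lookup suggested two candidates:
best = rank_by_context("deep", ["learning", "leaning"], counts)
print(best)  # -> "learning"
```

Embedding-based models replace these sparse counts with dense similarity scores, which is what allows them to rank candidates never seen verbatim in the training corpus.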
7.3. Robust OCR and Text-Correction Techniques
8. Challenges and Future Trends
8.1. Arabic-OCR-Dataset Challenges
- Most Arabic OCR datasets are simple, made for simple use cases, unaware of context, and domain-agnostic. As problems become more complex, the need arises for more sophisticated, high-quality, and challenging datasets that are domain-specific; for example, datasets of common handwriting are not sufficient when there is a need to recognize ancient writings;
- Annotation is a problem faced by any user working on any Arabic OCR due to the lack of Arabic datasets. This can be overcome either by producing more advanced Arabic datasets or by developing solutions to accelerate the Arabic-dataset-annotation process;
- There is a need for more challenging multilingual datasets, including the Arabic language, that can be used in challenging cross-lingual and cross-cultural scenarios as a step towards general artificial intelligence;
- Multi-font support is a property that current Arabic OCR datasets lack. Considerable efforts should be centered on the gathering or creation of datasets with multiple authentic types of handwritten font, preferably through collaboration with calligraphists, to generalize the recognition ability beyond the commonly used fonts. This step will help to cover a large spectrum of fonts, enabling the era-wise geospatial–temporal typographical analysis of Arabic fonts and scripts.
8.2. Arabic OCR Models Challenges
- The investigation of the effect of applying different Arabic text embeddings, such as BERT and FastText, is still in its infancy. This is a promising direction for researchers to follow;
- Transformer-based architectures have shown promising results. However, it remains a challenge for practitioners to develop solutions using these architectures unless ways to deal with the lack of Arabic datasets and computational resources are developed;
- Experiments have been conducted with N-shot approaches in complex languages that feature strokes, such as Chinese. Nevertheless, these approaches have not yet been explored in Arabic. It is postulated that they may prove effective if transferred to a rich language such as Arabic [72];
- It remains an open challenge for researchers to develop models that are able to recognize handwritten Arabic text in cross-lingual and multi-font environments [73].
8.3. Potential Future Arabic OCR Applications
- Education: Arabic OCR can be used by educators to automate the scoring of handwritten essays [74]. Moreover, calligraphists can apply this technique in calligraphy classes to assess handwriting calligraphy quantitatively instead of through time-consuming qualitative assessment [75]. In addition, universities, colleges, and schools can develop solutions to automate their admissions processes and the paperwork required during registration [76];
- Legal: Arabic legal sectors frequently use handwritten paperwork, as do courts during trial sessions. The harnessing of OCR capabilities could accelerate these processes and decrease the burden of documenting handwritten text in typed form [77];
- Governmental and healthcare sectors: The conversion to electronic records of documents that are not computer-based or that are usually filled out by hand can take tremendous amounts of time. This can be overcome by developing methodologies for applying OCR solutions in such use cases for the purpose of effective digitization;
- Studies: Many fields of study focus on the handwritten form of writing. For instance, typography involves the anatomical design of the font, which can be studied in depth and can further be used by deep-learning researchers to generate handwritten Arabic that is as close as possible to human handwriting [78]. Furthermore, historical studies can gain enormous benefits from OCR in converting ancient texts to digital form. However, this necessarily requires the development of Arabic OCR solutions that can appropriately deal with complex font conditions, as there is a lack of investigations on this topic [79];
- Forensics and IoT: In forensics, there is a lack of work that addresses the issue of identifying writers from their handwriting. Much more advanced Arabic OCR is required for this purpose. Additionally, this can be further extended semantically to differentiate writers through both the ability to recognize images and the meanings of words [80]. Moreover, the IoT can be integrated with OCR to provide a mechanism for signature verification [81];
- Psychological Research: Psychological studies involving the use of Arabic handwriting to examine personal traits are examples of the use of OCR within psychological research methods. This field is still in its infancy. Moreover, if appropriate training data are fed to the model, OCR can automate the classification process in the research and help psychology researchers make predictions [82];
- Videos: The application of OCR to recognize Arabic handwriting in sequences of photographs in sequential time frames is an area that is open to research. Moreover, the application of OCR in real-time adds a layer of complexity, but makes OCR more practical [83].
9. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Raj, R.; Kos, A. A Comprehensive Study of Optical Character Recognition. In Proceedings of the 29th International Conference on Mixed Design of Integrated Circuits and System (MIXDES), Lodz, Poland, 25–27 June 2022; pp. 151–154. [Google Scholar] [CrossRef]
- de Oliveira, L.L.; Vargas, D.S.; Alexandre, A.M.A.; Cordeiro, F.C.; Gomes, D.D.S.M.; Rodrigues, M.D.C.; Romeu, R.K.; Moreira, V.P. Evaluating and mitigating the impact of OCR errors on information retrieval. Int. J. Digit. Libr. 2023, 24, 45–62. [Google Scholar] [CrossRef]
- Xu, Q.; Wang, L.; Liu, H.; Liu, N. LayoutLM-Critic: Multimodal Language Model for Text Error Correction of Optical Character Recognition. In Proceedings of the Artificial Intelligence and Robotics 7th International Symposium, Shanghai, China, 21–23 October 2022; pp. 136–146. [Google Scholar] [CrossRef]
- Hajiali, M.; Cacho, J.R.F.; Taghva, K. Generating Correction Candidates for OCR Errors using BERT Language Model and FastText SubWord Embeddings. In Intelligent Computing, Proceedings of the 2021 Computing Conference, Online, 15–16 July 2021; Springer: Cham, Switzerland, 2022; pp. 1045–1053. [Google Scholar] [CrossRef]
- Ravi, S.; Chauhan, S.; Yadlapallii, S.H.; Jagruth, K.; Manikandan, V.M. A Novel Educational Video Retrieval System Based on the Textual Information. In Lecture Notes in Networks and Systems, Proceedings of the 13th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2021), Online, 15–17 December 2021; Springer: Cham, Switzerland, 2022; pp. 502–511. [Google Scholar] [CrossRef]
- Hsu, T.-C.; Chang, C.; Jen, T.-H. Artificial Intelligence image recognition using self-regulation learning strategies: Effects on vocabulary acquisition, learning anxiety, and learning behaviours of English language learners. Interact. Learn. Environ. 2023, 31, 1–19. [Google Scholar] [CrossRef]
- Bhattamisra, S.K.; Banerjee, P.; Gupta, P.; Mayuren, J.; Patra, S.; Candasamy, M. Artificial Intelligence in Pharmaceutical and Healthcare Research. Big Data Cogn. Comput. 2023, 7, 10. [Google Scholar] [CrossRef]
- Ranjan, S.; Sanket, S.; Singh, S.; Tyagi, S.; Kaur, M.; Rakesh, N.; Nand, P. OCR based Automated Number Plate Text Detection and Extraction. In Proceedings of the 9th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 23–25 March 2022; pp. 621–627. [Google Scholar] [CrossRef]
- Onim, S.H.; Nyeem, H.; Roy, K.; Hasan, M.; Ishmam, A.; Akif, A.H.; Ovi, T.B. BLPnet: A new DNN model and Bengali OCR engine for Automatic Licence Plate Recognition. Array 2022, 15, 100244. [Google Scholar] [CrossRef]
- Azadbakht, A.; Kheradpisheh, S.R.; Farahani, H. MultiPath ViT OCR: A Lightweight Visual Transformer-based License Plate Optical Character Recognition. In Proceedings of the 12th International Conference on Computer and Knowledge Engineering (ICCKE), Mashhad, Iran, 17–18 November 2022; pp. 92–95. [Google Scholar] [CrossRef]
- Bi, S.; Wang, C.; Zhang, J.; Huang, W.; Wu, B.; Gong, Y.; Ni, W. A Survey on Artificial Intelligence Aided Internet-of-Things Technologies in Emerging Smart Libraries. Sensors 2022, 22, 2991. [Google Scholar] [CrossRef]
- Qureshi, F.; Rajput, A.; Mujtaba, G.; Fatima, N. A novel offline handwritten text recognition technique to convert ruled-line text into digital text through deep neural networks. Multimed. Tools Appl. 2022, 81, 18223–18249. [Google Scholar] [CrossRef]
- Singh, T.P.; Gupta, S.; Garg, M. A Review on Online and Offline Handwritten Gurmukhi Character Recognition. In Proceedings of the 10th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 13 October 2022; pp. 1–6. [Google Scholar] [CrossRef]
- Tan, Y.F.; Connie, T.; Goh, M.K.O.; Teoh, A.B.J. A Pipeline Approach to Context-Aware Handwritten Text Recognition. Appl. Sci. 2022, 12, 1870. [Google Scholar] [CrossRef]
- Ott, F.; Rügamer, D.; Heublein, L.; Bischl, B.; Mutschler, C. Representation Learning for Tablet and Paper Domain Adaptation in Favor of Online Handwriting Recognition. arXiv 2023. [Google Scholar] [CrossRef]
- Ghosh, T.; Sen, S.; Obaidullah, S.; Santosh, K.; Roy, K.; Pal, U. Advances in online handwritten recognition in the last decades. Comput. Sci. Rev. 2022, 46, 100515. [Google Scholar] [CrossRef]
- Statista. The Most Spoken Languages Worldwide 2022. Available online: https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/ (accessed on 6 February 2023).
- Haddad, B.; Awwad, A.; Hattab, M.; Hattab, A. PRo-Pat: Probabilistic Root–Pattern Bi-gram data language model for Arabic based morphological analysis and distribution. Data Brief 2023, 46, 108875. [Google Scholar] [CrossRef]
- Al Waqfi, Y.M.; Mohamad, M. A Review of Arabic Optical Character Recognition Techniques & Performance. Int. J. Eng. Trends Technol. 2020, 1, 44–51. [Google Scholar] [CrossRef]
- Mohd, M.; Qamar, F.; Al-Sheikh, I.; Salah, R. Quranic Optical Text Recognition Using Deep Learning Models. IEEE Access 2021, 9, 38318–38330. [Google Scholar] [CrossRef]
- Alrobah, N.; Albahli, S. Arabic Handwritten Recognition Using Deep Learning: A Survey. Arab. J. Sci. Eng. 2022, 47, 9943–9963. [Google Scholar] [CrossRef]
- Moudgil, A.; Singh, S.; Gautam, V. Recent Trends in OCR Systems: A Review. In Machine Learning for Edge Computing; CRC Press: Boca Raton, FL, USA, 2022. [Google Scholar]
- Avyodri, R.; Lukas, S.; Tjahyadi, H. Optical Character Recognition (OCR) for Text Recognition and its Post-Processing Method: A Literature Review. In Proceedings of the 1st International Conference on Technology Innovation and Its Applications (ICTIIA), Tangerang, Indonesia, 23 September 2022; pp. 1–6. [Google Scholar] [CrossRef]
- Xu, M.; Yoon, S.; Fuentes, A.; Park, D.S. A Comprehensive Survey of Image Augmentation Techniques for Deep Learning. Pattern Recognit. 2023, 137, 109347. [Google Scholar] [CrossRef]
- Mijwil, M.; Salem, I.E.; Ismaeel, M.M. The Significance of Machine Learning and Deep Learning Techniques in Cybersecurity: A Comprehensive Review. Iraqi J. Comput. Sci. Math. 2023, 4, 87–101. [Google Scholar] [CrossRef]
- Aggarwal, K.; Mijwil, M.M.; Sonia; Al-Mistarehi, A.-H.; Alomari, S.; Gök, M.; Alaabdin, A.M.Z.; Abdulrhman, S.H. Has the Future Started? The Current Growth of Artificial Intelligence, Machine Learning, and Deep Learning. Iraqi J. Comput. Sci. Math. 2022, 3, 115–123. [Google Scholar] [CrossRef]
- Gupta, J.; Pathak, S.; Kumar, G. Deep Learning (CNN) and Transfer Learning: A Review. J. Physics Conf. Ser. 2022, 2273, 012029. [Google Scholar] [CrossRef]
- Setyanto, A.; Laksito, A.; Alarfaj, F.; Alreshoodi, M.; Kusrini; Oyong, I.; Hayaty, M.; Alomair, A.; Almusallam, N.; Kurniasari, L. Arabic Language Opinion Mining Based on Long Short-Term Memory (LSTM). Appl. Sci. 2022, 12, 4140. [Google Scholar] [CrossRef]
- Atenco, J.C.; Moreno, J.C.; Ramirez, J.M. Audiovisual Biometric Network with Deep Feature Fusion for Identification and Text Prompted Verification. Algorithms 2023, 16, 66. [Google Scholar] [CrossRef]
- Zhou, C.; Li, Q.; Li, C.; Yu, J.; Liu, Y.; Wang, G.; Zhang, K.; Ji, C.; Yan, Q.; He, L.; et al. A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT. arXiv 2023. [Google Scholar] [CrossRef]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res. 2020, 21, 5485–5551. [Google Scholar]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models are Few-Shot Learners. In Proceedings of the Advances in Neural Information Processing Systems, New Orleans, LA, USA, 6–12 December 2020; Volume 33, pp. 1877–1901. Available online: https://papers.nips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html (accessed on 6 February 2023).
- Zhuo, T.Y.; Huang, Y.; Chen, C.; Xing, Z. Exploring AI Ethics of ChatGPT: A Diagnostic Analysis. arXiv 2023. [Google Scholar] [CrossRef]
- CER-a Hugging Face Space by Evaluate-Metric. Available online: https://huggingface.co/spaces/evaluate-metric/cer (accessed on 6 February 2023).
- WER-a Hugging Face Space by Evaluate-Metric. Available online: https://huggingface.co/spaces/evaluate-metric/wer (accessed on 6 February 2023).
- Darwish, S.M.; Elzoghaly, K.O. An Enhanced Offline Printed Arabic OCR Model Based on Bio-Inspired Fuzzy Classifier. IEEE Access 2020, 8, 117770–117781. [Google Scholar] [CrossRef]
- Emon, I.H.; Iqbal, K.N.; Mehedi, H.K.; Mahbub, M.J.A.; Rasel, A.A. A Review of Optical Character Recognition (OCR) Techniques on Bengali Scripts. In Emerging Technologies in Computing; Springer: Cham, Switzerland, 2023; pp. 85–94. [Google Scholar] [CrossRef]
- Gan, J.; Chen, Y.; Hu, B.; Leng, J.; Wang, W.; Gao, X. Characters as graphs: Interpretable handwritten Chinese character recognition via Pyramid Graph Transformer. Pattern Recognit. 2023, 137, 109317. [Google Scholar] [CrossRef]
- Singh, S.; Garg, N.K.; Kumar, M. Feature extraction and classification techniques for handwritten Devanagari text recognition: A survey. Multimed. Tools Appl. 2022, 82, 747–775. [Google Scholar] [CrossRef]
- Hijji, M.; Iqbal, R.; Pandey, A.K.; Doctor, F.; Karyotis, C.; Rajeh, W.; Alshehri, A.; Aradah, F. 6G Connected Vehicle Framework to Support Intelligent Road Maintenance Using Deep Learning Data Fusion. IEEE Trans. Intell. Transp. Syst. 2023, 1–10. [Google Scholar] [CrossRef]
- de Oliveira, R.A.; Bollen, M.H. Deep learning for power quality. Electr. Power Syst. Res. 2023, 214, 108887. [Google Scholar] [CrossRef]
- Wang, R.; Bashyam, V.; Yang, Z.; Yu, F.; Tassopoulou, V.; Chintapalli, S.S.; Skampardoni, I.; Sreepada, L.P.; Sahoo, D.; Nikita, K.; et al. Applications of generative adversarial networks in neuroimaging and clinical neuroscience. Neuroimage 2023, 269, 119898. [Google Scholar] [CrossRef]
- Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 53. [Google Scholar] [CrossRef]
- Huang, X.; Qiao, L.; Yu, W.; Li, J.; Ma, Y. End-to-End Sequence Labeling via Convolutional Recurrent Neural Network with a Connectionist Temporal Classification Layer. Int. J. Comput. Intell. Syst. 2020, 13, 341–351. [Google Scholar] [CrossRef] [Green Version]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. Available online: https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html (accessed on 7 February 2023).
- Sajun, A.R.; Zualkernan, I. Survey on Implementations of Generative Adversarial Networks for Semi-Supervised Learning. Appl. Sci. 2022, 12, 1718. [Google Scholar] [CrossRef]
- Noubigh, Z.; Mezghani, A.; Kherallah, M. Transfer Learning to improve Arabic handwriting text Recognition. In Proceedings of the 21st International Arab Conference on Information Technology (ACIT), Giza, Egypt, 28–30 November 2020; pp. 1–6. [Google Scholar] [CrossRef]
- Boualam, M.; Elfakir, Y.; Khaissidi, G.; Mrabti, M. Arabic Handwriting Word Recognition Based on Convolutional Recurrent Neural Network. In WITS 2020, Proceeding of the 6th International Conference on Wireless Technologies, Embedded, and Intelligent Systems, Fez, Morocco, 14–16 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 877–885. [Google Scholar]
- Shtaiwi, R.E.; Abandah, G.A.; Sawalhah, S.A. End-to-End Machine Learning Solution for Recognizing Handwritten Arabic Documents. In Proceedings of the 13th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 21–23 June 2022; pp. 180–185. [Google Scholar] [CrossRef]
- Alzrrog, N.; Bousquet, J.-F.; El-Feghi, I. Deep Learning Application for Handwritten Arabic Word Recognition. In Proceedings of the IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), Halifax, NS, Canada, 18–20 September 2022; pp. 95–100. [Google Scholar] [CrossRef]
- Alkhawaldeh, R.S. Arabic (Indian) digit handwritten recognition using recurrent transfer deep architecture. Soft Comput. 2020, 25, 3131–3141. [Google Scholar] [CrossRef]
- Fasha, M.; Hammo, B.; Obeid, N.; Al Widian, J. A Hybrid Deep Learning Model for Arabic Text Recognition. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 31. [Google Scholar] [CrossRef]
- Dölek, I.; Kurt, A. A deep learning model for Ottoman OCR. Concurr. Comput. Pract. Exp. 2022, 34, e6937. [Google Scholar] [CrossRef]
- Khaled, M.M.; Alzebdeh, A.; Lataifeh, M.; Lulu, L.; Elnagar, A.M. A Hybrid Deep Learning Approach for Arabic Handwritten Recognition: Exploring the Complexities of the Arabic Language; SSRN: Rochester, NY, USA, 2023. [Google Scholar] [CrossRef]
- Mostafa, A.; Mohamed, O.; Ashraf, A.; Elbehery, A.; Jamal, S.; Khoriba, G.; Ghoneim, A.S. OCFormer: A Transformer-Based Model For Arabic Handwritten Text Recognition. In Proceedings of the International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), Cairo, Egypt, 27 May 2021; pp. 182–186. [Google Scholar] [CrossRef]
- Momeni, S.; Babaali, B. Arabic Offline Handwritten Text Recognition with Transformers. Res. Sq. 2022. [Google Scholar] [CrossRef]
- Alwaqfi, Y.; Mohamad, M.; Al-Taani, A. Generative Adversarial Network for an Improved Arabic Handwritten Characters Recognition. Int. J. Adv. Soft Comput. Its Appl. 2022, 14, 177–195. [Google Scholar] [CrossRef]
- Eltay, M.; Zidouri, A.; Ahmad, I.; Elarian, Y. Generative adversarial network based adaptive data augmentation for handwritten Arabic text recognition. PeerJ Comput. Sci. 2022, 8, e861. [Google Scholar] [CrossRef]
- Mustapha, I.B.; Hasan, S.; Nabus, H.; Shamsuddin, S.M. Conditional Deep Convolutional Generative Adversarial Networks for Isolated Handwritten Arabic Character Generation. Arab. J. Sci. Eng. 2021, 47, 1309–1320. [Google Scholar] [CrossRef]
- Jemni, S.K.; Souibgui, M.A.; Kessentini, Y.; Fornés, A. Enhance to read better: A Multi-Task Adversarial Network for Handwritten Document Image Enhancement. Pattern Recognit. 2021, 123, 108370. [Google Scholar] [CrossRef]
- Ben Aichaoui, S.; Hiri, N.; Dahou, A.H.; Cheragui, M.A. Automatic Building of a Large Arabic Spelling Error Corpus. SN Comput. Sci. 2022, 4, 108. [Google Scholar] [CrossRef]
- Alkhatib, M.; Monem, A.A.; Shaalan, K. Deep Learning for Arabic Error Detection and Correction. ACM Trans. Asian Low-Resource Lang. Inf. Process. 2020, 19, 71. [Google Scholar] [CrossRef]
- Solyman, A.; Wang, Z.; Tao, Q.; Elhag, A.A.M.; Zhang, R.; Mahmoud, Z. Automatic Arabic Grammatical Error Correction based on Expectation-Maximization routing and target-bidirectional agreement. Knowl. Based Syst. 2022, 241, 108180. [Google Scholar] [CrossRef]
- Abandah, G.A.; Suyyagh, A.; Khedher, M.Z. Correcting Arabic Soft Spelling Mistakes using BiLSTM-based Machine Learning. arXiv 2021. [Google Scholar] [CrossRef]
- Solyman, A.; Zhenyu, W.; Qian, T.; Elhag, A.A.M.; Toseef, M.; Aleibeid, Z. Synthetic data with neural machine translation for automatic correction in Arabic grammar. Egypt. Inform. J. 2020, 22, 303–315. [Google Scholar] [CrossRef]
- Zribi, C.B.O. “Easy” meta-embedding for detecting and correcting semantic errors in Arabic documents. Multimed. Tools Appl. 2023, 82, 21161–21175. [Google Scholar] [CrossRef]
- Almajdoubah, A.N.; Abandah, G.A.; Suyyagh, A.E. Investigating Recurrent Neural Networks for Diacritizing Arabic Text and Correcting Soft Spelling Mistakes. In Proceedings of the IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Dead Sea, Jordan, 16–17 November 2021; pp. 266–271. [Google Scholar] [CrossRef]
- Irani, M.; Elahimanesh, M.H.; Ghafouri, A.; Bidgoli, B.M. A Supervised Deep Learning-based Approach for Bilingual Arabic and Persian Spell Correction. In Proceedings of the 8th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS), Mazandaran, Iran, 28–29 December 2022; pp. 1–7. [Google Scholar] [CrossRef]
- Abbad, H.; Xiong, S. Simple Extensible Deep Learning Model for Automatic Arabic Diacritization. ACM Trans. Asian Low-Resource Lang. Inf. Process. 2021, 21, 1–16. [Google Scholar] [CrossRef]
- Abandah, G.A.; Suyyagh, A.E.; Abdel-Majeed, M.R. Transfer learning and multi-phase training for accurate diacritization of Arabic poetry. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 3744–3757. [Google Scholar] [CrossRef]
- Almanaseer, W.; Alshraideh, M.; Alkadi, O. A Deep Belief Network Classification Approach for Automatic Diacritization of Arabic Text. Appl. Sci. 2021, 11, 5228. [Google Scholar] [CrossRef]
- Luo, G.-F.; Wang, D.-H.; Du, X.; Yin, H.-Y.; Zhang, X.-Y.; Zhu, S. Self-information of radicals: A new clue for zero-shot Chinese character recognition. Pattern Recognit. 2023, 140, 109598. [Google Scholar] [CrossRef]
- Puri, S. Image Classification with Information Extraction by Evaluating the Text Patterns in Bilingual Documents. In Advanced Network Technologies and Intelligent Computing; Springer: Cham, Switzerland, 2023; pp. 115–137. [Google Scholar] [CrossRef]
- Sharma, A.; Katlaa, R.; Kaur, G.; Jayagopi, D.B. Full-page handwriting recognition and automated essay scoring for in-the-wild essays. Multimed. Tools Appl. 2023. [Google Scholar] [CrossRef]
- Lomte, V.M.; Doye, D.D. Devanagari Text and Calligraphy Recognition Using ICF & ACF. CIMS 2023, 29, 88–114. [Google Scholar]
- Gupta, A.; Soneji, P.; Mangaonkar, N. Robotic Process Automation Powered Admission Management System. In Inventive Computation and Information Technologies; Smys, S., Kamel, K.A., Palanisamy, R., Eds.; Springer Nature: Singapore, 2023; pp. 169–178. [Google Scholar] [CrossRef]
- de Araujo, P.H.L.; de Almeida, A.P.G.S.; Braz, F.A.; da Silva, N.C.; Vidal, F.D.B.; de Campos, T.E. Sequence-aware multimodal page classification of Brazilian legal documents. Int. J. Doc. Anal. Recognit. (IJDAR) 2022, 26, 33–49. [Google Scholar] [CrossRef]
- Tanveer, M.; Wang, Y.; Amiri, A.; Zhang, H. DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion. arXiv 2023. [Google Scholar] [CrossRef]
- Fischer, N.; Hartelt, A.; Puppe, F. Line-Level Layout Recognition of Historical Documents with Background Knowledge. Algorithms 2023, 16, 136. [Google Scholar] [CrossRef]
- Paul, J.; Dutta, K.; Sarkar, A.; Das, N.; Roy, K. A survey on different feature extraction methods for writer identification and verification. Int. J. Appl. Pattern Recognit. 2023, 7, 122–144. [Google Scholar] [CrossRef]
- Longjam, T.; Kisku, D.R.; Gupta, P. Writer independent handwritten signature verification on multi-scripted signatures using hybrid CNN-BiLSTM: A novel approach. Expert Syst. Appl. 2023, 214, 119111. [Google Scholar] [CrossRef]
- Samsuryadi; Kurniawan, R.; Supardi, J.; Sukemi; Mohamad, F.S. A Framework for Determining the Big Five Personality Traits Using Machine Learning Classification through Graphology. J. Electr. Comput. Eng. 2023, 2023, 1249004. [Google Scholar] [CrossRef]
- Chen, H.-J.; Fathoni, H.; Wang, Z.-Y.; Lien, K.-Y.; Yang, C.-T. A Real-Time Streaming Application for License Plate Recognition Using OpenALPR. In Smart Grid and Internet of Things; Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Springer Nature: Cham, Switzerland, 2023; pp. 368–381. [Google Scholar] [CrossRef]
Fatha | Fathatain | Dammah | Dammatain | Kasra | Kasratain | Sukoon | Shaddah |
---|---|---|---|---|---|---|---|
رَ | رً | رُ | رٌ | رِ | رٍ | رْ | رّ |
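The eight diacritics above are fixed Unicode combining marks (U+064B–U+0652 in the Arabic block). As an illustrative sketch, not code from any surveyed work, the following removes them from text, a common preprocessing step in the diacritization systems discussed later:

```python
import re

# Combining marks for the eight diacritics listed above:
# Fathatain, Dammatain, Kasratain, Fatha, Dammah, Kasra, Shaddah, Sukoon
DIACRITICS = "\u064B\u064C\u064D\u064E\u064F\u0650\u0651\u0652"
DIACRITIC_RE = re.compile("[" + DIACRITICS + "]")

def strip_diacritics(text: str) -> str:
    """Remove diacritic marks while keeping the base letters intact."""
    return DIACRITIC_RE.sub("", text)

# The eight forms from the table reduce to the bare letter Reh
print(strip_diacritics("رَ رً رُ رٌ رِ رٍ رْ رّ"))  # prints the letter Reh eight times, separated by spaces
```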
Work | Year | Technique | Dataset | CER | WER | Description | Limitation(s) |
---|---|---|---|---|---|---|---|
Z. Noubigh [47] | 2020 | CNN-BLSTM-CTC | KHATT, AHTID | 2.03 | 15.85 | Trained on 25,888 images of four printed fonts from P-KHATT, and then re-trained on the mentioned datasets | Testing was conducted on only 901 samples from the AHTID dataset |
M. Boualam et al. [48] | 2022 | CNN-RNN-CTC | IFN/ENIT | 2.10 | 8.21 | Five CNN and two RNN layers trained on 487,350 samples after augmentation | No real test of generalization was performed |
R. E. Shtaiwi et al. [49] | 2022 | CRNN-BiLSTM | MADCAT | 3.96 | N/A | This model allows text detection and segmentation simultaneously | Requires significant time to train due to additional layers of BiLSTM |
N. Alzrrog et al. [50] | 2022 | DCNN | AHWD | 4.61 | N/A | Training of a DCNN on a balanced, randomly selected dataset, along with applying image regularization | Details of the dataset portion used for testing were missing, so testing was performed blindly |
R. S. Alkhawaldeh [51] | 2021 | CNN-LSTM | AHDBase, MAHDBase | N/A | N/A | VGG-16 or LeNet is used for the CNN. For training, 6K images were used; 1K images were used for testing | Validation set was not specified. No CER or WER was reported; the 98.47% accuracy achieved does not guarantee the OCR model's performance |
M. Fasha et al. [52] | 2020 | CNN-BiLSTM-CTC | Custom dataset | 1.24 | 9.78 | Five CNN layers, followed by two BiLSTM layers, were trained on eighteen fonts | When tested on noisy images, the accuracy drops drastically to 22.71 CER and 85.82 WER |
İ. Dölek et al. [53] | 2022 | CNN-LSTM-CTC | Custom dataset | 3.88 | 42 | Trained on printed Ottoman documents written in Naskh font, and tested on 21 pages | Word-error rate differs considerably from character-error rate in best settings |
M. Khaled [54] | 2023 | CNN-RNN-CTC | Alexu, SRU-PHN, AHDB, ADAB, and IFN/ENIT | 0.94 | 2.62 | Beam-search decoder with data augmentation improved the results of using 63,403 words | Only 5% of the training dataset was used for validation |
A. Mostafa et al. [55] | 2021 | CNN-Transformer | Custom dataset, KHATT | 7.27 | 8.29 | ResNet-101 and a Transformer with four encoders and decoders were trained on word sequences of multiple lengths | Shortage of resources and computational power; complex model |
S. Momeni et al. [56] | 2022 | Transformer with cross-attention | Custom dataset, KHATT | 18.45 | N/A | DeiT-base and asafaya-BERT-base-arabic were used to initialize the model, which had twelve attention heads for the encoder and eight for the decoder | Due to the mismatch between the model's expected inputs and how the weights were initialized, some weights were randomly initialized, limiting the CER |
Y. Alwaqfi et al. [57] | 2022 | GANs-CNN | AHCD | N/A | N/A | No error rate was reported; however, it achieved 99.78% accuracy using ten CNN layers in generator and nine layers in discriminator | Error rate, training and testing ratios, and number of images after applying augmentation were unclear |
M. Eltay et al. [58] | 2022 | GANs-BiLSTM | IFN/ENIT, AHDB | N/A | 4.13 | ScrabbleGAN was deployed to balance the frequencies of generated characters, followed by 200 blocks of LSTM as a recognizer | Still suffers when tested on challenging test sets |
I. Mustapha et al. [59] | 2022 | CDCGAN-CNN | AHCD | N/A | N/A | Used a mixture of real samples and augmented characters to train LeNet-5, achieving 95.08% accuracy | Generating characters with dots remains a challenging task. Additionally, classes can be unbalanced |
S. Jemni et al. [60] | 2022 | GANs-CNN-RNN-CTC | KHATT | 24.33 | 47.67 | The generator employs U-net, while the discriminator uses CNN. For OCR model, CNN, followed by two layers of BiGRU and then a CTC layer, are stacked | Very complex model in each step of the OCR process, yet it produces an accuracy close to that of the baseline model |
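The CER and WER columns in the table above are both edit-distance rates: the Levenshtein distance between the recognized output and the reference, normalized by the reference length, computed over characters for CER and over words for WER. A minimal sketch (illustrative, not the evaluation code of any surveyed work):

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance over reference character count."""
    return levenshtein(ref, hyp) / len(ref)

def wer(ref: str, hyp: str) -> float:
    """Word error rate: edit distance over reference word count."""
    return levenshtein(ref.split(), hyp.split()) / len(ref.split())
```

A perfect recognizer scores 0 on both; WER is usually higher than CER because a single wrong character makes the whole word count as an error.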
Work | Year | Technique | Dataset | CER | WER | Other Results | Description | Limitation(s) |
---|---|---|---|---|---|---|---|---|
S. B. Aichaoui et al. [61] | 2022 | AraBART | SPIRAL | N/A | N/A | Best recall: 0.863 for space-issue errors | Generated eight kinds of error on which to train and test | Uneven distribution between the types and categories of error |
M. Alkhatib et al. [62] | 2020 | AraVec, Bi-LSTM, attention mechanism, polynomial classifier | Arabic-word list for spell checking, Arabic Corpora, KALIMAT, EASC | N/A | N/A | F-measure: 93.89% | 150 types of error are derived and classified in binary fashion | Semantic errors, style problems, and sentence-level problems are neglected |
A. Solyman et al. [63] | 2022 | Seq2Seq with multi-head attention Transformer | QALB-2014, QALB-2015 | N/A | N/A | F1 score: 74.18 | Expectation-maximization routing is devised. Additionally, Kullback–Leibler divergence is applied as a bidirectional regularization term | Suffers when facing challenging examples. Additionally, takes a long time to train |
G. Abandah et al. [64] | 2021 | BiLSTM | Tashkeela, MSA Arabic | 1.28 | N/A | N/A | Stochastic error injection and transformed input are used to bypass the use of a costly encoder–decoder model | Test was performed on only 0.10% of the Tashkeela set |
A. Solyman et al. [65] | 2021 | Convolutional Seq2Seq, attention mechanism, FastText | QALB, Alwatan | N/A | N/A | Recall: 63.59 | Data are preprocessed using FastText with pre-training embedding, and BPE algorithm is employed for infrequent words | Lacks the ability to correct dialectic and punctuation errors |
C. Zribi [66] | 2023 | SkipGram, FastText and BERT | ANC-KACST | N/A | N/A | Recall: 88.9 | Combines current embedding techniques to detect and correct Arabic semantic errors | The test set is different for each model |
A. Almajdoubah [67] | 2021 | Encoder–decoder LSTM and Transformer | ATB3, Tashkeela | N/A | N/A | Accuracy: 99.78% | Addressed diacritizing Arabic text and correcting soft spelling mistakes; training with the target diacritized output corrects text with high probability | Takes 11 h to train, while a comparable model achieved similar results in 3 h |
M. Irani et al. [68] | 2022 | Conditional random field (CRF) RNN | Custom dataset | N/A | N/A | Accuracy: 83.36% | Artificial errors created by deletion, substitution, and insertion on mixed Arabic and Persian text | Increasing the number of errors injected into sentences reduces the accuracy drastically |
H. Abbad et al. [69] | 2021 | Four BiLSTMs | Tashkeela | 3 | 8.99 | N/A | Diacritics are grouped into four categories. Each BiLSTM layer is fed with the corresponding embedding of a concatenation between the input and output vectors | Faces some challenges when Shadda diacritic is to be corrected |
G. Abandah et al. [70] | 2022 | Four BiLSTMs | APCD2 | 3.54 | N/A | N/A | First, freeze first stack layer and train second stack layer on DS1. Second, retrain the whole model on DS1, and third, retrain the whole model on DS2, which has a higher diacritics ratio | Letters with diacritics with a ratio less than half in the dataset are ignored |
W. Almanaseer et al. [71] | 2021 | Deep belief network (DBN), borderline-SMOTE | Tashkeela and LDC ATB3 | 2.21 | 6.73 | N/A | Three RBM layers are deployed, borderline-SMOTE is used to over-sample and balance the classes, and ReLU activation is employed | DBN suffers from poor results if the batch size is small, and large batches take longer to train |
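Several of the works above (e.g., [64,68]) train on artificial errors created by randomly deleting, substituting, or inserting characters in clean text. A minimal sketch of such stochastic error injection; the error rate, operation mix, and substitution alphabet here are illustrative choices, not those of any surveyed system:

```python
import random

# A slice of the Unicode Arabic letter block (Alef through Yeh) used as the
# substitution/insertion alphabet; real systems would pick a task-specific set.
ARABIC_LETTERS = [chr(c) for c in range(0x0627, 0x064B)]

def inject_errors(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Randomly delete, substitute, or insert characters at the given per-character rate."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        if rng.random() < rate:
            op = rng.choice(["delete", "substitute", "insert"])
            if op == "delete":
                continue                          # drop the character
            elif op == "substitute":
                out.append(rng.choice(ARABIC_LETTERS))
            else:                                 # insert a random letter before ch
                out.append(rng.choice(ARABIC_LETTERS))
                out.append(ch)
        else:
            out.append(ch)
    return "".join(out)
```

Training pairs are then (corrupted, original) sentences, which lets a correction model learn from clean corpora without hand-annotated errors; as [68] observes, accuracy degrades as the injected error rate grows.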
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Najam, R.; Faizullah, S. Analysis of Recent Deep Learning Techniques for Arabic Handwritten-Text OCR and Post-OCR Correction. Appl. Sci. 2023, 13, 7568. https://doi.org/10.3390/app13137568