Search Results (25)

Search Parameters:
Keywords = Arabic handwritten

18 pages, 4420 KiB  
Article
Machine Learning Approach for Arabic Handwritten Recognition
by A. M. Mutawa, Mohammad Y. Allaho and Monirah Al-Hajeri
Appl. Sci. 2024, 14(19), 9020; https://doi.org/10.3390/app14199020 - 6 Oct 2024
Viewed by 1798
Abstract
Text recognition is an important area of the pattern recognition field. Natural language processing (NLP) and pattern recognition have been used effectively in script recognition, and much research has been conducted on handwritten script recognition. However, research on Arabic handwritten text recognition has received little attention compared with other languages, so it is crucial to develop new models that can recognize Arabic handwritten text. Most existing models for recognizing Arabic text are based on traditional machine learning techniques. We therefore implemented a new model using deep learning techniques, integrating two deep neural networks: the Residual Network (ResNet) architecture extracts features from raw images, and a Bidirectional Long Short-Term Memory (BiLSTM) network with connectionist temporal classification (CTC) performs sequence modeling. Our system improved the recognition rate of Arabic handwritten text compared with other models of a similar type, achieving a character error rate (CER) of 13.2% and a word error rate (WER) of 27.31%. In conclusion, the domain of Arabic handwritten recognition is advancing swiftly with the use of sophisticated deep learning methods. Full article
(This article belongs to the Special Issue Applied Intelligence in Natural Language Processing)
Figures:
Figure 1: Block diagram of the proposed study.
Figure 2: Our model's architecture.
Figure 3: Samples from the KHATT dataset.
Figure 4: Samples from the AHTID/MW dataset.
Figure 5: Sample image from the KHATT dataset after removing the white spaces.
Figure 6: Sample image from the KHATT dataset after removing the upper line.
Figure 7: Sample image from the KHATT dataset after applying the Max and Min filter.
Figure 8: Upper and lower baselines in Arabic text.
Figure 9: ResNet model for text feature extraction.
Figure 10: Feature map for characters.
Figure 11: CER versus epoch number.
Figure 12: WER versus epoch number.
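The abstract above describes best-path decoding with CTC and reports CER and WER. Purely as an illustration (not the authors' code), a minimal Python sketch of CTC greedy decoding and character error rate:

```python
def ctc_greedy_decode(frame_ids, blank=0):
    """Collapse repeated frame labels, then drop blanks (standard CTC best-path decoding)."""
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != blank:
            out.append(t)
        prev = t
    return out

def edit_distance(a, b):
    """Levenshtein distance via a single rolling DP row."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

def cer(reference, hypothesis):
    """Character error rate: edit operations per reference character."""
    return edit_distance(reference, hypothesis) / len(reference)

# Frame-level argmax ids 0,1,1,0,2,2,0 collapse to the label sequence [1, 2].
print(ctc_greedy_decode([0, 1, 1, 0, 2, 2, 0]))  # [1, 2]
print(cer("abcd", "abed"))  # 0.25
```

Note that a blank between two identical labels keeps both (e.g. `[1, 0, 1]` decodes to `[1, 1]`), which is what distinguishes CTC collapsing from plain deduplication.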
13 pages, 1585 KiB  
Article
Analyzing Arabic Handwriting Style through Hand Kinematics
by Vahan Babushkin, Haneen Alsuradi, Muhamed Osman Al-Khalil and Mohamad Eid
Sensors 2024, 24(19), 6357; https://doi.org/10.3390/s24196357 - 30 Sep 2024
Cited by 1 | Viewed by 1111
Abstract
Handwriting style is an important aspect affecting the quality of handwriting. Adhering to one style is crucial for languages that follow cursive orthography and possess multiple handwriting styles, such as Arabic. The majority of available studies analyze Arabic handwriting style from static documents, focusing only on pure styles. In this study, we analyze handwriting samples with mixed styles, pure styles (Ruq'ah and Naskh), and samples without a specific style, using dynamic features of the stylus and hand kinematics. We propose a model for classifying handwritten samples into four classes based on adherence to style. The stylus and hand kinematics data were collected from 50 participants writing an Arabic text containing all 28 letters and covering most Arabic orthography. A parameter search was conducted to find the best hyperparameters for the model, the optimal sliding window length, and the overlap. The proposed model achieves a style classification accuracy of 88%. An explainability analysis with Shapley values revealed that hand speed, pressure, and pen slant are among the top 12 most important features, with other features contributing nearly equally to style classification. Finally, we explore which features are important for Arabic handwriting style detection. Full article
(This article belongs to the Special Issue Sensor-Based Behavioral Biometrics)
Figures:
Figure 1: (a) Experiment setup; (b) sample text dictated to subjects; (c) user interface with a sample of the subject's handwriting (text shown in the Naskh style).
Figure 2: Distribution of the expert's evaluation of style consistency by paragraphs: (a) original distribution; (b) after retaining paragraphs that correspond to the prevailing style of the subject.
Figure 3: The proposed architecture with two temporal convolution layers: w is the window length, s the overlap, n the number of windows, T the length of the entire paragraph, K1/K2 the kernel sizes of the first/second 1D-CNN layers, and C1/C2 the number of channels in the first/second 1D-CNN layers.
Figure 4: Searching for the optimal (a) overlap and (b) window size.
Figure 5: Average of confusion matrices over 5 folds; the diagonal shows average recall across 5 folds for each of the 4 classes.
Figure 6: Average of normalized Shapley values across 5 folds.
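The study above segments the kinematics streams into fixed-length windows with overlap before classification, and searches over both hyperparameters. As an illustration of that windowing step (not the authors' code), a minimal sketch:

```python
def sliding_windows(series, w, overlap):
    """Split a time series into windows of length w; consecutive windows share `overlap` samples."""
    if not 0 <= overlap < w:
        raise ValueError("overlap must satisfy 0 <= overlap < w")
    step = w - overlap
    return [series[i:i + w] for i in range(0, len(series) - w + 1, step)]

# A 10-sample stream, window length 4, overlap 2 -> windows starting at indices 0, 2, 4, 6.
print(sliding_windows(list(range(10)), w=4, overlap=2))
```

In practice each element of the stream would be a feature vector (pen position, pressure, tilt, and so on) rather than a scalar, but the slicing logic is the same.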
26 pages, 5883 KiB  
Article
Real-Time Air-Writing Recognition for Arabic Letters Using Deep Learning
by Aseel Qedear, Aldanh AlMatrafy, Athary Al-Sowat, Abrar Saigh and Asmaa Alayed
Sensors 2024, 24(18), 6098; https://doi.org/10.3390/s24186098 - 20 Sep 2024
Viewed by 1134
Abstract
Learning to write the Arabic alphabet is crucial for Arab children's cognitive development, enhancing their memory and retention skills. However, the lack of Arabic-language educational applications may hamper the effectiveness of their learning experience. To bridge this gap, SamAbjd was developed, an interactive web application that leverages deep learning techniques, including air-writing recognition, to teach Arabic letters. SamAbjd was tailored to user needs through extensive surveys conducted with mothers and teachers, and a comprehensive literature review was performed to identify effective teaching methods and models. The development process involved gathering data from three publicly available datasets, culminating in a collection of 31,349 annotated images of handwritten Arabic letters. To enhance the dataset's quality, data preprocessing techniques were applied, such as image denoising, grayscale conversion, and data augmentation. Two models were evaluated, a convolutional neural network (CNN) and the Visual Geometry Group network (VGG16), to assess their effectiveness in recognizing air-written Arabic characters. Among the CNN models tested, the standout performer was a seven-layer model without dropout, which achieved a high testing accuracy of 96.40%. This model also achieved precision and F1-score of around 96.44% and 96.43%, respectively, indicating a good fit without overfitting. The web application, built using Flask and PyCharm, offers a robust and user-friendly interface. By incorporating deep learning techniques and user feedback, the web application meets educational needs effectively. Full article
(This article belongs to the Section Sensing and Imaging)
Figures:
Figure 1: Overview of the methodological steps.
Figure 2: Handwritten Arabic letters dataset.
Figure 3: Data preprocessing techniques applied to the dataset.
Figure 4: The architecture of the CNN model, inspired by [30].
Figure 5: The architecture of the VGG16 model, inspired by [30].
Figure 6: The architecture of the CNN model used in this study, inspired by [30].
Figure 7: Training and validation accuracy of the seven-layer CNN model with no dropout.
Figure 8: Confusion matrix of the seven-layer CNN model with no dropout.
Figure 9: The flow of the backend system.
Figure 10: Hand landmarks for fingertips, adopted from [50].
Figure 11: Steps of writing a letter in the air.
Figure 12: Creating a bounding box around the letter.
Figure 13: Image processing on the canvas.
Figure 14: Initial cropping process.
Figure 15: Modified cropping process.
Figure 16: Samples of nine air-written Arabic letters.
Figure 17: User interface of the SamAbjd web application.
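Part of the air-writing pipeline described above is drawing a bounding box around the traced letter and cropping the canvas before recognition. A hedged sketch of one way such a crop can work on a binary canvas (illustration only, not the SamAbjd implementation):

```python
def crop_to_content(canvas):
    """Crop a 2D binary canvas (lists of 0/1) to the tightest box containing all set pixels."""
    rows = [r for r, row in enumerate(canvas) if any(row)]
    cols = [c for c in range(len(canvas[0])) if any(row[c] for row in canvas)]
    if not rows:  # empty canvas: nothing to crop
        return []
    return [row[cols[0]:cols[-1] + 1] for row in canvas[rows[0]:rows[-1] + 1]]

canvas = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
print(crop_to_content(canvas))  # [[1, 1], [0, 1]]
```

A real system would typically add a few pixels of padding and then resize the crop to the model's fixed input size.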
28 pages, 26533 KiB  
Article
End-to-End Deep Learning Framework for Arabic Handwritten Legal Amount Recognition and Digital Courtesy Conversion
by Hakim A. Abdo, Ahmed Abdu, Mugahed A. Al-Antari, Ramesh R. Manza, Muhammed Talo, Yeong Hyeon Gu and Shobha Bawiskar
Mathematics 2024, 12(14), 2256; https://doi.org/10.3390/math12142256 - 19 Jul 2024
Viewed by 1271
Abstract
Arabic handwriting recognition and conversion are crucial for financial operations, particularly for processing handwritten amounts on cheques and financial documents. Compared to other languages, research in this area is relatively limited, especially concerning Arabic. This study introduces an innovative AI-driven method for simultaneously recognizing and converting Arabic handwritten legal amounts into numerical courtesy forms. The framework consists of four key stages. First, a new dataset of Arabic legal amounts in handwritten form (“.png” image format) is collected and labeled by natives. Second, a YOLO-based AI detector extracts individual legal amount words from the entire input sentence images. Third, a robust hybrid classification model is developed, sequentially combining ensemble Convolutional Neural Networks (CNNs) with a Vision Transformer (ViT) to improve the prediction accuracy of single Arabic words. Finally, a novel conversion algorithm transforms the predicted Arabic legal amounts into digital courtesy forms. The framework’s performance is fine-tuned and assessed using 5-fold cross-validation tests on the proposed novel dataset, achieving a word level detection accuracy of 98.6% and a recognition accuracy of 99.02% at the classification stage. The conversion process yields an overall accuracy of 90%, with an inference time of 4.5 s per sentence image. These results demonstrate promising potential for practical implementation in diverse Arabic financial systems. Full article
Figures (the English explanations in several figures are provided for non-Arabic speakers):
Figure 1: Proposed legal amount recognition end-to-end framework.
Figure 2: Samples of legal amount sentences.
Figure 3: YOLOv5 model structure for Arabic handwritten word detection.
Figure 4: The proposed hybrid classification pipeline for Arabic handwritten word recognition.
Figure 5: Sample legal amount image output from the word detection phase.
Figure 6: Sample application of the LegalToCourtesy algorithm to calculate the courtesy amount value.
Figure 7: Training and validation convergence (loss) for the YOLOv5s-based word extraction model.
Figure 8: Prediction performance of the YOLOv5s-based word extraction model during training.
Figure 9: Confusion matrix of the YOLOv5-based word extraction model.
Figure 10: Sample Arabic legal amount word detection results.
Figure 11: Performance assessment using confusion matrices for the Hybrid A model.
Figure 12: Performance assessment using confusion matrices for the Hybrid B model.
Figure 13: Samples of the proposed method's results with correctly generated courtesy amounts.
Figure 14: Detection and classification in complex cases: (a) word detection with overlapping letters; (b) classification of misspelled words.
Figure 15: Samples of improperly generated courtesy amounts: (a) incorrect word recognition; (b) inaccurate word detection.
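The final stage described above converts a recognized legal amount (a word sequence) into a numeric courtesy amount. The paper's LegalToCourtesy algorithm operates on Arabic words; purely to illustrate the general accumulate-and-scale idea behind such conversions, here is a hypothetical sketch using an English stand-in vocabulary (the word list and rules are this sketch's assumptions, not the authors' algorithm):

```python
# Hypothetical English stand-in vocabulary; the actual algorithm maps Arabic legal-amount words.
VALUES = {
    "one": 1, "two": 2, "three": 3, "five": 5, "nine": 9,
    "forty": 40, "fifty": 50, "hundred": 100, "thousand": 1000,
}

def words_to_amount(words):
    """Accumulate small values, scale the accumulator at 'hundred', flush it into the total at 'thousand'."""
    total = current = 0
    for w in words:
        v = VALUES[w]
        if v == 100:
            current = max(current, 1) * 100      # "hundred" alone means 100
        elif v >= 1000:
            total += max(current, 1) * v         # flush accumulator scaled by the multiplier
            current = 0
        else:
            current += v
    return total + current

print(words_to_amount(["three", "hundred", "forty", "five"]))  # 345
print(words_to_amount(["two", "thousand", "fifty"]))           # 2050
```

The two-register structure (a running `current` for values below the next multiplier and a `total` for flushed groups) is the standard way such words-to-number conversions are organized, whatever the language.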
33 pages, 9169 KiB  
Article
Dhad—A Children’s Handwritten Arabic Characters Dataset for Automated Recognition
by Sarab AlMuhaideb, Najwa Altwaijry, Ahad D. AlGhamdy, Daad AlKhulaiwi, Raghad AlHassan, Haya AlOmran and Aliyah M. AlSalem
Appl. Sci. 2024, 14(6), 2332; https://doi.org/10.3390/app14062332 - 10 Mar 2024
Cited by 1 | Viewed by 1571
Abstract
This study delves into the intricate realm of recognizing handwritten Arabic characters, specifically targeting children’s script. Given the inherent complexities of the Arabic script, encompassing semi-cursive styles, distinct character forms based on position, and the inclusion of diacritical marks, the domain demands specialized attention. While prior research has largely concentrated on adult handwriting, the spotlight here is on children’s handwritten Arabic characters, an area marked by its distinct challenges, such as variations in writing quality and increased distortions. To this end, we introduce a novel dataset, “Dhad”, refined for enhanced quality and quantity. Our investigation employs a tri-fold experimental approach, encompassing the exploration of pre-trained deep learning models (i.e., MobileNet, ResNet50, and DenseNet121), custom-designed Convolutional Neural Network (CNN) architecture, and traditional classifiers (i.e., Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP)), leveraging deep visual features. The results illuminate the efficacy of fine-tuned pre-existing models, the potential of custom CNN designs, and the intricacies associated with disjointed classification paradigms. The pre-trained model MobileNet achieved the best test accuracy of 93.59% on the Dhad dataset. Additionally, as a conceptual proposal, we introduce the idea of a computer application designed specifically for children aged 7–12, aimed at improving Arabic handwriting skills. Our concluding reflections emphasize the need for nuanced dataset curation, advanced model architectures, and cohesive training strategies to navigate the multifaceted challenges of Arabic character recognition. Full article
(This article belongs to the Special Issue Digital Image Processing: Advanced Technologies and Applications)
Figures:
Figure 1: The architecture of our custom CNN model.
Figure 2: Sample images of the letters mīm and nūn showing the consecutive data cleansing and preprocessing steps: (a,b) after scanning and cropping; (c,d) after cleansing; (e,f) after applying a Gaussian filter; (g,h) after applying a high-pass filter; (i,j) after binarization.
Figure 3: Dhad dataset collection and preparation workflow.
Figure 4: Training accuracy and loss curves for pre-trained models on the Dhad dataset.
Figure 5: Training accuracy and loss curves for pre-trained models on the Hijja dataset.
Figure 6: Confusion matrices for pre-trained models on the Dhad dataset: (a) ResNet50, (b) MobileNet, (c) DenseNet121.
Figure 7: Confusion matrices for pre-trained models on the Hijja dataset: (a) ResNet50, (b) MobileNet, (c) DenseNet121.
Figure 8: Layer visualizations for the DenseNet121 model on Dhad dataset samples.
Figure 9: AUC curves for pre-trained models on the Dhad dataset.
Figure 10: AUC curves for pre-trained models on the Hijja dataset.
Figure 11: Loss curves for different dropout variations with MobileNet on the Dhad dataset.
Figure 12: Loss curves for different dropout variations and data augmentation with MobileNet on the Dhad dataset.
Figure 13: Training accuracy and loss curves for the custom CNN model on the Dhad and Hijja datasets.
Figure 14: Confusion matrices for the custom CNN model: (a) Dhad, (b) Hijja.
Figure 15: AUC curves of the custom CNN model: (a) Dhad, (b) Hijja.
Figure 16: Confusion matrices of the MobileNet (ImageNet pre-trained) + SVM pipeline: (a) Dhad, (b) Hijja.
Figure 17: Confusion matrices of the MobileNet (ImageNet pre-trained) + RF pipeline: (a) Dhad, (b) Hijja.
Figure 18: Confusion matrices of the MobileNet (ImageNet pre-trained) + MLP pipeline: (a) Dhad, (b) Hijja.
Figure 19: Confusion matrices of the MobileNet (Experiment One) + SVM pipeline: (a) Dhad, (b) Hijja.
Figure 20: Confusion matrices of the MobileNet (Experiment One) + RF pipeline: (a) Dhad, (b) Hijja.
Figure 21: Confusion matrices of the MobileNet (Experiment One) + MLP pipeline: (a) Dhad, (b) Hijja.
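The Dhad preprocessing pipeline above ends with binarization of each character image. As a minimal illustration of global-threshold binarization on a grayscale grid (the threshold value here is an assumption; the paper's exact method may differ):

```python
def binarize(gray, threshold=128):
    """Map a 2D grayscale grid (0-255) to binary: 1 for ink (dark pixels), 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

gray = [
    [250, 30, 240],
    [20, 25, 245],
]
print(binarize(gray))  # [[0, 1, 0], [1, 1, 0]]
```

Adaptive schemes such as Otsu's method pick the threshold from the image histogram instead of fixing it, which is usually preferable for scanned handwriting with uneven lighting.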
21 pages, 2761 KiB  
Article
Deep Learning-Based Child Handwritten Arabic Character Recognition and Handwriting Discrimination
by Maram Saleh Alwagdani and Emad Sami Jaha
Sensors 2023, 23(15), 6774; https://doi.org/10.3390/s23156774 - 28 Jul 2023
Cited by 7 | Viewed by 2985
Abstract
Handwritten Arabic character recognition has received increasing research interest in recent years. However, the majority of existing handwriting recognition systems have focused only on adult handwriting; child handwriting has received far less study and has not yet been regarded as a major research issue. Compared to adults' handwriting, children's handwriting is more challenging since it often has lower quality, higher variation, and larger distortions. Furthermore, most systems designed for adult data have not been trained or tested for child data recognition purposes or applications. This paper presents a new convolutional neural network (CNN) model for recognizing children's handwritten isolated Arabic letters. Several experiments are conducted to investigate and analyze the influence of training the model with different datasets of children, adults, and both, measuring and comparing performance in recognizing children's handwritten characters and in discriminating their handwriting from adult handwriting. In addition, a number of supplementary features are proposed based on empirical study and observations and are combined with CNN-extracted features to augment the child and adult writer-group classification. Lastly, the performance of the extracted deep and supplementary features is evaluated and compared using different classifiers (Softmax, support vector machine (SVM), k-nearest neighbor (KNN), and random forest (RF)) and different dataset combinations from Hijja for child data and AHCD for adult data. Our findings highlight that the training strategy is crucial and that the inclusion of adult data is influential in achieving an increased accuracy of up to around 93% in child handwritten character recognition. Moreover, the fusion of the proposed supplementary features with the deep features attains an improved performance in child handwriting discrimination of up to around 94%. Full article
(This article belongs to the Special Issue Deep Learning for Information Fusion and Pattern Recognition)
Figures:
Figure 1: Overview of the framework of the proposed methodology.
Figure 2: The proposed CNN architecture.
Figure 3: Some preprocessed Hijja and AHCD character data samples: (a) child writers' samples; (b) adult writers' samples.
Figure 4: The FFNN-based feature fusion model for the writer-group classification task.
Figure 5: Learning accuracy and loss performance of Experiment 1.
Figure 6: Learning accuracy and loss performance of Experiment 2.
Figure 7: Learning accuracy and loss performance of Experiment 3.
Figure 8: Learning accuracy and loss performance of Experiment 4.
Figure 9: Confusion matrices for Experiment 4 using four different classifiers: (a) Softmax; (b) SVM; (c) KNN; (d) RF.
16 pages, 3862 KiB  
Article
Pashto Handwritten Invariant Character Trajectory Prediction Using a Customized Deep Learning Technique
by Fazli Khaliq, Muhammad Shabir, Inayat Khan, Shafiq Ahmad, Muhammad Usman, Muhammad Zubair and Shamsul Huda
Sensors 2023, 23(13), 6060; https://doi.org/10.3390/s23136060 - 30 Jun 2023
Cited by 2 | Viewed by 2314
Abstract
Before the 19th century, all communication and official records relied on handwritten documents, cherished as valuable artefacts by different ethnic groups. While significant efforts have been made to automate the transcription of major languages like English, French, Arabic, and Chinese, there has been less research on regional and minor languages, despite their importance from geographical and historical perspectives. This research focuses on detecting and recognizing Pashto handwritten characters and ligatures, which is essential for preserving this regional cursive language of Pakistan, also the national language of Afghanistan. Deep learning techniques were employed to detect and recognize Pashto characters and ligatures, utilizing a newly developed Pashto-specific dataset. The dataset was further enhanced through data augmentation (scaling and rotation of the handwritten characters and ligatures), yielding many variations of a single trajectory. Morphological operations were applied to minimize gaps in the trajectories, and a median filter was used for noise removal. This dataset will be combined with the existing PHWD-V2 dataset. Various deep learning techniques were evaluated, including VGG19, MobileNetV2, MobileNetV3, and a customized CNN. The customized CNN demonstrated the highest accuracy and minimal loss, achieving a training accuracy of 93.98%, a validation accuracy of 92.08%, and a testing accuracy of 92.99%. Full article
Figures:
Figure 1: Partial view of Pashto handwritten ligatures.
Figure 2: Pashto handwritten word ligatures.
Figure 3: Pashto handwritten character dysconnectivity.
Figure 4: Valid and invalid hook detection and recognition.
Figure 5: Customized 5-layer CNN model.
Figure 6: General overview of the proposed framework.
Figure 7: Dataset collection phase.
Figure 8: Partial view of the dataset without gridlines.
Figure 9: Partial view of the dataset cropping phase.
Figure 10: Partial view of the noise-free image dataset.
Figure 11: VGG19 training and validation (a) accuracy and (b) loss.
Figure 12: MobileNetV2 training and validation (a) accuracy and (b) loss.
Figure 13: MobileNetV3Large training and validation (a) accuracy and (b) loss.
Figure 14: Customized CNN training and validation (a) accuracy and (b) loss.
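The dataset above was augmented with scaling and rotation of the handwritten trajectories. As a simplified illustration of those two operations on a 2D pixel grid (real augmentation typically uses small arbitrary angles and interpolation; this sketch only shows the idea and is not the authors' code):

```python
def rotate90(grid):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def scale2x(grid):
    """Nearest-neighbor 2x upscaling: duplicate each pixel horizontally and each row vertically."""
    out = []
    for row in grid:
        doubled = [px for px in row for _ in range(2)]
        out.append(doubled)
        out.append(list(doubled))
    return out

g = [[1, 2],
     [3, 4]]
print(rotate90(g))  # [[3, 1], [4, 2]]
print(scale2x(g))   # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Applying several such transforms to each sample multiplies the number of training variations of a single trajectory, which is the point of the augmentation step.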
33 pages, 3518 KiB  
Review
Analysis of Recent Deep Learning Techniques for Arabic Handwritten-Text OCR and Post-OCR Correction
by Rayyan Najam and Safiullah Faizullah
Appl. Sci. 2023, 13(13), 7568; https://doi.org/10.3390/app13137568 - 27 Jun 2023
Cited by 15 | Viewed by 6213
Abstract
Arabic handwritten-text recognition applies an OCR technique and then a text-correction technique to extract the text within an image correctly. Deep learning is the current paradigm in OCR. However, no study has investigated or critically analyzed the recent deep-learning techniques used for Arabic handwritten OCR and text correction in the period 2020-2023. This analysis fills that noticeable gap in the literature, uncovering recent developments and their limitations for researchers, practitioners, and interested readers. The results reveal that CNN-LSTM-CTC is the most suitable architecture for OCR, compared with Transformers and GANs, because it is less complex and can model long textual dependencies. For OCR text correction, applying DL models to errors generated in datasets improved accuracy in many works. In conclusion, Arabic OCR could further apply text-embedding models to correct the text produced by the OCR stage, and there is a significant gap in studies investigating this problem. In addition, more high-quality and domain-specific Arabic handwritten OCR datasets are needed. Moreover, we outline future trends for Arabic OCR applications, derived from current limitations in Arabic OCR work and from applications in other languages; these involve many possibilities that had not been effectively researched at the time of writing. Full article
Figures:
Figure 1: Key OCR steps.
Figure 2: Binarization exemplified.
Figure 3: Different levels of segmentation, from words to isolated characters.
Figure 4: Example of tilting.
Figure 5: Elucidation of how sub-words are formed in Arabic.
Figure 6: Richness and variation in Arabic fonts.
Figure 7: Simplified CNN with two convolutional layers.
Figure 8: The general framework of GANs.
Figure 9: Taxonomy of the analyzed works.
27 pages, 2712 KiB  
Review
A Survey of OCR in Arabic Language: Applications, Techniques, and Challenges
by Safiullah Faizullah, Muhammad Sohaib Ayub, Sajid Hussain and Muhammad Asad Khan
Appl. Sci. 2023, 13(7), 4584; https://doi.org/10.3390/app13074584 - 4 Apr 2023
Cited by 29 | Viewed by 11697
Abstract
Optical character recognition (OCR) is the process of extracting handwritten or printed text from a scanned or printed image and converting it to a machine-readable form for further data processing, such as searching or editing. Automatic text extraction using OCR helps to digitize documents for improved productivity and accessibility and for the preservation of historical documents. This paper provides a survey of the current state-of-the-art applications, techniques, and challenges in Arabic OCR. We present the existing methods for each step of the complete OCR process to identify the best-performing approach for improved results. This paper follows the keyword-search method for reviewing the articles related to Arabic OCR, including the backward and forward citations of each article. In addition to state-of-the-art techniques, this paper identifies research gaps and presents future directions for Arabic OCR. Full article
(This article belongs to the Special Issue Digital Image Processing: Advanced Technologies and Applications)
Show Figures: Graphical abstract; (1) Types of OCR systems in Arabic and their modes of processing; (2) Brief overview of the OCR process; (3) Character segmentation stages for recognizing characters with maximum accuracy; (4) The flow of the OCR process along with its phases and methods; (5) Opening and closing of an image; (6) A skewed document (left) deskewed (right) to achieve better OCR results; (7) Processes and techniques in each phase of the OCR system; (8) Hybrid postprocessing technique based on Google's spelling-suggestion algorithm.
26 pages, 4206 KiB  
Article
A Genetic Algorithm Based One Class Support Vector Machine Model for Arabic Skilled Forgery Signature Verification
by Ansam A. Abdulhussien, Mohammad F. Nasrudin, Saad M. Darwish and Zaid Abdi Alkareem Alyasseri
J. Imaging 2023, 9(4), 79; https://doi.org/10.3390/jimaging9040079 - 29 Mar 2023
Cited by 2 | Viewed by 4116
Abstract
Recently, signature verification systems have been widely adopted for verifying individuals based on their handwritten signatures, especially in forensic and commercial transactions. Generally, feature extraction and classification tremendously impact the accuracy of system authentication. Feature extraction is challenging for signature verification systems due [...] Read more.
Recently, signature verification systems have been widely adopted for verifying individuals based on their handwritten signatures, especially in forensic and commercial transactions. Generally, feature extraction and classification tremendously impact the accuracy of system authentication. Feature extraction is challenging for signature verification systems due to the diverse forms of signatures and sample circumstances. Current signature verification techniques demonstrate promising results in identifying genuine and forged signatures. However, the overall performance of skilled-forgery detection remains too limited to be fully satisfactory. Furthermore, most current signature verification techniques demand a large number of learning samples to increase verification accuracy. This is the primary disadvantage of using deep learning, as the number of signature samples is usually restricted in the practical application of a signature verification system. In addition, the system inputs are scanned signatures that comprise noisy pixels, a complicated background, blurriness, and contrast decay. The main challenge is attaining a balance between noise removal and data loss, since some essential information is lost during preprocessing, potentially influencing the subsequent stages of the system. This paper tackles these issues through four main steps: preprocessing, multifeature fusion, discriminant feature selection using a genetic algorithm based on a one-class support vector machine (OCSVM-GA), and a one-class learning strategy to address imbalanced signature data in the practical application of a signature verification system. The suggested method employs three signature databases: SID-Arabic handwritten signatures, CEDAR, and UTSIG. Experimental results show that the proposed approach outperforms current systems in terms of false acceptance rate (FAR), false rejection rate (FRR), and equal error rate (EER). Full article
(This article belongs to the Topic Computer Vision and Image Processing)
Show Figures: (1) Sample of signatures; (2) Flowchart of the proposed model; (3) Designations of the pixels in a 3 × 3 window; (4) Counting the 01 patterns in the ordered set P2, …, P9; (5) Locations of points that satisfy the conditions; (6) Preprocessing steps: grayscale image, denoising, segmentation, isolated-pixel removal, thinning and skeletonization; (7) Two neighboring edge pixels and the EDMs principle; (8) A strategy of feature fusion; (9) Flowchart of feature selection; (10) One-class support vector machine; (11) The accuracy plot curve.
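The one-class learning strategy above trains only on genuine signatures and flags anything outside their support as a forgery, sidestepping the shortage of skilled-forgery samples. A minimal sketch with scikit-learn's `OneClassSVM` on synthetic feature vectors (the feature dimensionality and the `nu`/`gamma` values are illustrative assumptions, not the paper's settings):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Stand-in for fused signature features of one writer: genuine samples only.
genuine = rng.normal(loc=0.0, scale=0.5, size=(60, 8))

# nu bounds the fraction of training points treated as outliers.
model = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(genuine)

probe_centroid = np.zeros((1, 8))     # at the centroid of the genuine features
probe_forgery = np.full((1, 8), 5.0)  # far outside the genuine support

print(model.predict(probe_centroid))  # +1: accepted as genuine
print(model.predict(probe_forgery))   # -1: rejected as a forgery
```

In the paper's setting the genuine-sample matrix would hold the fused, GA-selected features of one writer, and FAR/FRR would be measured over held-out genuine and skilled-forgery probes.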
15 pages, 3336 KiB  
Article
High-Performance Embedded System for Offline Signature Verification Problem Using Machine Learning
by Umair Tariq, Zonghai Hu, Rokham Tariq, Muhammad Shahid Iqbal and Muhammad Sadiq
Electronics 2023, 12(5), 1243; https://doi.org/10.3390/electronics12051243 - 4 Mar 2023
Cited by 1 | Viewed by 2929
Abstract
This paper proposes a high-performance embedded system for offline Urdu handwritten signature verification. Though many signature datasets are publicly available in languages such as English, Latin, Chinese, Persian, Arabic, Hindi, and Bengali, no Urdu handwritten datasets were available in the literature. So, in [...] Read more.
This paper proposes a high-performance embedded system for offline Urdu handwritten signature verification. Though many signature datasets are publicly available in languages such as English, Latin, Chinese, Persian, Arabic, Hindi, and Bengali, no Urdu handwritten dataset was available in the literature. So, in this work, an Urdu handwritten signature dataset is created. The proposed embedded system is then used to distinguish genuine and forged signatures based on various features, such as length, pattern, and edges. The system consists of five steps: data acquisition, pre-processing, feature extraction, signature registration, and signature verification. A majority voting (MV) algorithm is used to improve the performance and accuracy of the proposed embedded system. In feature extraction, a 2D Gabor filter (a sinusoidal signal multiplied by a Gaussian function at a specific frequency and orientation) is used. The proposed framework is tested and compared with existing handwritten-signature verification methods. Our test results show accuracies of 66.8% for the ensemble, 86.34% for k-nearest neighbor (KNN), 93.31% for the support vector machine (SVM), and 95.05% for the convolutional neural network (CNN). After applying the majority voting algorithm, the overall accuracy improves to 95.13%, with a false acceptance rate (FAR) of 0.2% and a false rejection rate (FRR) of 41.29% on the private dataset. To test the generalization ability of the proposed model, we also test it on a public dataset of English handwritten signatures and achieve an overall accuracy of 97.46%. Full article
(This article belongs to the Special Issue High-Performance Embedded Computing)
Show Figures: (1) Embedded system for UHSV; (2) Images before and after pre-processing; (3) Accuracy of k-nearest neighbor (KNN) in different models; (4) Accuracy of the support vector machine (SVM) classifier in different models; (5) Bagging training loss; (6) Overall accuracy of the ensemble classifier; (7) CNN model for offline signature verification; (8) Training and testing of the CNN model; (9) Comparison of the proposed method with existing published methods; (10) Confusion matrices of the KNN, CNN, and SVM classifiers; (11) Comparison of FAR and FRR.
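The 2D Gabor filter used above for feature extraction is exactly the construction the abstract describes: a sinusoidal carrier multiplied by a Gaussian envelope at a chosen frequency and orientation. A minimal NumPy sketch (all parameter values are illustrative, not the paper's tuned settings):

```python
import numpy as np

def gabor_kernel(size, sigma, theta, wavelength, psi=0.0, gamma=0.5):
    """Real part of a 2D Gabor filter: Gaussian envelope x cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates to the filter orientation theta.
    x_t = x * np.cos(theta) + y * np.sin(theta)
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_t**2 + (gamma * y_t)**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_t / wavelength + psi)
    return envelope * carrier

kernel = gabor_kernel(size=15, sigma=3.0, theta=np.pi / 4, wavelength=8.0)
# Feature extraction is then a 2D convolution of the signature image
# with a bank of such kernels at several orientations and frequencies.
print(kernel.shape)   # (15, 15)
```

With `psi=0` the kernel peaks at its center, since both the envelope and the cosine equal 1 there; varying `theta` and `wavelength` yields the filter bank that responds to strokes of different orientations and thicknesses.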
15 pages, 3242 KiB  
Article
Towards Accurate Children’s Arabic Handwriting Recognition via Deep Learning
by Anfal Bin Durayhim, Amani Al-Ajlan, Isra Al-Turaiki and Najwa Altwaijry
Appl. Sci. 2023, 13(3), 1692; https://doi.org/10.3390/app13031692 - 29 Jan 2023
Cited by 7 | Viewed by 3269
Abstract
Automatic handwriting recognition has received considerable attention over the past three decades. Handwriting recognition systems are useful for a wide range of applications. Much research has been conducted to address the problem in Latin languages. However, less research has focused on the Arabic [...] Read more.
Automatic handwriting recognition has received considerable attention over the past three decades. Handwriting recognition systems are useful for a wide range of applications. Much research has been conducted to address the problem in Latin languages. However, less research has focused on the Arabic language, especially concerning recognizing children’s Arabic handwriting. This task is essential as the demand for educational applications to practice writing and spelling Arabic letters is increasing. Thus, the development of Arabic handwriting recognition systems and applications for children is important. In this paper, we propose two deep learning-based models for the recognition of children’s Arabic handwriting. The proposed models, a convolutional neural network (CNN) and a pre-trained CNN (VGG-16), were trained using Hijja, a recent dataset of Arabic children’s handwriting collected in Saudi Arabia. We also train and test our proposed models on the Arabic Handwritten Character Dataset (AHCD). We compare the performance of the proposed models with similar models from the literature. The results indicate that our proposed CNN outperforms the pre-trained CNN (VGG-16) and the other compared models from the literature. Moreover, we developed Mutqin, a prototype to help children practice Arabic handwriting. The prototype was evaluated by target users, and the results are reported. Full article
Show Figures: (1) Sample of the Hijja dataset; (2) Our CNN architecture; (3) Homepage interface; (4) The writing-board page interface; (5) Feedback interface.
12 pages, 1737 KiB  
Article
Automatic Gender and Age Classification from Offline Handwriting with Bilinear ResNet
by Irina Rabaev, Izadeen Alkoran, Odai Wattad and Marina Litvak
Sensors 2022, 22(24), 9650; https://doi.org/10.3390/s22249650 - 9 Dec 2022
Cited by 8 | Viewed by 2798
Abstract
This work focuses on automatic gender and age prediction tasks from handwritten documents. This problem is of interest in a variety of fields, such as historical document analysis and forensic investigations. The challenge for automatic gender and age classification can be demonstrated by [...] Read more.
This work focuses on automatic gender and age prediction tasks from handwritten documents. This problem is of interest in a variety of fields, such as historical document analysis and forensic investigations. The challenge of automatic gender and age classification is demonstrated by the relatively low performance of existing methods. In addition, despite the success of CNNs for gender classification, deep neural networks had never been applied to age classification. The published works in this area mostly concentrate on the English and Arabic languages. In addition to Arabic and English, this work also considers Hebrew, which has been much less studied. Following the success of the bilinear Convolutional Neural Network (B-CNN) for fine-grained classification, we propose a novel implementation of a B-CNN with ResNet blocks. To our knowledge, this is the first time a bilinear CNN has been applied to writer demographics classification. In particular, this is the first attempt to apply a deep neural network to age classification. We perform experiments on documents from three benchmark datasets written in three different languages and provide a thorough comparison with the results reported in the literature. B-ResNet was top-ranked in all tasks. In particular, B-ResNet outperformed other models on the KHATT and QUWI datasets for gender classification. Full article
(This article belongs to the Special Issue Vision and Sensor-Based Sensing in Human Action Recognition)
Show Figures: (1) B-ResNet architecture (the input image patch contains Hebrew handwriting); (2) Sample images from the QUWI dataset (Arabic and English handwriting), the KHATT dataset (Arabic handwriting), and the HHD dataset (Hebrew handwriting); (3) The pipeline of the classification procedure (F and M stand for female and male, respectively).
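Bilinear pooling, the core of the B-CNN above, takes the outer product of two feature maps at each spatial location, averages over locations, and then applies the usual signed square-root and L2 normalization. A NumPy sketch with random feature maps standing in for the two ResNet streams (the shapes are illustrative, not the paper's):

```python
import numpy as np

def bilinear_pool(feat_a, feat_b, eps=1e-12):
    """feat_a: (H, W, Ca), feat_b: (H, W, Cb) -> normalized (Ca*Cb,) vector."""
    h, w, ca = feat_a.shape
    cb = feat_b.shape[2]
    a = feat_a.reshape(h * w, ca)
    b = feat_b.reshape(h * w, cb)
    # Average outer product over all spatial locations.
    pooled = a.T @ b / (h * w)              # (Ca, Cb)
    z = pooled.reshape(-1)
    z = np.sign(z) * np.sqrt(np.abs(z))     # signed square root
    return z / (np.linalg.norm(z) + eps)    # L2 normalization

rng = np.random.default_rng(0)
feat_a = rng.normal(size=(7, 7, 16))   # stand-in for one ResNet stream
feat_b = rng.normal(size=(7, 7, 16))   # stand-in for the other stream
descriptor = bilinear_pool(feat_a, feat_b)
print(descriptor.shape)                # (256,)
```

The pooled descriptor captures pairwise channel interactions, which is why bilinear models suit fine-grained distinctions such as the subtle stroke differences between writer groups.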
23 pages, 3097 KiB  
Article
Intelligent Arabic Handwriting Recognition Using Different Standalone and Hybrid CNN Architectures
by Waleed Albattah and Saleh Albahli
Appl. Sci. 2022, 12(19), 10155; https://doi.org/10.3390/app121910155 - 10 Oct 2022
Cited by 23 | Viewed by 4199
Abstract
Handwritten character recognition is a computer-vision-system problem that is still critical and challenging in many computer-vision tasks. With the increased interest in handwriting recognition as well as the developments in machine-learning and deep-learning algorithms, researchers have made significant improvements and advances in developing [...] Read more.
Handwritten character recognition remains a critical and challenging problem in many computer-vision tasks. With the increased interest in handwriting recognition as well as the developments in machine-learning and deep-learning algorithms, researchers have made significant improvements and advances in developing English-handwriting-recognition methodologies; however, Arabic handwriting recognition has not yet received enough interest. In this work, several deep-learning and hybrid models were created. The methodology of the current study took advantage of machine learning for classification and deep learning for feature extraction to create hybrid models. Among the standalone deep-learning models trained on the two datasets used in the experiments, the best results were obtained with the transfer-learning model on the MNIST dataset, with an accuracy of 0.9967. The results for the hybrid models on the MNIST dataset were good, with accuracy measures exceeding 0.9 for all the hybrid models; however, the results for the hybrid models on the Arabic character dataset were inferior. Full article
(This article belongs to the Special Issue Federated and Transfer Learning Applications)
Show Figures: (1) Stride and feature map; (2) The application of max pooling on the input; (3) A typical convolutional neural network; (4) The architecture of the hybrid models; (5) The Arabic MNIST dataset; (6) Loss and accuracy curves for the feed-forward network; (7) Loss and accuracy curves for the CNN; (8) The confusion matrix of the CNN model; (9) Accuracy of the hybrid models on the MNIST dataset; (10) Accuracy of the hybrid models trained on the Arabic character dataset; (11) Challenges and open problems.
15 pages, 31062 KiB  
Article
Text Line Extraction in Historical Documents Using Mask R-CNN
by Ahmad Droby, Berat Kurar Barakat, Reem Alaasam, Boraq Madi, Irina Rabaev and Jihad El-Sana
Signals 2022, 3(3), 535-549; https://doi.org/10.3390/signals3030032 - 4 Aug 2022
Cited by 16 | Viewed by 4295
Abstract
Text line extraction is an essential preprocessing step in many handwritten document image analysis tasks. It includes detecting text lines in a document image and segmenting the regions of each detected line. Deep learning-based methods are frequently used for text line detection. However, [...] Read more.
Text line extraction is an essential preprocessing step in many handwritten document image analysis tasks. It includes detecting text lines in a document image and segmenting the regions of each detected line. Deep learning-based methods are frequently used for text line detection. However, only a limited number of methods tackle the problems of detection and segmentation together. This paper proposes a holistic method that applies Mask R-CNN for text line extraction. A Mask R-CNN model is trained to extract text-line fractions from document patches, which are then merged to form the text lines of an entire page. The presented method was evaluated on two well-known datasets of historical documents, DIVA-HisDB and ICDAR 2015-HTR, and achieved state-of-the-art results. In addition, we introduce a new challenging dataset of Arabic historical manuscripts, VML-AHTE, in which numerous diacritics are present. We show that the presented Mask R-CNN-based method can successfully segment text lines, even in such a challenging scenario. Full article
Show Figures: (1) Overview of the proposed method: an overlapping sliding window traverses the input image, patches are segmented with the trained Mask R-CNN model, and the segmented patches are merged to produce the segmented text lines of a page; (2) Mask R-CNN framework (figure taken from the Mask R-CNN paper); (3) Example of text-line fragments in a patch; (4) Patch labeling used to train the Mask R-CNN, stacking the text-line masks along the channel axis; (5) Combining masks when one mask is completely contained within the horizontal section of another; (6) Splitting combined lines at the horizontal-profile minima closest to the center of the mask; (7) The four ground-truth formats of the VML-AHTE dataset: PAGE XML, pixel labels, DIVA pixel labels, and bounding polygons; (8) Example patches from each collection in the DIVA-HisDB dataset; (9) Example pages from the ICDAR2015-HTR dataset; (10) Mask R-CNN segmentation results: input page, blob prediction, and pixel-level prediction; (11) Connecting disconnected Mask R-CNN blobs to enable the DIVA evaluator; (12) Patch predictions compared with ground truth on the DIVA-CB55 and VML-AHTE datasets; (13) Segmentation results on the ICDAR2015-HTR dataset, where the method successfully segments marginal notes absent from the ground truth.
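The patch-based pipeline above slides an overlapping window over the page, segments each patch with Mask R-CNN, and merges the per-patch masks back into full text lines. The patch-extraction step can be sketched as follows (the patch size and stride are illustrative, not the paper's values):

```python
import numpy as np

def extract_patches(page, patch, stride):
    """Slide an overlapping window over a 2D page image, zero-padding the
    bottom/right border so every pixel is covered by at least one patch."""
    h, w = page.shape
    pad_h = (-(h - patch)) % stride if h > patch else patch - h
    pad_w = (-(w - patch)) % stride if w > patch else patch - w
    padded = np.pad(page, ((0, pad_h), (0, pad_w)))
    patches, origins = [], []
    for y in range(0, padded.shape[0] - patch + 1, stride):
        for x in range(0, padded.shape[1] - patch + 1, stride):
            patches.append(padded[y:y + patch, x:x + patch])
            origins.append((y, x))  # kept so patch masks can be merged back
    return np.stack(patches), origins

page = np.zeros((100, 100))            # stand-in for a binarized page image
patches, origins = extract_patches(page, patch=64, stride=32)
print(patches.shape)                   # (9, 64, 64)
```

The recorded origins are what make the merge step possible: each patch's predicted masks are pasted back at `(y, x)` and overlapping fragments of the same line are then unioned, as in the paper's merging stage.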