Abstract
Digitizing historical documents is crucial for the preservation of cultural heritage. The digitization of documents written in Perso-Arabic scripts, however, presents multiple challenges. The Nastaliq calligraphy can be difficult to read even for a native speaker, and the four contextual forms of alphabet letters pose a complex task to current optical character recognition systems. To address these challenges, the presented study develops an approach for character recognition in Persian historical documents using few-shot learning with Siamese Neural Networks. A small, novel dataset is created from Persian historical documents for training and testing purposes. Experiments on the dataset resulted in a 94.75% testing accuracy for the few-shot learning task, and a 67% character recognition accuracy was observed on unseen documents for 111 distinct character classes.
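As an illustration of the few-shot setting described above, the following minimal sketch shows how a trained Siamese encoder can label a character image by comparing its embedding with a handful of labelled support embeddings. The abstract does not specify the architecture, loss, or distance metric, so the Euclidean distance, the contrastive loss, the toy two-dimensional embeddings, and the class names `alef_isolated` and `be_initial` are all assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(distance, same_class, margin=1.0):
    """Contrastive loss for one pair (an assumed training objective).

    Pairs from the same class are pulled together; pairs from different
    classes are pushed apart until they are at least `margin` apart.
    """
    if same_class:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def classify(query_embedding, support_set):
    """Return (label, distance) of the closest support example.

    `support_set` maps each class label to a short list of embeddings,
    i.e. the few labelled "shots" available for that class.
    """
    best_label, best_dist = None, math.inf
    for label, embeddings in support_set.items():
        d = min(euclidean(query_embedding, e) for e in embeddings)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


# Toy embeddings standing in for the output of a shared encoder
# (hypothetical values and hypothetical class names).
support = {
    "alef_isolated": [[0.0, 0.1], [0.1, 0.0]],
    "be_initial": [[1.0, 1.0]],
}
query = [0.9, 1.1]
label, dist = classify(query, support)
print(label, round(dist, 3))
```

Because classification reduces to a distance comparison against the support set, a new character class can in principle be added by supplying a few labelled examples, without retraining the encoder.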
Notes
- 3. Available at: https://huggingface.co/datasets/iarata/PHCR-DB25.
- 4. Available for academic purposes at https://huggingface.co/iarata/Few-Shot-PHCR.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hajebrahimi, A., Santoso, M.E., Kovacs, M., Kryssanov, V.V. (2024). Few-Shot Learning for Character Recognition in Persian Historical Documents. In: Nicosia, G., Ojha, V., La Malfa, E., La Malfa, G., Pardalos, P.M., Umeton, R. (eds) Machine Learning, Optimization, and Data Science. LOD 2023. Lecture Notes in Computer Science, vol 14505. Springer, Cham. https://doi.org/10.1007/978-3-031-53969-5_20
DOI: https://doi.org/10.1007/978-3-031-53969-5_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53968-8
Online ISBN: 978-3-031-53969-5
eBook Packages: Computer Science, Computer Science (R0)