Abstract
Strike-off text poses a major challenge in handwritten text recognition because it alters the semantic and structural information of the image. Although deep learning methods have achieved significant results in identifying and removing strike-offs, most of this work addresses only the Roman script. Deep learning approaches need a large amount of data and incur a high training cost for each script individually before they perform well. Owing to the complex nature of Indic scripts and the lack of sufficient data, research on strike-off removal in these scripts is limited. To address this problem, we propose using transfer learning to reduce both the data requirement and the training cost. With the objective of strike-off removal in multiple Indic scripts, we leverage a model pre-trained on the Roman script (the source domain) for strike-off removal in different target domains (Indic scripts). We consider handwritten text documents in 10 different Indic scripts and introduce 7 different types of strike-off into these documents. We apply Few-Shot Learning (FSL) and Zero-Shot Learning (ZSL) with various state-of-the-art deep generative models, using only a few samples of these Indic texts. We present an extensive analysis of the ZSL and FSL results in terms of the generalization capability of the source hypothesis and the strength of relatedness between the source and target domains. The results show that the adaptability of the source hypothesis is important for the right amount of transfer to take place. Scripts with an angular structure perform better than scripts with a round structure, as angular scripts are more closely related to the Roman script (the source script). FSL and ZSL approaches promise to reduce data requirements and training costs for strike-off removal.
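The difference between the two transfer settings can be illustrated with a deliberately small sketch. It is not the authors' method: the deep generative models are replaced here by a toy linear map on flattened "images", and every name and value (`W_source`, `W_target`, `d`, `k = 5`, the step count) is a hypothetical choice made for illustration only. ZSL applies the source model to the target domain unchanged, while FSL adapts it with a few gradient steps on a handful of target pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # size of a flattened toy "image" (hypothetical)

# Hypothetical source (Roman) and target (Indic) clean-up maps: related, not identical.
W_source = rng.normal(size=(d, d))
W_target = W_source + 0.3 * rng.normal(size=(d, d))

def loss(W, X, Y):
    """Mean over samples of the squared reconstruction error."""
    return float(np.mean(np.sum((X @ W - Y) ** 2, axis=1)))

def fine_tune(W, X, Y, steps=50):
    """Few-shot adaptation: gradient descent starting from the source weights."""
    W = W.copy()
    # Step size 1/L (L = Lipschitz constant of the gradient) keeps the loss non-increasing.
    lr = len(X) / (2.0 * np.linalg.norm(X, 2) ** 2)
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ W - Y) / len(X)
        W -= lr * grad
    return W

# A few target-domain pairs (struck input, clean output); k = 5 is arbitrary.
X_few = rng.normal(size=(5, d))
Y_few = X_few @ W_target

W_zero_shot = W_source                          # ZSL: source model used as-is
W_few_shot = fine_tune(W_source, X_few, Y_few)  # FSL: adapted on k target samples

zsl_loss = loss(W_zero_shot, X_few, Y_few)
fsl_loss = loss(W_few_shot, X_few, Y_few)
print(f"zero-shot loss: {zsl_loss:.4f}, few-shot loss: {fsl_loss:.4f}")
```

In this toy setting, the closer `W_target` is to `W_source`, the smaller the zero-shot loss, which loosely mirrors the role the abstract assigns to source and target relatedness.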
Data availability
The data set and materials will be made available on request.
Code availability
The code will be made available on request.
Author information
Contributions
All authors contributed equally to the paper.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Ethics approval
Not Applicable.
Consent to participate
Not Applicable.
Consent for publication
All authors have agreed with the content and give explicit consent to submit the work. All authors have obtained consent from the responsible authorities at the institute/organization where the work has been carried out.
About this article
Cite this article
Nigam, S., Behera, A.P., Gogoi, M. et al. Strike off removal in Indic scripts with transfer learning. Neural Comput & Applic 35, 12927–12943 (2023). https://doi.org/10.1007/s00521-023-08433-z
DOI: https://doi.org/10.1007/s00521-023-08433-z