Abstract
Text recognition systems typically work well for printed documents but struggle with handwritten documents because of varied writing styles, complex backgrounds, noise introduced during image acquisition, and text deformities such as strike-offs and underlines. These deformities alter the structural information of the text, which makes it difficult to restore the deformed images while maintaining that structure and preserving the semantic dependencies of local pixels. Current adversarial networks are unable to preserve the structural and semantic dependencies because they focus on individual pixel-to-pixel variation and encourage non-meaningful aspects of the images. To address this, we propose a Variable Cycle Generative Adversarial Network (VCGAN) that considers the perceptual quality of the images. By using a variable content loss, the Top-k Variable Loss (\(TV_{k}\)), VCGAN preserves the inter-dependence of spatially close pixels while removing the strike-off strokes. The similarity of the images is computed with \(TV_{k}\), which considers only the intensity variations that do not interfere with the semantic structures of the image. Our results show that VCGAN removes most deformities, achieving an F1 score of \(97.40\%\), and outperforms current state-of-the-art algorithms with a character error rate of \(7.64\%\) and a word accuracy of \(81.53\%\) when tested on a handwritten text recognition system.
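As a rough illustration of the top-k idea behind a loss such as \(TV_{k}\), the sketch below averages only the k largest per-pixel absolute differences between two images. This is a generic top-k aggregation and an assumption on our part: the abstract does not give the exact definition of \(TV_{k}\), and the function name, the use of absolute differences, and the toy pixel values are all hypothetical.

```python
from typing import Sequence


def average_top_k_loss(generated: Sequence[float],
                       target: Sequence[float],
                       k: int) -> float:
    """Mean of the k largest per-pixel absolute differences.

    Hypothetical helper: a generic top-k aggregation, not the
    paper's TV_k, whose definition is not stated in the abstract.
    """
    if len(generated) != len(target):
        raise ValueError("images must have the same number of pixels")
    if not 1 <= k <= len(generated):
        raise ValueError("k must be between 1 and the number of pixels")
    # Per-pixel absolute intensity difference.
    diffs = [abs(g - t) for g, t in zip(generated, target)]
    # Keep only the k largest differences and average them.
    top_k = sorted(diffs, reverse=True)[:k]
    return sum(top_k) / k


# Flattened 2x3 toy images with intensities in [0, 1] (made-up values).
clean = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
restored = [0.0, 0.1, 1.0, 0.5, 0.0, 0.0]

loss_k1 = average_top_k_loss(restored, clean, k=1)
loss_k2 = average_top_k_loss(restored, clean, k=2)
loss_all = average_top_k_loss(restored, clean, k=len(clean))
```

With k equal to the number of pixels the value reduces to the ordinary mean absolute error, while smaller k concentrates the loss on the pixels that differ most, so the value never increases as k grows.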
Notes
The model is weakly supervised, as it uses unpaired training data samples \(\{ (s_{i}, c_{j})_{i,j}^{N,M} \}\), where \(s \in \mathcal{S}\) and \(c \in \mathcal{C}\).
VCGAN computes the Adversarial Loss between the generated clean image (c) and the actual strike-off image (s), as discussed above in Sect. 3.
Additional results for \(k=(80,100)\) are reported in the Appendix.
The standard deviation values serve as confidence scores for the variation across the strike-off types considered in this work.
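The unpaired sampling mentioned in the first note can be sketched as follows: a strike-off sample \(s_{i}\) and a clean sample \(c_{j}\) are drawn independently, so no pair is assumed to show the same word. The function name, the string placeholders standing in for images, and the set sizes are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sample_unpaired_batch(struck: Sequence[T],
                          clean: Sequence[T],
                          batch_size: int,
                          rng: random.Random) -> List[Tuple[T, T]]:
    """Draw (s_i, c_j) pairs with i and j chosen independently."""
    return [(rng.choice(struck), rng.choice(clean))
            for _ in range(batch_size)]


# Placeholder data: N = 5 strike-off samples and M = 3 clean samples.
struck_images = [f"s_{i}" for i in range(5)]
clean_images = [f"c_{j}" for j in range(3)]

batch = sample_unpaired_batch(struck_images, clean_images,
                              batch_size=4, rng=random.Random(0))
```

Because the two indices are independent, the two sets may have different sizes (N and M), which is what distinguishes this setting from paired image-to-image translation.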
Contributions
All authors contributed equally to the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Appendix
About this article
Cite this article
Nigam, S., Behera, A.P., Verma, S. et al. Deformity removal from handwritten text documents using variable cycle GAN. IJDAR 27, 615–627 (2024). https://doi.org/10.1007/s10032-024-00466-x
DOI: https://doi.org/10.1007/s10032-024-00466-x