
Deformity removal from handwritten text documents using variable cycle GAN

  • Original Paper
  • Published in: International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Text recognition systems typically work well for printed documents but struggle with handwritten documents because of varied writing styles, background complexities, noise introduced by image acquisition, and deformed text such as strike-offs and underlines. These deformities alter the structural information, making it difficult to restore the deformed images while maintaining that structure and preserving the semantic dependencies of neighbouring pixels. Current adversarial networks cannot preserve these structural and semantic dependencies because they focus on individual pixel-to-pixel variation and reward non-meaningful aspects of the images. To address this, we propose a Variable Cycle Generative Adversarial Network (VCGAN) that accounts for the perceptual quality of the images. Using a variable content loss, the Top-k Variable Loss (\(TV_{k}\)), VCGAN preserves the inter-dependence of spatially close pixels while removing strike-off strokes. Image similarity is computed with \(TV_{k}\), accommodating intensity variations that do not interfere with the semantic structure of the image. Our results show that VCGAN removes most deformities, achieving an F1 score of \(97.40\%\), and outperforms current state-of-the-art algorithms with a character error rate of \(7.64\%\) and a word accuracy of \(81.53\%\) when evaluated on a handwritten text recognition system.
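Assuming \(TV_{k}\) follows the average top-k pattern of keeping only the k largest per-pixel differences, so that benign intensity variation is tolerated while strongly mismatched regions (e.g. residual strike-off strokes) dominate the penalty, the following minimal PyTorch sketch illustrates such a loss. The function name, signature, and the frac parameter are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of an average top-k content loss; names and the exact
# formulation are assumptions, not the paper's code.
import torch

def topk_content_loss(generated: torch.Tensor,
                      reference: torch.Tensor,
                      frac: float = 0.5) -> torch.Tensor:
    """Average the largest frac-fraction of per-pixel absolute differences.

    generated, reference: image batches of shape (B, C, H, W) in [0, 1].
    frac: fraction of pixels whose errors are kept; smaller values focus the
    penalty on the worst-reconstructed regions while tolerating small,
    semantically harmless intensity variation elsewhere.
    """
    per_pixel = (generated - reference).abs().flatten(start_dim=1)  # (B, C*H*W)
    k = max(1, int(frac * per_pixel.shape[1]))
    topk_vals, _ = per_pixel.topk(k, dim=1)   # k largest errors per image
    return topk_vals.mean()                   # average over kept pixels and batch
```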


Notes

  1. The model is weakly supervised, as it uses unpaired training data samples \(\{ (s_{i}, c_{j}) \}_{i,j=1}^{N,M}\), where \(s \in \mathcal{S}\) and \(c \in \mathcal{C}\).

  2. VCGAN computes the Adversarial Loss between the generated clean image (c) and the actual strike-off image (s), as discussed above in Sect. 3; a schematic training step is sketched after these notes.

  3. Additional results for \(k=(80,100)\) are reported in the Appendix.

  4. The standard deviation values serve as confidence scores for the variation across the strike-off types considered in this work.
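Notes 1 and 2 describe a weakly supervised, CycleGAN-style setup: training uses unpaired strike-off and clean samples, with an adversarial term and a cycle back to the source domain. The sketch below shows one hypothetical generator update in that style, reusing the topk_content_loss sketch given after the abstract; the network modules, loss weight, and discriminator pairing are placeholders and may differ from the architecture described in Sect. 3.

```python
# Hypothetical CycleGAN-style generator update on unpaired samples.
# G_clean, G_strike, D_clean are placeholder modules; lambda_cyc is an
# assumed weight, not a value reported in the paper.
import torch
import torch.nn.functional as F

def generator_step(G_clean, G_strike, D_clean, s, opt_g, lambda_cyc=10.0):
    """One update of the strike-off -> clean direction. The symmetric
    clean -> strike-off direction and the discriminator updates (which use
    the unpaired clean samples) are omitted for brevity."""
    opt_g.zero_grad()

    fake_clean = G_clean(s)            # attempt to remove strike-off strokes
    rec_strike = G_strike(fake_clean)  # cycle back to the strike-off domain

    # Adversarial term: the generated image should look like a clean sample.
    logits = D_clean(fake_clean)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))

    # Cycle-consistency with the top-k content loss keeps the underlying text
    # intact while ignoring benign intensity variation.
    cyc = topk_content_loss(rec_strike, s)

    loss = adv + lambda_cyc * cyc
    loss.backward()
    opt_g.step()
    return loss.item()
```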


Author information

Contributions

All authors contributed equally to the manuscript.

Corresponding author

Correspondence to Shivangi Nigam.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 2 Handwritten text recognition results (character error rate (CER) and word accuracy (WA)) on generated images for strike-off removal at various k values
Table 3 Comparison of performance measures of various state-of-the-art methods
Table 4 Performance metrics for strike-off removal for \(k=10\)
Table 5 Performance metrics for strike-off removal for \(k=50\)
Table 6 Performance metrics for strike-off removal for \(k=80\)
Table 7 Performance metrics for strike-off removal for \(k=100\)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Nigam, S., Behera, A.P., Verma, S. et al. Deformity removal from handwritten text documents using variable cycle GAN. IJDAR 27, 615–627 (2024). https://doi.org/10.1007/s10032-024-00466-x

