
Object Spotting in Historical Documents

  • Chapter
Digital Techniques for Heritage Presentation and Preservation

Abstract

Spotting is the task of locating a particular object in a collection of objects without explicitly knowing the entire content of that collection. In this chapter, we consider two types of objects: words in document images and artifacts in terracotta panel images. The proposed object spotting method is based on the Wave Kernel Signature (WKS), which is founded on quantum mechanics. The query image and the document/panel image are first smoothed, and the Scale Invariant Feature Transform (SIFT) detector is then used to obtain keypoints in both images. Each keypoint is described in terms of its WKS. The WKS descriptor represents the average probability of measuring a quantum mechanical particle at a specific location for a given quantum energy. For word spotting, a two-step search technique is introduced to find the region of interest in the document image under test. For panel images, a single-step search technique is used to spot the figures that correspond to a particular query image. The proposed method is tested on three historical Bangla handwritten datasets, one historical English handwritten dataset, and a terracotta panel image dataset. Its performance is evaluated using standard metrics.
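The abstract does not say how the WKS descriptor is computed, so the following is only a minimal sketch of the commonly used WKS definition, WKS(x, e) = C_e Σ_k φ_k(x)² exp(−(e − log λ_k)² / (2σ²)), where λ_k and φ_k are the eigenvalues and eigenvectors of a Laplacian. Everything else here is an assumption made for illustration and is not taken from the chapter: the function names `wave_kernel_signature` and `path_graph_laplacian`, the parameters `n_energies` and `sigma_scale`, and the use of a tiny path-graph Laplacian in place of whatever operator the authors build around each keypoint.

```python
import numpy as np


def wave_kernel_signature(eigvals, eigvecs, n_energies=8, sigma_scale=1.0):
    """Return an (n_points, n_energies) array of WKS values.

    eigvals, eigvecs: eigen-decomposition of a symmetric Laplacian,
    with orthonormal eigenvectors stored as columns.
    """
    # Discard the (near-)zero eigenvalue so that log(lambda) is defined.
    keep = eigvals > 1e-9
    log_lam = np.log(eigvals[keep])
    phi_sq = eigvecs[:, keep] ** 2                    # (n_points, n_modes)

    # Sample the energy (log-eigenvalue) axis uniformly.
    energies = np.linspace(log_lam.min(), log_lam.max(), n_energies)
    sigma = sigma_scale * (energies[1] - energies[0])

    # Gaussian band-pass weight for each (energy, mode) pair.
    diff = energies[:, None] - log_lam[None, :]
    weights = np.exp(-diff ** 2 / (2.0 * sigma ** 2))  # (n_energies, n_modes)
    weights /= weights.sum(axis=1, keepdims=True)      # the C_e normalisation

    return phi_sq @ weights.T


def path_graph_laplacian(n):
    """Hypothetical stand-in operator: Laplacian of a path graph on n nodes."""
    lap = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    lap[0, 0] = lap[-1, -1] = 1.0
    return lap


laplacian = path_graph_laplacian(6)
eigvals, eigvecs = np.linalg.eigh(laplacian)
wks = wave_kernel_signature(eigvals, eigvecs)
print(wks.shape)
```

Each row of `wks` is the descriptor of one location. Because the eigenvectors are orthonormal and the weights are normalised per energy, each column sums to one over all locations, which matches the reading of the WKS as a probability of finding the particle at a location for a given energy.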

Copyright information

© 2021 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Das, S., Mandal, S. (2021). Object Spotting in Historical Documents. In: Mukhopadhyay, J., Sreedevi, I., Chanda, B., Chaudhury, S., Namboodiri, V.P. (eds) Digital Techniques for Heritage Presentation and Preservation. Springer, Cham. https://doi.org/10.1007/978-3-030-57907-4_5

  • DOI: https://doi.org/10.1007/978-3-030-57907-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-57906-7

  • Online ISBN: 978-3-030-57907-4

  • eBook Packages: Computer Science (R0)
