Towards Modelling an Attention-Based Text Localization Process

Antonio Clavelli, Dimosthenis Karatzas, Josep Lladós, Mario Ferraro & Giuseppe Boccignone

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7887)

Included in the following conference series:

Iberian Conference on Pattern Recognition and Image Analysis

Abstract

This note introduces a visual attention model of text localization in real-world scenes. The core of the model, which is built upon the proto-object concept, is discussed. It is shown how such a dynamic mid-level representation of the scene can be derived within an action-perception loop that engages salience, text information value computation, and eye guidance mechanisms.

Preliminary results are presented that compare model-generated scanpaths with those eye-tracked from human subjects.
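
As a rough illustration of how salience and text information value might jointly drive eye guidance, the sketch below combines two small maps into a priority map and selects fixations by winner-take-all with inhibition of return. It is not the model of the paper: the linear weighting, the grid representation, the function names (`next_fixation`, `simulate_scanpath`), and the `weight` parameter are all assumptions made for illustration only.

```python
from typing import List, Tuple

Grid = List[List[float]]


def next_fixation(priority: Grid) -> Tuple[int, int]:
    """Return (row, col) of the highest-priority cell (winner-take-all)."""
    best = (0, 0)
    for r, row in enumerate(priority):
        for c, value in enumerate(row):
            if value > priority[best[0]][best[1]]:
                best = (r, c)
    return best


def simulate_scanpath(
    salience: Grid,
    text_value: Grid,
    n_fixations: int,
    weight: float = 0.5,  # hypothetical mixing weight, not taken from the paper
) -> List[Tuple[int, int]]:
    """Generate a toy scanpath from two maps of identical shape."""
    # Assumed linear combination of bottom-up salience and text value.
    priority = [
        [(1.0 - weight) * s + weight * t for s, t in zip(s_row, t_row)]
        for s_row, t_row in zip(salience, text_value)
    ]
    path: List[Tuple[int, int]] = []
    for _ in range(n_fixations):
        r, c = next_fixation(priority)
        path.append((r, c))
        # Inhibition of return: suppress the cell that was just fixated.
        priority[r][c] = float("-inf")
    return path


salience = [[0.1, 0.9], [0.2, 0.3]]
text_value = [[0.0, 0.1], [1.0, 0.2]]
scanpath = simulate_scanpath(salience, text_value, n_fixations=3)
print(scanpath)
```

With equal weighting, the cell with high text value is fixated before the cell that is merely salient; a scanpath produced this way could then be compared against eye-tracked fixations, which is the kind of comparison the abstract refers to.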

Author information

Authors and Affiliations

Computer Vision Center, Universitat Autonoma de Barcelona, Edificio O, Campus UAB, Bellaterra (Cerdanyola), 08193, Barcelona, Spain
Antonio Clavelli, Dimosthenis Karatzas & Josep Lladós
Dipartimento di Fisica, Università di Torino, via Pietro Giuria 1, 10125, Torino, Italy
Mario Ferraro
Dipartimento di Informatica, Università di Milano, via Comelico 39/41, Italy
Giuseppe Boccignone

Editor information

Editors and Affiliations

Institute for Systems and Robotics, Instituto Superior Técnico, Portugal
João M. Sanches
University of Alicante, Spain
Luisa Micó
INESC and University of Porto, Porto, Portugal
Jaime S. Cardoso

About this paper

Cite this paper

Clavelli, A., Karatzas, D., Lladós, J., Ferraro, M., Boccignone, G. (2013). Towards Modelling an Attention-Based Text Localization Process. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds) Pattern Recognition and Image Analysis. IbPRIA 2013. Lecture Notes in Computer Science, vol 7887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38628-2_35

DOI: https://doi.org/10.1007/978-3-642-38628-2_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38627-5
Online ISBN: 978-3-642-38628-2
eBook Packages: Computer Science, Computer Science (R0)
