Abstract
This note introduces a visual attention model of text localization in real-world scenes. The core of the model, which is built upon the proto-object concept, is discussed. It is shown how such a dynamic mid-level representation of the scene can be derived within an action-perception loop that engages salience, text information value computation, and eye guidance mechanisms.
Preliminary results are presented that compare model-generated scanpaths with those recorded by eye tracking from human subjects.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Clavelli, A., Karatzas, D., Lladós, J., Ferraro, M., Boccignone, G. (2013). Towards Modelling an Attention-Based Text Localization Process. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds) Pattern Recognition and Image Analysis. IbPRIA 2013. Lecture Notes in Computer Science, vol 7887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38628-2_35
DOI: https://doi.org/10.1007/978-3-642-38628-2_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38627-5
Online ISBN: 978-3-642-38628-2
eBook Packages: Computer Science (R0)