Abstract
In this chapter, we show how visual attention can enhance object recognition performance. This proposition is inspired by an idea introduced by Neisser (Cognitive psychology. Appleton-Century-Crofts, New York, 1967), who argued that object recognition in human perception consists of two steps: an attentional process selects the regions of interest, and complex object recognition processes are restricted to those regions. Recently, much work in computer vision has combined the two domains of visual attention and object recognition, an approach we call "attentive content-based image retrieval" (attentive CBIR). Tests of our attentive CBIR approach on the VOC 2005 dataset demonstrate that recognition performance can be approximately maintained while keeping only 40% of the SIFT keypoints, selected using classical saliency models. The proposed attentive CBIR framework can also be used to rank existing saliency models by their usefulness for CBIR. This ranking differs from the one obtained with classical ground truth such as eye-tracking data, which suggests that choosing saliency models by how well they predict eye fixations may be misleading when the target application is CBIR.
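The keypoint-filtering step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact implementation: it assumes keypoints are given as (x, y) pixel coordinates and that a saliency map (from any model) is available as a 2-D array; the function name and 40% default are illustrative.

```python
import numpy as np

def filter_keypoints_by_saliency(keypoints, saliency_map, keep_ratio=0.4):
    """Keep only the keypoints that fall on the most salient locations.

    keypoints    : array-like of (x, y) pixel coordinates, shape (N, 2)
    saliency_map : 2-D array; higher values mean more salient
    keep_ratio   : fraction of keypoints to retain (0.4 keeps 40%)
    """
    keypoints = np.asarray(keypoints, dtype=int)
    # Saliency value at each keypoint location (row index is y, column is x).
    scores = saliency_map[keypoints[:, 1], keypoints[:, 0]]
    n_keep = max(1, int(round(keep_ratio * len(keypoints))))
    # Indices of the n_keep most salient keypoints, in original order.
    top = np.sort(np.argsort(scores)[::-1][:n_keep])
    return keypoints[top]

# Toy usage: a 5x5 saliency ramp and four keypoints; keeping 50% retains
# the two keypoints lying on the most salient pixels.
saliency = np.arange(25, dtype=float).reshape(5, 5) / 24.0
kps = [(0, 0), (4, 4), (2, 2), (1, 3)]
kept = filter_keypoints_by_saliency(kps, saliency, keep_ratio=0.5)
```

The remaining keypoints would then be passed to SIFT description and matching as usual, reducing indexing cost roughly in proportion to `keep_ratio`.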
Notes
- 1. According to the Collins dictionary, saliency is the quality of being prominent, conspicuous, or striking.
References
Walther, D., Rutishauser, U., Koch, C., & Perona, P. (2005). Selective visual attention enables learning and recognition of multiple objects in cluttered scenes. Computer Vision and Image Understanding, 100(1–2), 41–63.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Itti, L., & Koch, C. (2001). Feature combination strategies for saliency-based visual attention systems. Journal of Electronic Imaging, 10, 161–169.
Hubel, D. H., & Wiesel, T. N. (2004). Brain and visual perception: The story of a 25-year collaboration (Vol. 31). New York: Oxford University Press.
Treisman, A., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12(1), 97–136.
Borji, A., & Itti, L. (2013). State-of-the-art in visual attention modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 185–207.
Tuytelaars, T., & Mikolajczyk, K. (2007). Local invariant feature detectors: A survey. Foundations and Trends in Computer Graphics and Vision, 3(3), 177–280.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.
Everingham, M., Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2009). The pascal visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2), 303–338.
Awad, D., Courboulay, V., & Revel, A. (2012). Saliency filtering of SIFT detectors: Application to CBIR. In Advanced Concepts for Intelligent Vision Systems (LNCS, Vol. 7517, pp. 290–300). Berlin/New York: Springer.
Dave, A., Dubey, R., & Ghanem, B. (2012). Do humans fixate on interest points? In Pattern Recognition (ICPR) (pp. 2784–2787).
Foo, J. J. (2007). Pruning SIFT for scalable near-duplicate image matching. In Australasian Database Conference, Ballarat, p. 9.
Alhwarin, F., Ristić-Durrant, D., & Gräser, A. (2010). Vf-sift: Very fast sift feature matching. In Proceedings of the 32Nd DAGM Conference on Pattern Recognition (pp. 222–231). Berlin/Heidelberg: Springer.
Koch, C., & Ullman, S. (1985). Shifts in selective visual attention: Towards the underlying neural circuitry. Human Neurobiology, 4(4), 219–227.
Itti, L., Koch, C., & Niebur, E. (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11), 1254–1259.
Le Meur, O., Le Callet, P., Barba, D., & Thoreau, D. (2006). A coherent computational approach to model bottom-up visual attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(5), 802–817.
Frintrop, S. (2005). VOCUS: A visual attention system for object detection and goal-directed search, Phd thesis, accepted at the University of Bonn in July 2005 (Lecture notes in artificial intelligence (LNAI), Vol. 3899/2006). Berlin/Heidelberg: Springer. ISBN: 3-540-32759-2.
Torralba, A. (2003). Modeling global scene factors in attention. Journal of the Optical Society of America A, 20(7), 1407–1418.
Oliva, A., Torralba, A., Castelhano, M. S., & Henderson, J. M. (2003). Top-down control of visual attention in object detection. In Proceedings of the IEEE International Conference on Image Processing (Vol. I, pp. 253–256), Barcelona, Spain.
Itti, L., & Baldi, P. (2009). Bayesian surprise attracts human attention. Vision Research, 49(10), 1295–1306.
Gao, D. G. D., Han, S. H. S., & Vasconcelos, N. (2009). Discriminant saliency, the detection of suspicious coincidences, and applications to visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6), 989–1005.
Gu, E., Wang, J., & Badler, N. I. (2007). Generating sequence of eye fixations using decision-theoretic attention model. In Attention in Cognitive Systems. Theories and Systems from an Interdisciplinary Viewpoint (pp. 277–292). Berlin/Heidelberg: Springer.
Bruce, N., & Tsotsos, J. (2006). Saliency based on information maximization. Advances in Neural Information Processing Systems, 18, 155–162.
Salah, A. A., Alpaydin, E., & Akarun, L. (2002). A selective attention-based method for visual pattern recognition with application to handwritten digit recognition and face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 420–425
Rao, R. P. N. (2005). Bayesian inference and attentional modulation in the visual cortex. NeuroReport, 16(16), 1843–1848.
Liu, Y. J., Luo, X., Xuan, Y. M., Chen, W. F., & Fu, X. L. (2011). Image retargeting quality assessment. EUROGRAPHICS, 30(2).
Hou, X., & Zhang, L. (2007). Saliency detection: A spectral residual approach. In 2007 IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8).
Peters, R. J., & Itti, L. (2007). Beyond bottom-up: Incorporating task-dependent influences into a computational model of spatial attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis (pp. 1–8).
Kienzle, W., Franz, M. O., Schölkopf, B., & Wichmann, F. A. (2009). Center-surround patterns emerge as optimal predictors for human saccade targets. Journal of Vision, 9(5), 7.1–15.
Ramström, O., & Christensen, H. I. (2002). Visual attention using game theory. In Proceedings of the Second International Workshop on Biologically Motivated Computer Vision (BMCV’02), London (pp. 462–471). Springer.
Perreira Da Silva, M., Courboulay, V., & Estraillier, P. (2011). Objective validation of a dynamical and plausible computational model of visual attention. In IEEE European Workshop on Visual Information Processing, Paris (pp. 223–228).
Riche, N., Duvinage, M., Mancas, M., Gosselin, B., & Dutoit, T. (2013). Saliency and human fixations: State-of-the-art and study of comparison metrics. Proceedings of IEEE 13th International Conference on Computer Vision (pp. 1153–1160). Sydney, Australia.
Perreira Da Silva, M., Courboulay, V., Prigent, A., & Estraillier, P. (2010). Evaluation of preys/predators systems for visual attention simulation. In VISAPP 2010 – International Conference on Computer Vision Theory and Applications, Angers (pp. 275–282). INSTICC.
Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.
Frintrop, S., & Jensfelt, P. (2008). Attentional landmarks and active gaze control for visual slam. IEEE Transactions on Robotics, 24(5), 1054–1065.
Frintrop, S. (2011). Towards attentive robots. Paladyn. Journal of Behavioral Robotics, 2, 64–70. doi:10.2478/s13230-011-0018-4.
Treisman, A., & Gormican, S. (1988). Feature analysis in early vision: Evidence from search asymmetries. Psychological Review, 95(1), 15–48.
Wolfe, J. M. (1994). Guided search 2.0: A revised model of visual search. Psychonomic Bulletin & Review, 1(2), 202–238.
Perreira Da Silva, M., Courboulay, V., Prigent, A., & Estraillier, P. (2010). Evaluation of preys/predators systems for visual attention simulation. In P. Richard & J. Braz (Eds.), VISAPP 2010 – International Conference on Computer Vision Theory and Applications, Angers (Vol. 2, pp. 275–282). INSTICC.
Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2010). The pascal visual object classes (voc) challenge. International Journal of Computer Vision, 88(2), 303–338.
Sivic, J., Russell, B. C., Efros, A. A., Zisserman, A., & Freeman, W. T. (2005). Discovering objects and their location in images. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Beijing.
Kondor, R., & Jebara, T. (2003). A kernel between sets of vectors. In Machine Learning: Tenth International Conference, Washington, DC.
Laaksonen, J. (2000). PicSOM – content-based image retrieval with self-organizing maps. Pattern Recognition Letters, 21(13–14), 1199–1207.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874.
Mikolajczyk, K. (2004). Scale & affine invariant interest point detectors. International Journal of Computer Vision, 60(1), 63–86.
Lindeberg, T. (1998). Feature detection with automatic scale selection. International Journal of Computer Vision, 30(2), 79–116.
Awad, D., Mancas, M., Riche, N., Courboulay, V., & Revel, A. (2015). A cbir-based evaluations framework for visual attention models. In 23rd European Signal Processing Conference (EUSIPCO), Nice.
Harel, J., Koch, C., & Perona, P. (2006). Graph-based visual saliency. In Proceedings of Advances in Neural Information Processing Systems (NIPS), Vancouver.
Seo, H. J., & Milanfar, P. (2009). Static and spacetime visual saliency detection by self-resemblance. Journal of Vision, 9(12), 1–27.
Murray, N., Vanrell, M., Otazu, X., & Alejandro Parraga, C. (2011). Saliency estimation using a nonparametric low-level vision model. In Computer Vision and Pattern Recognition (CVPR), Colorado Springs (pp. 433–440).
Riche, N., Mancas, M., Gosselin, B., & Dutoit, T. (2012). Rare: A new bottom-up saliency model. In Proceedings of the IEEE International Conference of Image Processing (ICIP), Lake Buena Vista.
Copyright information
© 2016 Springer Science+Business Media New York
Cite this chapter
Awad, D., Courboulay, V., Revel, A. (2016). Attentive Content-Based Image Retrieval. In: Mancas, M., Ferrera, V., Riche, N., Taylor, J. (eds) From Human Attention to Computational Attention. Springer Series in Cognitive and Neural Systems, vol 10. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-3435-5_19
Print ISBN: 978-1-4939-3433-1
Online ISBN: 978-1-4939-3435-5