Abstract
This paper introduces a selective attention system that guides users to regions of interest more effectively by adaptively selecting and using spatial and temporal features according to the input images. Although the proposed system is based on a typical bottom-up approach, it improves on existing studies in how features are extracted and saliencies are computed. In the proposed system, spatial saliencies carry dynamic information, with features selected adaptively according to the input images, while temporal saliencies carry information on individual moving objects, associated with one another through multi-resolution feature analysis. In addition, when a spatial saliency and a temporal saliency are combined, the system measures the activity of each input saliency map, computes weights that change dynamically with that activity, and merges the two maps according to those weights. To evaluate the proposed system, comparative experiments against existing systems were conducted on diverse test images; the results show that the proposed system produces output closer to human visual recognition than previous systems.
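The abstract describes the activity-weighted combination step but gives no formulas on this page. A minimal sketch of that idea, assuming the variance of each normalized map as the activity measure and sum-normalized weights (both hypothetical choices, not the paper's stated method), might look like:

```python
import numpy as np

def activity(saliency_map: np.ndarray) -> float:
    """Activity proxy: variance of the min-max normalized map.
    (Illustrative choice; the paper's exact measure is not given here.)"""
    m = saliency_map.astype(np.float64)
    rng = m.max() - m.min()
    if rng == 0:
        return 0.0  # flat map carries no activity
    m = (m - m.min()) / rng
    return float(m.var())

def combine_saliency(spatial: np.ndarray, temporal: np.ndarray) -> np.ndarray:
    """Combine spatial and temporal saliency maps with weights that
    change dynamically with each map's measured activity."""
    a_s, a_t = activity(spatial), activity(temporal)
    total = a_s + a_t
    if total == 0:
        w_s = w_t = 0.5  # neither map is active: fall back to equal weights
    else:
        w_s, w_t = a_s / total, a_t / total
    return w_s * spatial + w_t * temporal
```

Under this scheme, whichever map shows more internal contrast dominates the combined saliency, which matches the abstract's claim that the weights adapt to the input rather than being fixed.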
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (NRF-2011-0023674), and by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korean Government (MOTIE: Ministry of Trade, Industry and Energy) (No. N0002429).
Cite this article
Cheoi, K.J., & Kim, M.H. (2018). Adaptive spatiotemporal feature extraction and dynamic combining methods for selective visual attention system. Wireless Personal Communications, 98, 3227–3243. https://doi.org/10.1007/s11277-017-5043-0