Abstract
Visual attention is one of the most important mechanisms by which the visual system selects and interprets the redundant outside world. Because of an information bottleneck, the human visual system cannot process all incoming visual information simultaneously; to reduce this redundant input, it concentrates on the dominant parts of a scene. Predicting which parts those are is commonly known as visual saliency prediction. This paper proposes a new psychophysically oriented saliency prediction architecture inspired by the multi-channel model of visual cortex function in humans. The model combines opponent color channels, a wavelet transform, a wavelet energy map, and a contrast sensitivity function to extract low-level image features and closely approximate the low-level human visual system. The proposed model is evaluated on several datasets, including MIT1003, MIT300, TORONTO, SID4VAM, and UCF Sports, and its saliency prediction performance is compared quantitatively and qualitatively with that of other state-of-the-art models. Our model achieves stable and superior performance across different metrics on natural images, psychophysical synthetic images, and dynamic videos. We further show that Fourier- and spectrum-inspired saliency prediction models outperform other state-of-the-art non-neural-network models, and even deep neural network models, on psychophysical synthetic images, and we argue that deep neural networks require specific architectures and training objectives to predict saliency on psychophysical synthetic images reliably. Finally, the proposed model can serve as a computational model of the primate low-level visual system and help us understand its mechanisms. The project page is available at: https://sinodanishspain.github.io/HVS_SaliencyModel/.
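Since the abstract does not spell out implementation details, the following is a minimal, illustrative Python sketch (NumPy and PyWavelets) of a multi-channel pipeline of this kind. The specific opponent transform, the Haar wavelet with three decomposition levels, the Mannos-Sakrison CSF parameters, and the spectral weighting step are assumptions made for illustration, not the published model's exact configuration.

```python
import numpy as np
import pywt  # PyWavelets


def rgb_to_opponent(img):
    """Split an RGB image (H, W, 3, floats in [0, 1]) into three
    opponent channels: luminance, red-green and blue-yellow."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return [(r + g + b) / 3.0,    # achromatic (luminance) channel
            r - g,                # red-green opponency
            (r + g) / 2.0 - b]    # blue-yellow opponency


def csf_weights(shape):
    """Mannos-Sakrison band-pass CSF sampled on the FFT frequency grid
    (1974 parameters; used here as a plain spectral weighting)."""
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    f = np.hypot(fx, fy) * max(h, w)  # radial frequency, cycles/image
    csf = 2.6 * (0.0192 + 0.114 * f) * np.exp(-(0.114 * f) ** 1.1)
    return csf / csf.max()


def wavelet_energy_map(channel, wavelet="haar", level=3):
    """Sum squared wavelet detail coefficients over orientations and
    scales, upsampled back to the input resolution."""
    h, w = channel.shape
    coeffs = pywt.wavedec2(channel, wavelet, level=level)
    energy = np.zeros((h, w))
    for details in coeffs[1:]:        # skip the approximation band
        for band in details:          # (horizontal, vertical, diagonal)
            e = band ** 2
            # nearest-neighbour upsample back to (h, w)
            e = np.repeat(e, h // e.shape[0] + 1, axis=0)[:h]
            e = np.repeat(e, w // e.shape[1] + 1, axis=1)[:, :w]
            energy += e
    return energy


def predict_saliency(img):
    """Opponent channels -> wavelet energy -> CSF weighting -> fusion."""
    fused = np.zeros(img.shape[:2])
    for channel in rgb_to_opponent(img):
        energy = wavelet_energy_map(channel)
        spectrum = np.fft.fft2(energy) * csf_weights(energy.shape)
        fused += np.abs(np.fft.ifft2(spectrum))
    return (fused - fused.min()) / (fused.max() - fused.min() + 1e-12)
```

A full implementation would likely normalize each channel's energy before fusion and tune the CSF per opponent channel; the sketch only conveys the multi-channel structure described above.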
Data and Code availability
The code performing the main experiments described in this article is available at the project page: https://sinodanishspain.github.io/HVS_SaliencyModel/.
Abbreviations
- HVS: Human Vision System
- V1: Primary Visual Cortex
- ICL: Incremental Coding Length
- CNN: Convolutional Neural Network
- DNN: Deep Neural Network
- WT: Wavelet Transform
- IWT: Inverse Wavelet Transform
- AUC: Area Under Curve
- NSS: Normalized Scanpath Saliency
- CC: Pearson's Correlation Coefficient
- SIM: Similarity or Histogram Intersection
- IG: Information Gain
- KL: Kullback–Leibler Divergence
- CSFs: Contrast Sensitivity Functions
- FT: Fourier Transform
- DWT: Discrete Wavelet Transform
- IDWT: Inverse Discrete Wavelet Transform
- LGN: Lateral Geniculate Nucleus
- GT: Ground Truth
Acknowledgements
I thank the anonymous reviewers, whose suggestions helped to improve and clarify this manuscript. This work was partially funded by GVA Grisolía-P/2019/035.
Ethics declarations
Conflict of interest
The author declares no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
See Fig. 5.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Q. Saliency prediction based on multi-channel models of visual processing. Machine Vision and Applications 34, 47 (2023). https://doi.org/10.1007/s00138-023-01405-2