Saliency prediction based on multi-channel models of visual processing

  • Original Paper
  • Machine Vision and Applications

Abstract

Visual attention is one of the most important mechanisms for selecting and understanding information from a redundant external world. The human vision system cannot process all information simultaneously because of the visual information bottleneck. To reduce redundant visual input, the human visual system focuses mainly on the dominant parts of a scene. Predicting these parts is commonly known as visual saliency map prediction. This paper proposes a new psychophysically oriented saliency prediction architecture inspired by the multi-channel model of human visual cortex function. The model consists of opponent color channels, a wavelet transform, a wavelet energy map, and a contrast sensitivity function, which together extract low-level image features and approximate the low-level human visual system as closely as possible. The proposed model is evaluated on several datasets, including MIT1003, MIT300, TORONTO, SID4VAM, and UCF Sports. We also compare its saliency prediction performance, quantitatively and qualitatively, with that of other state-of-the-art models. Our model achieved stable and better performance across different metrics on natural images, psychophysical synthetic images, and dynamic videos. Additionally, our results suggest that Fourier- and spectral-inspired saliency prediction models outperform other state-of-the-art non-neural-network models, and even deep neural network models, on psychophysical synthetic images. We also suggest that deep neural networks need specific architectures and goals to predict saliency on psychophysical synthetic images better and more reliably. Finally, the proposed model could serve as a computational model of the primate low-level visual system and help us understand its mechanisms. The project page is available at: https://sinodanishspain.github.io/HVS_SaliencyModel/.
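The feature-extraction stages named in the abstract (opponent color channels, wavelet transform, wavelet energy map, contrast sensitivity weighting) can be illustrated with a minimal Python sketch. It is not the author's implementation: the opponent-channel formulas, the choice of the Haar wavelet, the function names, and the per-level gains `level_gain` (a crude stand-in for a contrast sensitivity function) are all assumptions chosen for illustration.

```python
import numpy as np


def opponent_channels(rgb):
    """Split an (H, W, 3) float RGB image into three opponent channels.

    The linear combinations below are a common textbook choice (assumption).
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    achromatic = (r + g + b) / 3.0      # luminance-like channel
    red_green = r - g                   # red versus green
    blue_yellow = b - (r + g) / 2.0     # blue versus yellow
    return np.stack([achromatic, red_green, blue_yellow], axis=0)


def haar_dwt2(x):
    """One level of an orthonormal 2-D Haar transform (H and W must be even)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    approx = (a + b + c + d) / 2.0
    d_cols = (a - b + c - d) / 2.0       # difference between columns
    d_rows = (a + b - c - d) / 2.0       # difference between rows
    d_diag = (a - b - c + d) / 2.0       # diagonal difference
    return approx, (d_cols, d_rows, d_diag)


def saliency_sketch(rgb, levels=3, level_gain=(0.6, 1.0, 0.8)):
    """Accumulate weighted wavelet detail energy over channels and scales.

    `level_gain` is a hypothetical band-pass weighting, not a measured CSF.
    """
    h, w = rgb.shape[:2]
    if h % 2 ** levels or w % 2 ** levels:
        raise ValueError("height and width must be divisible by 2**levels")
    energy = np.zeros((h, w))
    for channel in opponent_channels(rgb):
        approx = channel
        for lvl in range(levels):
            approx, details = haar_dwt2(approx)
            band_energy = sum(band ** 2 for band in details)
            scale = 2 ** (lvl + 1)
            # Nearest-neighbour upsampling back to the input resolution.
            upsampled = np.kron(band_energy, np.ones((scale, scale)))
            energy += level_gain[lvl] * upsampled
    span = energy.max() - energy.min()
    return (energy - energy.min()) / span if span > 0 else energy


# Usage: a gray background with a small red square.
image = np.full((32, 32, 3), 0.5)
image[12:20, 12:20] = (1.0, 0.0, 0.0)
saliency = saliency_sketch(image)
```

The resulting map is normalized to the range [0, 1]. A real contrast sensitivity function would be defined over spatial frequency and applied per channel, so the three scalar gains here only hint at that step.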


Data and Code availability

The code that performs the main experiments described in this article is available at the project page: https://sinodanishspain.github.io/HVS_SaliencyModel/.

Notes

  1. https://www.python.org/.

  2. http://sipi.usc.edu/database/database.php?volume=misc.

  3. http://r0k.us/graphics/kodak/.

  4. https://github.com/cvzoya/saliency.

  5. https://github.com/ArcherFMY/sal_eval_toolbox/.

  6. https://www.crcv.ucf.edu/data/UCF_Sports_Action.php.

Abbreviations

  • HVS: Human Vision System
  • V1: Primary Visual Cortex
  • ICL: Incremental Coding Length
  • CNN: Convolution Neural Network
  • DNN: Deep Neural Network
  • WT: Wavelet Transform
  • IWT: Inverse Wavelet Transform
  • AUC: Area Under Curve
  • NSS: Normalized Scanpath Saliency
  • CC: Pearson’s Correlation Coefficient
  • SIM: Similarity or Histogram Intersection
  • IG: Information Gain
  • KL: Kullback–Leibler Divergence
  • CSFs: Contrast Sensitivity Functions
  • FT: Fourier Transform
  • DWT: Discrete Wavelet Transform
  • IDWT: Inverse Discrete Wavelet Transform
  • LGN: Lateral Geniculate Nucleus
  • GT: Ground Truth
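Several of the evaluation metrics abbreviated above (NSS, CC, SIM, KL) have standard definitions that fit in a few lines. The sketch below follows commonly used formulations; the function names and the stabilizing constant `EPS` are assumptions, and the evaluation code used in the article may differ in details.

```python
import numpy as np

EPS = 1e-12  # hypothetical small constant to avoid division by zero and log(0)


def nss(saliency, fixations):
    """Normalized Scanpath Saliency: mean z-scored saliency at fixated pixels."""
    z = (saliency - saliency.mean()) / (saliency.std() + EPS)
    return z[fixations.astype(bool)].mean()


def cc(saliency, ground_truth):
    """Pearson's correlation coefficient between two maps."""
    a = (saliency - saliency.mean()) / (saliency.std() + EPS)
    b = (ground_truth - ground_truth.mean()) / (ground_truth.std() + EPS)
    return (a * b).mean()


def sim(saliency, ground_truth):
    """Histogram intersection of the two maps, each normalized to sum to 1."""
    p = saliency / (saliency.sum() + EPS)
    q = ground_truth / (ground_truth.sum() + EPS)
    return np.minimum(p, q).sum()


def kl(saliency, ground_truth):
    """Kullback-Leibler divergence of the prediction from the ground truth."""
    p = saliency / (saliency.sum() + EPS)
    q = ground_truth / (ground_truth.sum() + EPS)
    return (q * np.log(EPS + q / (p + EPS))).sum()


# Usage: compare a toy ground-truth density with itself and score one fixation.
gt_map = np.arange(1.0, 17.0).reshape(4, 4)
fixation_map = np.zeros((4, 4))
fixation_map[3, 3] = 1  # a single fixation on the most salient pixel
scores = {
    "NSS": nss(gt_map, fixation_map),
    "CC": cc(gt_map, gt_map),
    "SIM": sim(gt_map, gt_map),
    "KL": kl(gt_map, gt_map),
}
```

Higher NSS, CC, and SIM indicate better agreement, whereas a lower KL value is better. NSS takes a binary fixation map, while CC, SIM, and KL take a continuous ground-truth density.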

Acknowledgements

I thank the anonymous reviewers, whose suggestions helped to improve and clarify this manuscript. This work was partially funded by GVA Grisolía-P/2019/035.

Author information

Corresponding author

Correspondence to Qiang Li.

Ethics declarations

Conflict of interest

The author declares no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Fig. 5.

Table 5 Saliency prediction models. The right-hand side of the table shows the functional category that inspired each model. Most saliency prediction models were inspired by biological/cognitive, Fourier/spectral, information-theoretic, or probabilistic principles

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, Q. Saliency prediction based on multi-channel models of visual processing. Machine Vision and Applications 34, 47 (2023). https://doi.org/10.1007/s00138-023-01405-2

  • DOI: https://doi.org/10.1007/s00138-023-01405-2
