DOI: 10.1145/3552469.3555714

Simulating Visual Mechanisms by Sequential Spatial-Channel Attention for Image Quality Assessment

Published: 10 October 2022

Abstract

As a subjective concept, image quality assessment (IQA) is significantly affected by perceptual mechanisms. Two mutually influencing mechanisms, namely spatial attention and contrast sensitivity, are particularly important for IQA. This paper explores a Transformer-based deep learning approach to modeling these two mechanisms. By converting contrast sensitivity into an attention representation, a unified multi-head attention module is applied to both spatial and channel features in the Transformer encoder to simulate the two mechanisms in IQA. Sequential spatial-channel self-attention is proposed to avoid the expensive computation of the classical Transformer model. In addition, since image rescaling can affect perceived quality, zero-padding and masking with specially assigned attention weights are used to handle arbitrary image resolutions without rescaling. Evaluation results on publicly available large-scale IQA databases demonstrate the outstanding performance and strong generalization of the proposed IQA model.
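To make the mechanism concrete, below is a minimal PyTorch sketch of sequential spatial-channel self-attention; the class name, the single-head formulation, and the mask handling are illustrative assumptions, not the authors' implementation. Attention runs first over token positions and then over feature channels, so the cost scales roughly as O(N²C + C²N) rather than the O((NC)²) of joint attention, and a padding mask stands in for the abstract's zero-padding-and-masking treatment of arbitrary resolutions.

```python
# Illustrative sketch only: single-head attention over each axis in turn,
# with a key-side padding mask so zero-padded tokens get no attention weight.
from typing import Optional

import torch
import torch.nn.functional as F
from torch import nn


class SequentialSpatialChannelAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Q/K/V projections for the spatial branch; the channel branch below
        # reuses the features directly to keep the sketch short.
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, pad_mask: Optional[torch.Tensor] = None):
        # x: (batch, tokens, dim); pad_mask: (batch, tokens), True at positions
        # created by zero-padding images to a common size.
        _, n, c = x.shape

        # Spatial self-attention: tokens attend to tokens, cost O(n^2 * c).
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / c ** 0.5            # (b, n, n)
        if pad_mask is not None:
            # Assign -inf so padded tokens receive effectively zero attention,
            # avoiding any need to rescale the input image.
            scores = scores.masked_fill(pad_mask[:, None, :], float("-inf"))
        x = self.norm1(x + F.softmax(scores, dim=-1) @ v)

        # Channel self-attention: channels attend to channels, cost O(c^2 * n).
        xc = x.transpose(1, 2)                                 # (b, c, n)
        scores = xc @ xc.transpose(-2, -1) / n ** 0.5          # (b, c, c)
        x = self.norm2(x + (F.softmax(scores, dim=-1) @ xc).transpose(1, 2))
        return x


# Example: a batch of two feature maps flattened to 100 tokens of width 64.
layer = SequentialSpatialChannelAttention(dim=64)
out = layer(torch.randn(2, 100, 64))
print(out.shape)  # torch.Size([2, 100, 64])
```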

Supplementary Material

MP4 File (video-ssca-qoevma22.mp4)
Video presentation briefly describing the paper "Simulating Visual Mechanisms by Sequential Spatial-Channel Attention for Image Quality Assessment" at the QoEVMA'22 workshop.




    Published In

    QoEVMA '22: Proceedings of the 2nd Workshop on Quality of Experience in Visual Multimedia Applications
    October 2022
    75 pages
    ISBN:9781450394994
    DOI:10.1145/3552469
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. contrast sensitivity
    2. image quality assessment (iqa)
    3. sequential spatial-channel attention (ssca)
    4. spatial attention
    5. transformer

    Qualifiers

    • Research-article

    Conference

    MM '22

    Acceptance Rates

    QoEVMA '22 Paper Acceptance Rate 8 of 14 submissions, 57%;
    Overall Acceptance Rate 14 of 20 submissions, 70%
