DOI: 10.1145/3552469.3555714

Simulating Visual Mechanisms by Sequential Spatial-Channel Attention for Image Quality Assessment

Published: 10 October 2022

Abstract

As a subjective concept, image quality assessment (IQA) is significantly affected by perceptual mechanisms. Two mutually influencing mechanisms, namely spatial attention and contrast sensitivity, are particularly important for IQA. This paper explores a Transformer-based deep learning approach to modeling these two mechanisms. By converting contrast sensitivity into an attention representation, a unified multi-head attention module is applied to both spatial and channel features in the Transformer encoder to simulate the two mechanisms in IQA. Sequential spatial-channel self-attention is proposed to avoid the expensive computation of the classical Transformer model. In addition, since image rescaling can affect perceived quality, zero-padding and masking with specially assigned attention weights are used to handle arbitrary image resolutions without rescaling. Evaluation results on publicly available large-scale IQA databases demonstrate the outstanding performance and strong generalization of the proposed IQA model.
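To make the mechanism concrete, below is a minimal PyTorch sketch of sequential spatial-channel self-attention; the class name, the single-head formulation, and the mask handling are illustrative assumptions, not the authors' implementation. Attention runs first over token positions and then over feature channels, so the cost scales roughly as O(N²C + C²N) rather than the O((NC)²) of joint attention, and a padding mask stands in for the abstract's zero-padding-and-masking treatment of arbitrary resolutions.

```python
# Illustrative sketch only: single-head attention over each axis in turn,
# with a key-side padding mask so zero-padded tokens get no attention weight.
from typing import Optional

import torch
import torch.nn.functional as F
from torch import nn


class SequentialSpatialChannelAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Q/K/V projections for the spatial branch; the channel branch below
        # reuses the features directly to keep the sketch short.
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, pad_mask: Optional[torch.Tensor] = None):
        # x: (batch, tokens, dim); pad_mask: (batch, tokens), True at positions
        # created by zero-padding images to a common size.
        _, n, c = x.shape

        # Spatial self-attention: tokens attend to tokens, cost O(n^2 * c).
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / c ** 0.5            # (b, n, n)
        if pad_mask is not None:
            # Assign -inf so padded tokens receive effectively zero attention,
            # avoiding any need to rescale the input image.
            scores = scores.masked_fill(pad_mask[:, None, :], float("-inf"))
        x = self.norm1(x + F.softmax(scores, dim=-1) @ v)

        # Channel self-attention: channels attend to channels, cost O(c^2 * n).
        xc = x.transpose(1, 2)                                 # (b, c, n)
        scores = xc @ xc.transpose(-2, -1) / n ** 0.5          # (b, c, c)
        x = self.norm2(x + (F.softmax(scores, dim=-1) @ xc).transpose(1, 2))
        return x


# Example: a batch of two feature maps flattened to 100 tokens of width 64.
layer = SequentialSpatialChannelAttention(dim=64)
out = layer(torch.randn(2, 100, 64))
print(out.shape)  # torch.Size([2, 100, 64])
```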

Supplementary Material

MP4 File (video-ssca-qoevma22.mp4)
Video presentation briefly describing the paper "Simulating Visual Mechanisms by Sequential Spatial-Channel Attention for Image Quality Assessment" at the QoEVMA'22 workshop.




    Published In

    QoEVMA '22: Proceedings of the 2nd Workshop on Quality of Experience in Visual Multimedia Applications
    October 2022
    75 pages
    ISBN:9781450394994
    DOI:10.1145/3552469
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. contrast sensitivity
    2. image quality assessment (iqa)
    3. sequential spatial-channel attention (ssca)
    4. spatial attention
    5. transformer

    Qualifiers

    • Research-article

    Conference

    MM '22

    Acceptance Rates

    QoEVMA '22 Paper Acceptance Rate 8 of 14 submissions, 57%;
    Overall Acceptance Rate 14 of 20 submissions, 70%
