research-article

Multi-viewport based 3D convolutional neural network for 360-degree video quality assessment

Published: 01 May 2022

Abstract

360-degree videos, also known as omnidirectional or panoramic videos, give the user an immersive experience that 2D videos cannot provide. It is crucial to assess the perceived quality of 360-degree video. However, 2D video quality assessment (VQA) methods are unsuitable for 360-degree videos, and few 360-degree video quality assessment (360VQA) methods exist. This paper proposes a multi-viewport based 3D convolutional neural network for 360VQA (3D-360VQA). First, a 2D planar video is easily divided into rectangular blocks that serve as video patches for a deep neural network, but this way of forming patches is unsuitable for a 360-degree video. Thus, a multi-viewport based method for forming video patches is proposed. Second, although deep neural networks have achieved great success in image quality assessment (IQA), few have been developed for 360VQA. A 3D convolution based deep neural network is therefore proposed to predict the perceived quality of 360-degree videos. Publicly available 360-degree video datasets are used to evaluate the proposed method. The experimental results show that the proposed method is suitable for 360-degree video and outperforms existing methods, which verifies the effectiveness of the proposed network architecture.
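To make the idea of a viewport-based video patch concrete, the following is a minimal sketch and not the authors' implementation. It assumes the 360-degree video is stored in equirectangular format and that a viewport is a rectilinear (gnomonic) view defined by a yaw, a pitch, and a field of view; the abstract specifies none of these details. The function names `viewport_to_equirect` and `extract_video_patch`, the 90-degree field of view, and the nearest-neighbour sampling are all hypothetical choices made for illustration.

```python
import math


def viewport_to_equirect(yaw, pitch, fov, size, width, height):
    """Map each pixel of a size x size rectilinear viewport to a (row, col)
    position in an equirectangular frame. Angles are in radians.
    Illustrative assumption: gnomonic projection, nearest-neighbour lookup."""
    half = math.tan(fov / 2.0)  # half-extent of the image plane at unit distance
    coords = []
    for i in range(size):
        row = []
        for j in range(size):
            # Pixel centre on the image plane (x to the right, y upwards).
            x = (2.0 * (j + 0.5) / size - 1.0) * half
            y = (1.0 - 2.0 * (i + 0.5) / size) * half
            # Rotate the viewing ray (x, y, 1) by pitch, then by yaw.
            y_p = y * math.cos(pitch) + math.sin(pitch)
            z_p = -y * math.sin(pitch) + math.cos(pitch)
            x_r = x * math.cos(yaw) + z_p * math.sin(yaw)
            z_r = -x * math.sin(yaw) + z_p * math.cos(yaw)
            # Convert the ray direction to longitude and latitude.
            lon = math.atan2(x_r, z_r)
            lat = math.atan2(y_p, math.hypot(x_r, z_r))
            # Longitude wraps around; latitude is clamped at the poles.
            col = int((lon / (2.0 * math.pi) + 0.5) * width) % width
            r = min(height - 1, max(0, int((0.5 - lat / math.pi) * height)))
            row.append((r, col))
        coords.append(row)
    return coords


def extract_video_patch(frames, yaw, pitch, fov=math.radians(90), size=32):
    """Sample the same viewport from every frame, giving a T x size x size
    patch that a 3D convolutional network could take as input."""
    height, width = len(frames[0]), len(frames[0][0])
    coords = viewport_to_equirect(yaw, pitch, fov, size, width, height)
    return [[[frame[r][c] for (r, c) in row] for row in coords]
            for frame in frames]


# Toy usage: four identical frames whose pixel value encodes (row, col).
frame = [[r * 1000 + c for c in range(360)] for r in range(180)]
patch = extract_video_patch([frame] * 4, yaw=0.0, pitch=0.0, size=3)
```

Calling `extract_video_patch` for several (yaw, pitch) pairs would yield one patch per viewport. How many viewports to use and where to place them on the sphere is a design choice this sketch leaves open.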

Information

Published In

Multimedia Tools and Applications, Volume 81, Issue 12
May 2022
1382 pages

Publisher

Kluwer Academic Publishers
United States

Publication History

Published: 01 May 2022
Accepted: 04 January 2022
Revision received: 13 February 2021
Received: 24 November 2020

Author Tags

1. Deep learning
2. Video quality assessment (VQA)
3. 360-degree video
4. Convolutional neural network (CNN)
5. Virtual reality (VR)

Qualifiers

• Research-article
