
Perceptual Quality Assessment of Omnidirectional Audio-Visual Signals

  • Conference paper
Artificial Intelligence (CICAI 2023)

Abstract

Omnidirectional videos (ODVs) play an increasingly important role in application fields such as medicine, education, advertising, and tourism. Assessing the quality of ODVs is important for service providers seeking to improve users' Quality of Experience (QoE). However, most existing quality assessment studies for ODVs focus only on visual distortions, ignoring that the overall QoE also depends on the accompanying audio signals. In this paper, we first establish a large-scale audio-visual quality assessment dataset for omnidirectional videos, which includes 375 distorted omnidirectional audio-visual (A/V) sequences generated from 15 high-quality pristine omnidirectional A/V contents, together with the corresponding perceptual audio-visual quality scores. Then, we design three baseline methods for full-reference omnidirectional audio-visual quality assessment (OAVQA), which combine existing state-of-the-art single-mode audio and video QA models via multimodal fusion strategies. We validate the effectiveness of the A/V multimodal fusion method for OAVQA on our dataset, which provides a new benchmark for omnidirectional QoE evaluation. Our dataset is available at https://github.com/iamazxl/OAVQA.
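
To illustrate what combining single-mode audio and video QA scores can look like, the following is a minimal Python sketch of two generic late-fusion rules (a weighted sum and a product). The abstract does not specify the three fusion strategies used in the paper, so the function names, the min-max normalization, and the weight `w` are assumptions made for illustration only, not the authors' method.

```python
from typing import Sequence


def min_max(scores: Sequence[float]) -> list[float]:
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def _check_lengths(video: Sequence[float], audio: Sequence[float]) -> None:
    # Each A/V sequence needs exactly one video score and one audio score.
    if len(video) != len(audio):
        raise ValueError("video and audio score lists must have equal length")


def fuse_weighted_sum(video: Sequence[float], audio: Sequence[float],
                      w: float = 0.5) -> list[float]:
    """Hypothetical fusion rule: w * video + (1 - w) * audio on normalized scores."""
    _check_lengths(video, audio)
    v, a = min_max(video), min_max(audio)
    return [w * vi + (1.0 - w) * ai for vi, ai in zip(v, a)]


def fuse_product(video: Sequence[float], audio: Sequence[float]) -> list[float]:
    """Hypothetical fusion rule: product of normalized video and audio scores."""
    _check_lengths(video, audio)
    v, a = min_max(video), min_max(audio)
    return [vi * ai for vi, ai in zip(v, a)]


if __name__ == "__main__":
    # Made-up per-sequence scores from a video model and an audio model.
    video_scores = [10, 20, 30]
    audio_scores = [1, 3, 5]
    print(fuse_weighted_sum(video_scores, audio_scores))
    print(fuse_product(video_scores, audio_scores))
```

In practice the fused scores would be compared against the subjective A/V quality scores in the dataset, for example with a rank correlation; that evaluation step is omitted here.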

Acknowledgement

This work is supported by the National Key R&D Project of China (2021YFE0206700), NSFC (61831015, 62101325, 62101326, 62271312, 62225112), Shanghai Pujiang Program (22PJ1407400), Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102), and STCSM (22DZ2229005).

Author information

Corresponding author

Correspondence to Guangtao Zhai.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhu, X. et al. (2024). Perceptual Quality Assessment of Omnidirectional Audio-Visual Signals. In: Fang, L., Pei, J., Zhai, G., Wang, R. (eds) Artificial Intelligence. CICAI 2023. Lecture Notes in Computer Science, vol 14474. Springer, Singapore. https://doi.org/10.1007/978-981-99-9119-8_46

  • DOI: https://doi.org/10.1007/978-981-99-9119-8_46

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-9118-1

  • Online ISBN: 978-981-99-9119-8

  • eBook Packages: Computer Science, Computer Science (R0)
