
Compensation of audio data with a high frequency components for realistic media FTV

Published in: Multimedia Tools and Applications

Abstract

As with the concept of free-viewpoint TV (FTV), audio data should be rendered in accordance with the video data. However, when only a minimal number of microphones is used, it is difficult to acquire an audio signal accurate enough to be rendered for an image at a freely chosen viewpoint. In particular, the high frequency components (HFC) are degraded because of the polar pattern of the microphone. This degradation of the HFC prevents faithful signal restoration and reduces the clarity of the reproduced sound. In this paper, a compensation method for the degraded HFC of the audio signal is proposed to produce an immersive audio effect in realistic media. Our experimental results show that the low frequency components (LFC) of the audio signal suffer little directional degradation despite the polar pattern of the microphone, and that the HFC can be compensated by adapting the attenuation slope of the LFC. This research is expected to be helpful for producing an immersive audio effect for realistic media.
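The abstract does not spell out the compensation algorithm; the following Python sketch is only a rough illustration of the stated idea, in which the attenuation slope estimated from the LFC of a frame's spectrum is extrapolated to boost the HFC. The band edges (`lfc_band`, `hfc_start`), the linear fit in dB versus log-frequency, and the 12 dB boost cap are assumptions made for the example, not values taken from the paper.

```python
# Illustrative sketch only; not the authors' published implementation.
import numpy as np

def compensate_hfc(frame, fs, lfc_band=(100.0, 2000.0), hfc_start=4000.0):
    """Boost the high-frequency components of one audio frame by
    extrapolating the attenuation slope estimated from the LFC band."""
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    mag_db = 20.0 * np.log10(np.abs(spectrum) + 1e-12)

    # Fit a straight line (dB vs. log2 frequency) over the LFC band.
    lfc = (freqs >= lfc_band[0]) & (freqs <= lfc_band[1])
    slope, intercept = np.polyfit(np.log2(freqs[lfc]), mag_db[lfc], 1)

    # Raise HFC bins toward the extrapolated LFC trend, capping the boost.
    hfc = freqs >= hfc_start
    target_db = slope * np.log2(freqs[hfc]) + intercept
    gain_db = np.clip(target_db - mag_db[hfc], 0.0, 12.0)
    spectrum[hfc] *= 10.0 ** (gain_db / 20.0)

    return np.fft.irfft(spectrum, n=len(frame))
```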





Acknowledgments

This work was supported by the ICT R&D program of MSIP/IITP, Republic of Korea [B0101-15-0042, Volumetric 3D Image and 3D Audio Realization Technology].

Author information

Corresponding author

Correspondence to Seong-Kweon Kim.

About this article


Cite this article

Yeo, SD., Cho, TI., Kim, JU. et al. Compensation of audio data with a high frequency components for realistic media FTV. Multimed Tools Appl 76, 11361–11376 (2017). https://doi.org/10.1007/s11042-016-3713-7

