Abstract
In free-viewpoint TV (FTV), audio should be rendered to match the video for the selected viewpoint. However, when only a minimal number of microphones is used, it is difficult to acquire an audio signal accurate enough to render audio for the image at the chosen viewpoint. In particular, the high-frequency components (HFC) are degraded because of the microphone's polar pattern. This HFC degradation makes signal restoration imperfect and reduces the clarity of the audio for the listener. In this paper, we propose a method that compensates for the degraded HFC of the audio signal in order to produce an immersive audio effect for realistic media. Our experimental results show that the low-frequency components (LFC) of the audio signal exhibit little directional degradation despite the effect of the microphone's polar pattern, and that the HFC can be compensated by adapting the attenuation slope of the LFC. This research is expected to help produce immersive audio effects for realistic media.
Acknowledgments
The work was supported by the ICT R&D program of MSIP/IITP, Republic of Korea, [B0101-15-0042, Volumetric 3D Image and 3D Audio Realization Technology].
Cite this article
Yeo, SD., Cho, TI., Kim, JU. et al. Compensation of audio data with a high frequency components for realistic media FTV. Multimed Tools Appl 76, 11361–11376 (2017). https://doi.org/10.1007/s11042-016-3713-7
DOI: https://doi.org/10.1007/s11042-016-3713-7