Abstract
Modern low bit-rate speech coders require efficient coding of the linear predictive coding (LPC) coefficients. Immittance Spectral Frequencies (ISF) and Line Spectral Frequencies (LSF) are currently the most efficient transmission parameters for LPC coefficients in wideband speech coding. In this paper, we propose a new hybrid coding scheme with multi-coder vector quantization for efficient coding of ISF parameters of the wideband speech coder AMR-WB. The coding system was designed based on four structured quantizers under noiseless channel conditions: the split vector quantizer (SVQ), the switched split vector quantizer (SSVQ), the multi-stage vector quantizer (MSVQ), and the multi switched split vector quantizer (MSSVQ). Simulation results show that our proposed AMR-WB ISF coding scheme outperforms conventional wideband ISF quantizers at lower bit rates.
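To make the simplest of the four structured quantizers concrete, the following is a minimal sketch of split vector quantization (SVQ): the input vector is cut into consecutive sub-vectors and each one is quantized independently with its own codebook. This is an illustration under stated assumptions, not the coding scheme proposed in the paper. The function names, the toy codebooks, and the unweighted squared Euclidean distance are all hypothetical choices made for the example; a real ISF quantizer would use trained codebooks and, commonly, a weighted distance.

```python
from typing import List, Sequence

Vector = Sequence[float]
Codebook = Sequence[Vector]


def nearest_index(x: Vector, codebook: Codebook) -> int:
    """Return the index of the codevector closest to x.

    Assumption: plain squared Euclidean distance (no perceptual weighting).
    """
    best_index, best_dist = 0, float("inf")
    for i, c in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(x, c))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def svq_encode(x: Vector, codebooks: Sequence[Codebook]) -> List[int]:
    """Split x into consecutive sub-vectors and quantize each one separately.

    The dimension of each sub-vector is taken from its codebook.
    """
    if sum(len(cb[0]) for cb in codebooks) != len(x):
        raise ValueError("sub-vector dimensions must sum to len(x)")
    indices, start = [], 0
    for cb in codebooks:
        dim = len(cb[0])
        indices.append(nearest_index(x[start:start + dim], cb))
        start += dim
    return indices


def svq_decode(indices: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Rebuild the quantized vector by concatenating the selected codevectors."""
    out: List[float] = []
    for i, cb in zip(indices, codebooks):
        out.extend(cb[i])
    return out


# Toy example: a 4-dimensional vector split into two 2-dimensional parts,
# each with a hypothetical 2-entry (1-bit) codebook.
toy_codebooks = [
    [[0.0, 0.0], [0.1, 0.25]],
    [[0.75, 0.9], [0.5, 0.5]],
]
toy_indices = svq_encode([0.1, 0.2, 0.7, 0.9], toy_codebooks)
toy_reconstruction = svq_decode(toy_indices, toy_codebooks)
```

Only the indices are transmitted, one per sub-vector, so the bit cost is the sum of the per-split codebook sizes in bits. Splitting trades some quantization efficiency for much smaller codebooks and a cheaper search than a single full-dimension codebook would need.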
About this article
Cite this article
Bouzid, M., Meziane, N. & Cheraitia, SE. Multi-coder vector quantizer for transparent coding of wideband speech ISF parameters. Int J Speech Technol 27, 121–132 (2024). https://doi.org/10.1007/s10772-024-10084-x
DOI: https://doi.org/10.1007/s10772-024-10084-x