
Multi-coder vector quantizer for transparent coding of wideband speech ISF parameters

International Journal of Speech Technology

Abstract

Modern low bit-rate speech coders require efficient coding of the linear predictive coding (LPC) coefficients. Immittance spectral frequencies (ISF) and line spectral frequencies (LSF) are currently the most efficient representations for transmitting LPC coefficients in wideband speech coding. In this paper, we propose a new hybrid coding scheme, based on multi-coder vector quantization, for efficient coding of the ISF parameters of the AMR-WB wideband speech coder. The coding system is built from four structured quantizers designed under noiseless channel conditions: the split vector quantizer (SVQ), the switched split vector quantizer (SSVQ), the multi-stage vector quantizer (MSVQ), and the multi switched split vector quantizer (MSSVQ). Simulation results show that the proposed AMR-WB ISF coding scheme outperforms conventional wideband ISF quantizers at lower bit rates.
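
To make two of the structured quantizers named in the abstract concrete, the following is a minimal sketch of split VQ and multi-stage VQ encoding, plus one possible way to combine coders. It is an illustration under stated assumptions, not the authors' design: the function names, the 4-dimensional toy vector, and the tiny codebooks are hypothetical, and the selection rule (run every coder, keep the one with the lowest squared error) is only an assumed reading of "multi-coder". A real coder would use trained codebooks and would more likely judge candidates by a spectral distortion measure.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]
Codebook = List[Vector]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: Codebook, x: Sequence[float]) -> int:
    """Index of the codevector closest to x (exhaustive search)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))


def svq_encode(x: Vector, codebooks: List[Codebook]) -> Tuple[List[int], Vector]:
    """Split VQ: cut x into consecutive sub-vectors and quantize each on its own."""
    indices: List[int] = []
    recon: Vector = []
    start = 0
    for cb in codebooks:
        dim = len(cb[0])
        i = nearest(cb, x[start:start + dim])
        indices.append(i)
        recon.extend(cb[i])
        start += dim
    return indices, recon


def msvq_encode(x: Vector, stages: List[Codebook]) -> Tuple[List[int], Vector]:
    """Multi-stage VQ: each stage quantizes the residual left by earlier stages."""
    indices: List[int] = []
    recon: Vector = [0.0] * len(x)
    for cb in stages:
        residual = [xi - ri for xi, ri in zip(x, recon)]
        i = nearest(cb, residual)
        indices.append(i)
        recon = [ri + ci for ri, ci in zip(recon, cb[i])]
    return indices, recon


Encoder = Callable[[Vector], Tuple[List[int], Vector]]


def multi_coder_encode(x: Vector, coders: Dict[str, Encoder]):
    """ASSUMED selection rule: try every coder, keep the lowest-distortion result."""
    best = None
    for name, encode in coders.items():
        indices, recon = encode(x)
        d = sq_dist(x, recon)
        if best is None or d < best[3]:
            best = (name, indices, recon, d)
    return best


# Hypothetical toy codebooks (not trained, not taken from the paper).
SVQ_BOOKS = [[[0.0, 0.0], [1.0, 0.0]], [[0.5, 0.5], [0.0, 1.0]]]
MSVQ_STAGES = [
    [[0.5, 0.5, 0.5, 0.5], [1.0, 0.0, 0.5, 0.5]],
    [[0.0, 0.0, 0.0, 0.0], [-0.1, 0.1, -0.1, 0.1]],
]

if __name__ == "__main__":
    x = [0.9, 0.1, 0.4, 0.6]
    coders = {
        "SVQ": lambda v: svq_encode(v, SVQ_BOOKS),
        "MSVQ": lambda v: msvq_encode(v, MSVQ_STAGES),
    }
    name, indices, recon, dist = multi_coder_encode(x, coders)
    print(name, indices, recon, dist)
```

In this sketch a decoder would need to know which coder was chosen, so a scheme of this kind would spend a few extra bits per frame on a coder label in addition to the codebook indices.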

Author information

Corresponding author

Correspondence to Merouane Bouzid.

About this article

Cite this article

Bouzid, M., Meziane, N. & Cheraitia, SE. Multi-coder vector quantizer for transparent coding of wideband speech ISF parameters. Int J Speech Technol 27, 121–132 (2024). https://doi.org/10.1007/s10772-024-10084-x

  • DOI: https://doi.org/10.1007/s10772-024-10084-x