Abstract
Over the last five decades, extensive research in speech compression has produced a variety of speech coding techniques, and the search for more efficient coders continues in different parts of the world. In this paper we propose an alternative method for speech coding. The proposed technique uses the discrete wavelet transform to decompose the signal, and uses wavelet energy to differentiate between active voice regions and silence regions in the speech signal. Depending on the region's status, the system chooses a different thresholding strategy, which leads to better compression without any loss of speech intelligibility. The proposed method is evaluated in terms of qualitative and quantitative parameters. We also propose an alternative parameter to mean opinion score (MOS) values, hereafter called the System Recognition Rate.
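The pipeline described in the abstract (wavelet decomposition, energy-based voice activity detection, then region-dependent thresholding) can be illustrated with a minimal sketch. This is not the authors' implementation: the one-level Haar wavelet, the frame-wise processing, hard thresholding, and every numeric threshold (`energy_threshold`, `active_thr`, `silence_thr`) are assumptions chosen only for illustration, and all function names are hypothetical.

```python
import math

def haar_dwt(frame):
    """One-level Haar DWT of an even-length frame (assumed wavelet).

    Returns (approximation, detail) coefficient lists.
    """
    s = 1 / math.sqrt(2)
    approx = [(frame[i] + frame[i + 1]) * s for i in range(0, len(frame) - 1, 2)]
    detail = [(frame[i] - frame[i + 1]) * s for i in range(0, len(frame) - 1, 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the frame from its coefficients."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out

def wavelet_energy(coeffs):
    """Sum of squared wavelet coefficients."""
    return sum(c * c for c in coeffs)

def code_frame(frame, energy_threshold=0.01, active_thr=0.01, silence_thr=0.1):
    """Classify a frame as voice or silence, then threshold accordingly.

    All three threshold values are illustrative assumptions, not values
    taken from the paper.
    """
    approx, detail = haar_dwt(frame)
    # Per-sample wavelet energy decides between active voice and silence.
    energy = (wavelet_energy(approx) + wavelet_energy(detail)) / len(frame)
    is_active = energy > energy_threshold
    # Gentle threshold for voice, aggressive threshold for silence.
    thr = active_thr if is_active else silence_thr
    keep = lambda c: c if abs(c) >= thr else 0.0  # hard thresholding
    return is_active, [keep(c) for c in approx], [keep(c) for c in detail]
```

Coefficients set to zero need not be stored or transmitted, so the harsher threshold on silence frames is where most of the extra compression would come from in this sketch; `haar_idwt` reconstructs a frame from the surviving coefficients at the decoder side.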
Cite this article
Joseph, S.M., Babu, A.P. Wavelet energy based voice activity detection and adaptive thresholding for efficient speech coding. Int J Speech Technol 19, 537–550 (2016). https://doi.org/10.1007/s10772-014-9240-x
DOI: https://doi.org/10.1007/s10772-014-9240-x