Wavelet energy based voice activity detection and adaptive thresholding for efficient speech coding

Shijo M. Joseph¹ &
Anto P. Babu¹

Abstract

Over the last five decades, extensive research in speech compression has produced a variety of speech coding techniques, and the search for more efficient speech coding continues worldwide. In this paper we propose an alternative method for better speech coding. The proposed technique uses the discrete wavelet transform to decompose the signal, and wavelet energy is used to differentiate between active voice regions and silence regions in the speech signal. Depending on the status of each region, a different thresholding strategy is chosen, which leads to better compression without any loss of speech intelligibility. The proposed method is evaluated in terms of qualitative and quantitative parameters. We also propose an alternative parameter to mean opinion score (MOS) values, hereafter called the System Recognition Rate.
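The pipeline outlined in the abstract (wavelet decomposition, energy-based voice activity detection, region-dependent thresholding) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the wavelet, the decomposition depth, the frame length, or any threshold values, so the one-level Haar transform, the function names, and the parameters `frame_len`, `energy_gate`, `t_active`, and `t_silence` below are all assumptions chosen only to keep the example short and runnable.

```python
import math


def haar_dwt(frame):
    """One-level orthonormal Haar DWT of an even-length frame.

    The Haar wavelet is an assumption made for brevity.
    Returns (approximation, detail) coefficient lists.
    """
    s = math.sqrt(2.0)
    approx = [(frame[i] + frame[i + 1]) / s for i in range(0, len(frame), 2)]
    detail = [(frame[i] - frame[i + 1]) / s for i in range(0, len(frame), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the frame from its coefficients."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def wavelet_energy(approx, detail):
    """Sum of squared wavelet coefficients of one frame."""
    return sum(c * c for c in approx) + sum(c * c for c in detail)


def hard_threshold(coeffs, t):
    """Zero every coefficient whose magnitude is below t."""
    return [c if abs(c) >= t else 0.0 for c in coeffs]


def code_frames(signal, frame_len=8, energy_gate=0.5,
                t_active=0.05, t_silence=1.0):
    """Label each frame active or silent by wavelet energy, then threshold.

    All four parameters are hypothetical values for illustration.
    Active frames get a gentle threshold so speech detail survives;
    silent frames get an aggressive one so most coefficients become zero.
    Returns a list of (is_active, approx, detail) tuples.
    """
    results = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        approx, detail = haar_dwt(frame)
        active = wavelet_energy(approx, detail) > energy_gate
        t = t_active if active else t_silence
        results.append((active,
                        hard_threshold(approx, t),
                        hard_threshold(detail, t)))
    return results
```

Calling `code_frames` on a signal made of a low-amplitude frame followed by a high-amplitude frame should label the first as silence and the second as active voice. The zeroed coefficients in silent frames are where the compression gain would come from once the coefficients are entropy coded, a step this sketch leaves out.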

Author information

Authors and Affiliations

School of Information Science & Technology Kannur University, Mangattuparamba Campus, Kannur, 670 567, Kerala, India
Shijo M. Joseph & Anto P. Babu

Corresponding author

Correspondence to Shijo M. Joseph.

About this article

Cite this article

Joseph, S.M., Babu, A.P. Wavelet energy based voice activity detection and adaptive thresholding for efficient speech coding. Int J Speech Technol 19, 537–550 (2016). https://doi.org/10.1007/s10772-014-9240-x

Received: 18 December 2012
Accepted: 19 June 2014
Published: 06 June 2016
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10772-014-9240-x
