Speech enhancement using Teager energy operated ERB-like perceptual wavelet packet decomposition

Abstract

In recent years, wavelet packet (WP) based speech enhancement techniques have gained popularity because of their inherent noise-suppression properties, and they have proved more robust and efficient than short-time Fourier transform based methods. This work proposes a speech enhancement method using Teager energy operated equivalent rectangular bandwidth (ERB)-like WP decomposition. A twenty-four sub-band perceptual wavelet packet decomposition (PWPD) structure is implemented according to the auditory ERB scale. The ERB-based decomposition structure is used because the distribution of centre frequencies on the ERB scale is similar to the frequency response of the human cochlea. The Teager energy operator is applied to estimate the threshold value for the PWPD coefficients. Finally, Wiener filtering is applied to remove low-frequency noise before the final reconstruction stage. The proposed method is evaluated on a database of Hindi sentences corrupted under six noise conditions. Its performance is analysed in terms of several speech quality parameters and output signal-to-noise ratio (SNR) levels. The results indicate that the proposed technique outperforms some traditional speech enhancement algorithms at all SNR levels.
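
As a rough illustration of two building blocks named in the abstract, the sketch below computes the discrete Teager energy operator in its standard form, psi[n] = x[n]^2 - x[n-1]*x[n+1], and an ERB value for a given centre frequency. It is not the authors' implementation. The function names `teager_energy` and `erb_bandwidth_hz` are hypothetical, and the ERB formula used is the Glasberg and Moore approximation, which is an assumption because the abstract does not say which ERB formula the method uses.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two endpoints have no neighbour on one side and are dropped,
    so the result has len(x) - 2 samples.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def erb_bandwidth_hz(f_hz):
    """ERB in Hz at centre frequency f_hz (in Hz).

    Assumption: Glasberg and Moore approximation; the abstract does
    not state which ERB formula is used.
    """
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


# Sanity check on a pure tone A*cos(w*n): the operator returns the
# constant A^2 * sin(w)^2, which tracks both amplitude and frequency.
A, w = 2.0, 0.3
tone = [A * math.cos(w * n) for n in range(64)]
psi = teager_energy(tone)
expected = A * A * math.sin(w) ** 2
```

In a thresholding scheme of the kind the abstract describes, an operator like this would be applied to the coefficients of each sub-band rather than to the raw waveform; how the threshold is then derived from the Teager energy is not specified in the abstract and is left out here.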

Author information

Authors and Affiliations

Department of Electronics & Communication, BIT, Mesra, Jharkhand, India
Anirban Bhowmick & Mahesh Chandra
Department of Electronics & Communication, ABES Engineering College, Ghaziabad, India
Astik Biswas

Corresponding author

Correspondence to Anirban Bhowmick.

About this article

Cite this article

Bhowmick, A., Chandra, M. & Biswas, A. Speech enhancement using Teager energy operated ERB-like perceptual wavelet packet decomposition. Int J Speech Technol 20, 813–827 (2017). https://doi.org/10.1007/s10772-017-9448-7

Received: 21 March 2017
Accepted: 31 July 2017
Published: 17 August 2017
Issue Date: December 2017
DOI: https://doi.org/10.1007/s10772-017-9448-7
