Non-negative Matrix Factorization Based Noise Reduction for Noise Robust Automatic Speech Recognition

Seon Man Kim, Ji Hun Park, Hong Kook Kim, Sung Joo Lee & Yun Keun Lee

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7191)

Included in the following conference series:

International Conference on Latent Variable Analysis and Signal Separation

Abstract

In this paper, we propose a noise reduction method based on non-negative matrix factorization (NMF) for noise-robust automatic speech recognition (ASR). Most noise reduction methods applied to ASR front-ends were developed to suppress background noise that is assumed to be stationary. In contrast, the proposed method attenuates non-target noise with a hybrid approach that combines Wiener filtering with an NMF technique. The motivation is that Wiener filtering is suited to reducing stationary noise, whereas NMF is suited to reducing non-stationary noise. ASR experiments show that a system employing the proposed approach improves the average word error rate by 11.9%, 22.4%, and 5.2% compared to systems employing the two-stage mel-warped Wiener filter, the minimum mean square error log-spectral amplitude estimator, and NMF with a Wiener post-filter, respectively.
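
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of NMF-based noise reduction, not the authors' method. It assumes magnitude spectrograms stored as non-negative NumPy arrays (frequency bins by frames), uses the standard multiplicative updates for the generalized Kullback-Leibler divergence, and applies a Wiener-style ratio mask built from the speech and noise reconstructions. The function names (train_bases, estimate_activations, denoise), the ranks, and the iteration counts are hypothetical choices made for illustration; the stationary-noise Wiener filtering stage of the hybrid approach is not modeled here.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def train_bases(V, rank, n_iter=200, seed=0):
    """Learn `rank` spectral bases W from a non-negative spectrogram V
    (bins x frames) with KL-divergence multiplicative updates."""
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    W = rng.random((n_bins, rank)) + EPS
    H = rng.random((rank, n_frames)) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + EPS))) / (W.T @ ones + EPS)
        W *= ((V / (W @ H + EPS)) @ H.T) / (ones @ H.T + EPS)
    return W


def estimate_activations(V, W, n_iter=200, seed=0):
    """Estimate activations H for fixed bases W so that V is close to W @ H."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    ones = np.ones_like(V)
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + EPS))) / (W.T @ ones + EPS)
    return H


def denoise(V_noisy, W_speech, W_noise):
    """Split V_noisy over pre-trained speech and noise bases, then apply
    a Wiener-style soft mask (assumed here; one common choice)."""
    W = np.hstack([W_speech, W_noise])
    H = estimate_activations(V_noisy, W)
    k = W_speech.shape[1]
    speech_est = W_speech @ H[:k]   # speech part of the reconstruction
    noise_est = W_noise @ H[k:]     # noise part of the reconstruction
    mask = speech_est / (speech_est + noise_est + EPS)  # values in [0, 1]
    return mask * V_noisy
```

In such a setup, the speech and noise bases would be trained beforehand on separate speech-only and noise-only spectrograms, and only the activations would be estimated for each noisy utterance. Because the mask lies between 0 and 1, the output never exceeds the noisy magnitude in any time-frequency bin.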

Author information

Authors and Affiliations

School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, 500-712, Korea
Seon Man Kim, Ji Hun Park & Hong Kook Kim
Speech/Language Information Research Center, Electronics and Telecommunications Research Institute, Daejeon, 305-700, Korea
Sung Joo Lee & Yun Keun Lee

Editor information

Fabian Theis, Andrzej Cichocki, Arie Yeredor, Michael Zibulevsky

Cite this paper

Kim, S.M., Park, J.H., Kim, H.K., Lee, S.J., Lee, Y.K. (2012). Non-negative Matrix Factorization Based Noise Reduction for Noise Robust Automatic Speech Recognition. In: Theis, F., Cichocki, A., Yeredor, A., Zibulevsky, M. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2012. Lecture Notes in Computer Science, vol 7191. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28551-6_42

DOI: https://doi.org/10.1007/978-3-642-28551-6_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28550-9
Online ISBN: 978-3-642-28551-6
eBook Packages: Computer Science, Computer Science (R0)
