Abstract
Speech denoising can improve speech intelligibility and listening comfort in voice communication and recognition applications used in noisy environments. It can also be used to enhance old recordings. Most speech enhancement methods are intrusive: while removing noise, they also cause some loss in the signal component. In this paper, we propose a method based on the common vector approach (CVA) for reducing such losses in single-channel enhancement algorithms. In the proposed technique, overlapping frames of speech samples are grouped into classes according to their similarity, and the common and difference vectors of each class are separated using CVA. Since the noise component is uncorrelated and therefore presumably concentrated in the difference part, the difference vectors are denoised using a conventional denoising technique, and the sample frames are reconstructed by combining the common vector with the denoised difference parts. This operation leaves the common vector untouched, which helps to secure some improvement even for heavily noise-corrupted data. Compared with state-of-the-art methods, promising results are obtained in terms of several speech quality measures.
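The class-level step described above (separate the common vector, denoise only the difference parts, recombine) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify how frames are grouped, how overlapping frames are merged, or which denoiser is applied. The split below assumes the usual CVA construction for a class with fewer frames than samples per frame, where the difference subspace is spanned by the differences from the first frame. The names `cva_split`, `enhance_class`, and `soft_threshold_spectrum` are hypothetical, and the spectral soft threshold is only a stand-in for the unspecified denoising technique.

```python
import numpy as np

def cva_split(frames, tol=1e-10):
    """Split the rows of `frames` (one class of similar frames) into a
    shared common vector and per-frame difference parts.

    Assumption: fewer frames than samples per frame, so the difference
    subspace is spanned by (frame_i - frame_0).
    """
    frames = np.asarray(frames, dtype=float)
    if frames.shape[0] < 2:
        # A single frame has no difference subspace.
        return frames[0].copy(), np.zeros_like(frames)
    diffs = frames[1:] - frames[0]
    # Orthonormal basis of the difference subspace from the SVD.
    _, s, vt = np.linalg.svd(diffs, full_matrices=False)
    basis = vt[s > tol]
    # Projection of every frame onto the difference subspace.
    diff_parts = frames @ basis.T @ basis
    # The remainder is identical for all frames of the class.
    common = frames[0] - diff_parts[0]
    return common, diff_parts

def soft_threshold_spectrum(v, thresh):
    """Stand-in denoiser (an assumption, not the paper's choice):
    shrink DFT magnitudes along the last axis by `thresh`."""
    spec = np.fft.rfft(v, axis=-1)
    mag = np.abs(spec)
    gain = np.maximum(mag - thresh, 0.0) / np.maximum(mag, 1e-12)
    return np.fft.irfft(spec * gain, n=v.shape[-1], axis=-1)

def enhance_class(frames, denoiser):
    """Denoise only the difference parts and add the untouched common vector."""
    common, diff_parts = cva_split(frames)
    return common + denoiser(diff_parts)
```

With an identity denoiser, `enhance_class` returns the input frames unchanged, which reflects the point made in the abstract: whatever the denoiser does to the difference parts, the common vector is carried through unmodified.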
Seke, E., Özkan, K. A new speech signal denoising algorithm using common vector approach. Int J Speech Technol 21, 659–670 (2018). https://doi.org/10.1007/s10772-018-9529-2