Abstract
Previous research shows that approaches based on the total variability space (TVS) followed by Gaussian probabilistic linear discriminant analysis (GPLDA) deal effectively with convolutional noise (such as channel noise) and also yield some accuracy gains in additive noise environments. However, they struggle in real-world conditions, where many noise types are unseen and non-stationary. To address this issue, we introduce robust principal component analysis (RPCA) into the TVS-based speaker verification system. The resulting system, called RPCA-TVS, regards the noise spectrum as the low-rank component and the speech spectrum as the sparse component in the short-time Fourier transform (STFT) domain. The main contribution of this paper is improved robustness of speaker verification in additive noise environments, especially under non-stationary and unseen noise conditions. To evaluate performance, we designed and generated an additive noisy corpus based on the TIMIT and NUST603-2014 databases, using the NaFT tools with 12 types of noise samples drawn from NOISEX-92 and FREESOUND. Experimental results demonstrate that the proposed RPCA-TVS achieves better performance than the competing methods at various signal-to-noise ratio (SNR) levels. In particular, RPCA-TVS reduces the equal error rate (EER) by 5.12 % on average compared with the multi-condition system under additive noise conditions at SNR = 8 dB.
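The low-rank plus sparse split described in the abstract can be illustrated with a minimal sketch. It assumes the standard principal component pursuit formulation of RPCA (minimise the nuclear norm of L plus a weighted L1 norm of S, subject to M = L + S), solved by a simple alternating scheme with a fixed penalty parameter. This is not necessarily the solver or parameter setting used in the paper. The function names, the default values of `lam` and `mu`, and the synthetic matrix standing in for an STFT magnitude spectrogram are all illustrative assumptions.

```python
import numpy as np


def soft_threshold(X, tau):
    """Element-wise shrinkage: the proximal operator of the L1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def singular_value_threshold(X, tau):
    """Shrink the singular values: the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt


def rpca(M, lam=None, mu=None, tol=1e-6, max_iter=1000):
    """Split M into a low-rank part L and a sparse part S with M ~ L + S.

    Hypothetical helper: lam and mu default to commonly used heuristics,
    which are assumptions here and not values taken from the paper.
    """
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = m * n / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multipliers for the constraint M = L + S
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = singular_value_threshold(M - S + Y / mu, 1.0 / mu)
        S = soft_threshold(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S


# Synthetic stand-in for a magnitude spectrogram (frequency bins x frames):
# a rank-1 "noise" background plus a few large, sparse "speech-like" entries.
rng = np.random.default_rng(0)
n_bins, n_frames = 40, 60
noise_profile = rng.random((n_bins, 1))
noise_gain = 1.0 + 0.1 * rng.random((1, n_frames))
low_rank_true = noise_profile @ noise_gain
sparse_true = np.zeros((n_bins, n_frames))
mask = rng.random((n_bins, n_frames)) < 0.05
sparse_true[mask] = 5.0 * rng.random(int(mask.sum()))
M = low_rank_true + sparse_true

L_hat, S_hat = rpca(M)
```

In a speaker verification front end, `M` would be the magnitude spectrogram of a noisy utterance, and `S_hat` would play the role of the speech estimate passed on to feature extraction. How the paper reconstructs the signal from the sparse part is not stated in the abstract, so that step is left out here.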
This work is supported by the National Science Foundation of China (Grant no. 61473154).
Notes
1. Available from http://www.freesound.com.
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, M., Zhang, E., Tang, Z. (2016). Robust Principal Component Analysis Based Speaker Verification Under Additive Noise Conditions. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_49
DOI: https://doi.org/10.1007/978-981-10-3005-5_49
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3004-8
Online ISBN: 978-981-10-3005-5
eBook Packages: Computer Science, Computer Science (R0)