Robust Principal Component Analysis Based Speaker Verification Under Additive Noise Conditions

Minghe Wang, Erhua Zhang & Zhenmin Tang

Part of the book series: Communications in Computer and Information Science (CCIS, volume 663)

Included in the following conference series:

Chinese Conference on Pattern Recognition

Abstract

Previous research shows that approaches based on the total variability space (TVS) followed by Gaussian probabilistic linear discriminant analysis (GPLDA) deal effectively with convolutional noise (such as channel noise) and also bring some accuracy gains in additive noise environments. However, they struggle when many of the noise types encountered in the real world are unseen and non-stationary. To address this issue, we introduce robust principal component analysis (RPCA) into the TVS-based speaker verification system. The resulting system, called RPCA-TVS, treats the noise spectrum as the low-rank component and the speech spectrum as the sparse component in the short-time Fourier transform (STFT) domain. The main contribution of this paper is improved robustness of speaker verification in additive noise environments, especially under non-stationary and unseen noise conditions. To evaluate performance, we designed and generated an additive noisy corpus based on the TIMIT and NUST603-2014 databases, using the NaFT tools with 12 types of noise samples drawn from NOISEX-92 and FREESOUND. Experimental results demonstrate that the proposed RPCA-TVS achieves better performance than the competing methods at various signal-to-noise ratio (SNR) levels. In particular, RPCA-TVS reduces the equal error rate (EER) by 5.12% on average compared with the multi-condition system under additive noise conditions at SNR = 8 dB.
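The low-rank plus sparse split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the decomposition is solved as principal component pursuit with an inexact augmented Lagrange multiplier scheme, and the function names (`rpca`, `soft_threshold`, `svd_threshold`), the step parameters, and the synthetic test matrix are all hypothetical choices made for illustration. In the speaker verification setting, the input matrix would be a magnitude spectrogram, with the low-rank part read as noise and the sparse part as speech.

```python
import numpy as np


def soft_threshold(x, tau):
    """Element-wise shrinkage toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau (nuclear-norm proximal step)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def rpca(m, lam=None, tol=1e-7, max_iter=500):
    """Split m into low-rank + sparse parts (assumed inexact ALM solver)."""
    rows, cols = m.shape
    if lam is None:
        # Common default weight for the sparse term.
        lam = 1.0 / np.sqrt(max(rows, cols))
    norm_m = np.linalg.norm(m)        # Frobenius norm
    spectral = np.linalg.norm(m, 2)   # largest singular value
    mu = 1.25 / spectral              # assumed initial penalty
    mu_max = mu * 1e7
    rho = 1.5                         # assumed penalty growth factor
    y = m / max(spectral, np.abs(m).max() / lam)  # dual variable
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    for _ in range(max_iter):
        low = svd_threshold(m - sparse + y / mu, 1.0 / mu)
        sparse = soft_threshold(m - low + y / mu, lam / mu)
        resid = m - low - sparse
        y = y + mu * resid
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(resid) <= tol * norm_m:
            break
    return low, sparse


# Synthetic stand-in for a spectrogram: rank-2 "noise" plus sparse "speech".
rng = np.random.default_rng(0)
low_true = rng.standard_normal((80, 2)) @ rng.standard_normal((2, 80))
mask = rng.random((80, 80)) < 0.05
sparse_true = mask * rng.uniform(-10.0, 10.0, size=(80, 80))
noisy = low_true + sparse_true

low_hat, sparse_hat = rpca(noisy)
rel_err = np.linalg.norm(low_hat - low_true) / np.linalg.norm(low_true)
print(f"relative error of recovered low-rank part: {rel_err:.2e}")
```

With a real recording, `noisy` would be replaced by the STFT magnitude of the noisy utterance, and the sparse estimate would be recombined with the noisy phase before feature extraction; those steps are outside this sketch.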

This work is supported by the National Science Foundation of China (Grant no. 61473154).

Notes

1. Available from http://www.freesound.com.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
Minghe Wang, Erhua Zhang & Zhenmin Tang

Corresponding author

Correspondence to Minghe Wang.

Editor information

Editors and Affiliations

Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China
Xuelong Li
Chinese Academy of Sciences, Institute of Computing Technology, Beijing, China
Xilin Chen
Tsinghua University, Beijing, China
Jie Zhou
Nanjing University of Science and Technology, Nanjing, China
Jian Yang
University of Electronic Science and Technology, Chengdu, Sichuan, China
Hong Cheng

Cite this paper

Wang, M., Zhang, E., Tang, Z. (2016). Robust Principal Component Analysis Based Speaker Verification Under Additive Noise Conditions. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_49

DOI: https://doi.org/10.1007/978-981-10-3005-5_49
Published: 22 October 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3004-8
Online ISBN: 978-981-10-3005-5
eBook Packages: Computer Science, Computer Science (R0)
