Abstract
In speaker recognition, score normalization is a widely used and effective technique for improving recognition performance, and it continues to develop. This paper compares several candidate score normalization methods and presents a new implementation of speaker-adaptive test normalization (ATnorm) based on a cross-similarity measurement, which does not need an extra corpus for speaker-adaptive impostor cohort selection. The use of ATnorm to improve language robustness in multi-language speaker verification is also investigated. Experiments are conducted on the core task of the 2006 NIST Speaker Recognition Evaluation (SRE) corpus. The experimental results indicate that all of the score normalization methods considered improve recognition performance, with ATnorm performing best. Moreover, ATnorm yields a further performance gain when used as a means of language robustness.
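As a rough illustration of the idea behind test normalization with a speaker-adaptive cohort, the sketch below standardizes a raw verification score against the scores of a test utterance on impostor cohort models, after keeping only the cohort models ranked closest to the target speaker. It is a minimal sketch, not the implementation evaluated in the paper: the function names, the parameter `k`, and the precomputed `similarity_to_target` list are assumptions made for illustration, and the cross-similarity measurement itself is not reproduced here.

```python
from statistics import mean, pstdev


def tnorm(raw_score, cohort_scores):
    """Standardize a raw score with the mean and standard deviation of the
    scores the same test utterance obtains against impostor cohort models."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mu) / sigma


def select_adaptive_cohort(similarity_to_target, k):
    """Return the indices of the k cohort models most similar to the target.

    similarity_to_target is a hypothetical, precomputed list with one
    similarity value per cohort model (higher means closer to the target).
    """
    ranked = sorted(
        range(len(similarity_to_target)),
        key=lambda i: similarity_to_target[i],
        reverse=True,
    )
    return ranked[:k]


def adaptive_tnorm(raw_score, cohort_scores, similarity_to_target, k):
    """Apply test normalization using only the speaker-adaptive cohort subset."""
    chosen = select_adaptive_cohort(similarity_to_target, k)
    return tnorm(raw_score, [cohort_scores[i] for i in chosen])


if __name__ == "__main__":
    # Toy numbers, invented purely to exercise the functions.
    cohort_scores = [10.0, -1.0, 5.0, 1.0]
    similarity_to_target = [0.1, 0.9, 0.5, 0.7]
    print(adaptive_tnorm(3.0, cohort_scores, similarity_to_target, k=2))
```

In this sketch the cohort is restricted before the statistics are computed, so the normalization parameters reflect impostor models that resemble the target speaker instead of the whole cohort pool.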
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Zhao, J., Dong, Y., Zhao, X., Yang, H., Lu, L., Wang, H. (2007). Discussion on Score Normalization and Language Robustness in Text-Independent Multi-language Speaker Verification. In: Huang, DS., Heutte, L., Loog, M. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues. ICIC 2007. Lecture Notes in Computer Science, vol 4681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74171-8_114
DOI: https://doi.org/10.1007/978-3-540-74171-8_114
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74170-1
Online ISBN: 978-3-540-74171-8
eBook Packages: Computer Science (R0)