Discussion on Score Normalization and Language Robustness in Text-Independent Multi-language Speaker Verification

Jian Zhao², Yuan Dong¹,², Xianyu Zhao¹, Hao Yang², Liang Lu² & Haila Wang¹

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4681)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

In speaker recognition, score normalization is a widely used and effective technique for improving recognition performance, and it continues to be developed. This paper compares several candidate score normalization methods and presents a new implementation of speaker adaptive test normalization (ATnorm) based on a cross-similarity measurement, which does not need an extra corpus for speaker-adaptive impostor cohort selection. The use of ATnorm to improve language robustness in multi-language speaker verification is also investigated. Experiments are conducted on the core task of the 2006 NIST Speaker Recognition Evaluation (SRE) corpus. The results indicate that all the score normalization methods considered improve recognition performance and that ATnorm performs best. Moreover, ATnorm further improves performance when used as a means of achieving language robustness.
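
As a rough illustration of the idea behind adaptive test normalization, the sketch below standardizes a raw verification score against the scores that the same test utterance obtains from a cohort of impostor models, where the cohort is the k impostor models most similar to the target speaker. This is not the authors' implementation: the abstract does not define the cross-similarity measurement, so the similarity values are taken here as a precomputed, hypothetical input, and the function names, the use of the population standard deviation, and the example numbers are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def tnorm(raw_score, cohort_scores):
    """Test normalization: standardize the raw score using the scores the
    same test utterance obtains against a set of impostor cohort models."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # assumption: population standard deviation
    if sigma == 0.0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mu) / sigma


def select_adaptive_cohort(target_similarities, k):
    """Return the ids of the k impostor models most similar to the target.

    target_similarities maps impostor id -> similarity to the target model
    (higher means closer). How the similarity is computed is left open here.
    """
    ranked = sorted(target_similarities, key=target_similarities.get, reverse=True)
    return ranked[:k]


def atnorm(raw_score, test_scores_by_impostor, target_similarities, k):
    """Adaptive Tnorm: apply tnorm using only the speaker-adaptive cohort."""
    cohort = select_adaptive_cohort(target_similarities, k)
    return tnorm(raw_score, [test_scores_by_impostor[i] for i in cohort])


# Hypothetical numbers, purely for illustration.
similarities = {"a": 0.9, "b": 0.1, "c": 0.8, "d": 0.2}
test_scores = {"a": 1.0, "b": -5.0, "c": 3.0, "d": -4.0}
normalized = atnorm(4.0, test_scores, similarities, k=2)
```

With these made-up values the cohort consists of models "a" and "c", so the raw score is standardized against the two impostors closest to the target rather than against the whole impostor set.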

Author information

Authors and Affiliations

France Telecom Research & Development Center, Beijing, 100080, China
Yuan Dong, Xianyu Zhao & Haila Wang
Beijing University of Posts and Telecommunications, Beijing, 100876, China
Jian Zhao, Yuan Dong, Hao Yang & Liang Lu

Editor information

De-Shuang Huang, Laurent Heutte, Marco Loog

About this paper

Cite this paper

Zhao, J., Dong, Y., Zhao, X., Yang, H., Lu, L., Wang, H. (2007). Discussion on Score Normalization and Language Robustness in Text-Independent Multi-language Speaker Verification. In: Huang, DS., Heutte, L., Loog, M. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Theoretical and Methodological Issues. ICIC 2007. Lecture Notes in Computer Science, vol 4681. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74171-8_114

DOI: https://doi.org/10.1007/978-3-540-74171-8_114
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74170-1
Online ISBN: 978-3-540-74171-8
eBook Packages: Computer Science, Computer Science (R0)
