Abstract
Although speaker recognition technology has advanced through several new stages in recent years, GMM-UBM (Gaussian Mixture Model-Universal Background Model) remains the base module for more recently developed methods such as SVM, JFA and i-vector. Because of its simplicity, flexibility and robustness, GMM-UBM is also widely used as a benchmark system in research. Traditional UBM construction requires speech data from many speakers other than the target speakers, which makes data collection costly. In this paper, we make a preliminary exploration of a new approach to UBM training, named self-contained UBM, in which only the target speakers’ training data are used. We study several speaker selection strategies for self-contained UBM construction, with the number of selected speakers gradually reduced from 50 to 3. Experiments on MASC@CCNT show that our self-contained UBM achieves a recognition rate comparable to that of the traditional UBM while needing far less training data and therefore less training time. Furthermore, we find that the good three-speaker (ternary) UBM sets obtained have an interesting characteristic: after dimension reduction of the MFCC features with PCA, the three speakers span a triangle (the UBM speaker triangle).
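The idea of a self-contained UBM can be illustrated with a minimal sketch: pool the training frames of a small subset of the target speakers and fit one GMM to the pooled data. Everything below is an assumption made for illustration and is not taken from the paper: the function names `train_diag_gmm` and `build_self_contained_ubm`, the diagonal covariances, the plain EM training, the component count, and the synthetic 13-dimensional features standing in for MFCCs. The paper's actual speaker selection strategies are not reproduced here.

```python
import numpy as np

def train_diag_gmm(frames, n_components=4, n_iter=20, seed=0):
    """Fit a diagonal-covariance GMM to frames (N x D) with plain EM."""
    rng = np.random.default_rng(seed)
    n, _ = frames.shape
    # Initialise means from random frames; start from the global variance.
    means = frames[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(frames.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: log(weight * Gaussian density) per frame and component.
        diff = frames[:, None, :] - means[None, :, :]
        log_p = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
        log_p += np.log(weights)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)  # responsibilities, rows sum to 1
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ frames) / nk[:, None]
        second_moment = (resp.T @ frames ** 2) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-6)
    return weights, means, variances

def build_self_contained_ubm(features_by_speaker, ubm_speakers, **gmm_kwargs):
    """Train the UBM only on frames of the chosen target speakers."""
    pooled = np.vstack([features_by_speaker[s] for s in ubm_speakers])
    return train_diag_gmm(pooled, **gmm_kwargs)

# Synthetic stand-in for per-speaker MFCC frames (hypothetical data).
rng = np.random.default_rng(1)
features_by_speaker = {
    f"spk{i}": rng.normal(loc=i, scale=1.0, size=(200, 13)) for i in range(5)
}
# A three-speaker UBM set, chosen arbitrarily for this sketch.
ubm_weights, ubm_means, ubm_vars = build_self_contained_ubm(
    features_by_speaker, ["spk0", "spk2", "spk4"], n_components=4
)
```

In a full GMM-UBM system, each target speaker model would then typically be derived from this UBM by MAP adaptation; that step is omitted here.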
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Yang, Y., Sun, Y. (2015). Preliminary Study on Self-contained UBM Construction for Speaker Recognition. In: Yang, J., Yang, J., Sun, Z., Shan, S., Zheng, W., Feng, J. (eds) Biometric Recognition. CCBR 2015. Lecture Notes in Computer Science, vol 9428. Springer, Cham. https://doi.org/10.1007/978-3-319-25417-3_56
DOI: https://doi.org/10.1007/978-3-319-25417-3_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25416-6
Online ISBN: 978-3-319-25417-3
eBook Packages: Computer Science, Computer Science (R0)