Speaker Detection Using Phoneme Specific Hidden Markov Models

Edvin Pakoci,
Nikša Jakovljević,
Branislav Popović,
Dragiša Mišković &
Darko Pekar

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8773)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

The paper presents a speaker detection system based on phoneme-specific hidden Markov models (HMMs) in combination with Gaussian mixture models (GMMs). Our motivation stems from the fact that a phoneme-specific HMM system can model temporal variations, makes it possible to weight the scores of specific phonemes, and allows efficient pruning. The performance of the system was evaluated on a speech database containing utterances in Serbian from 250 speakers, 10 of whom are the target speakers. The proposed model was compared to a system based on the Gaussian mixture model - universal background model (GMM-UBM) approach and showed a significant improvement in detection performance.
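The abstract does not give the scoring formula, so the following is only an illustrative sketch of the general idea of weighting per-phoneme scores, not the authors' method. It assumes that every feature frame already carries a phoneme label (for example from a forced alignment), and it simplifies the phoneme-specific HMMs to one diagonal-covariance GMM per phoneme. The function names `gmm_loglik` and `weighted_llr`, the model layout, and the weighting scheme are all hypothetical.

```python
import numpy as np


def gmm_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means and variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                      # (T, M, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (M,)
    log_comp = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)  # (T, M)
    log_comp = log_comp + np.log(weights)
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_comp.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_comp - peak).sum(axis=1, keepdims=True)))[:, 0]


def weighted_llr(frames, phoneme_ids, target, background, phoneme_weights):
    """Weighted average of per-phoneme mean log-likelihood ratios.

    target / background: dict mapping phoneme id -> (weights, means, variances).
    phoneme_weights: dict mapping phoneme id -> non-negative weight
    (assumption: a weight of zero drops that phoneme from the score).
    """
    total, weight_sum = 0.0, 0.0
    for ph in np.unique(phoneme_ids):
        ph = int(ph)
        w = phoneme_weights.get(ph, 0.0)
        if w == 0.0:
            continue  # phoneme excluded from scoring
        x = frames[phoneme_ids == ph]
        llr = gmm_loglik(x, *target[ph]) - gmm_loglik(x, *background[ph])
        total += w * llr.mean()
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0
```

In this sketch a positive score favours the target speaker and a negative score favours the background model; a detection decision would compare the score against a threshold chosen on held-out data.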

Author information

Authors and Affiliations

Faculty of Technical Sciences, University of Novi Sad, Serbia
Edvin Pakoci, Nikša Jakovljević, Branislav Popović, Dragiša Mišković & Darko Pekar

Editor information

Editors and Affiliations

Speech and Multimodal Interfaces Laboratory, St. Petersburg Institute of Informatics and Automation of the Russian Academy of Sciences, 39, 14th line, 199178, St. Petersburg, Russia
Andrey Ronzhin
Institute of Applied and Mathematical Linguistics, Moscow State Linguistic University, 38, Ostozhenka, 119034, Moscow, Russia
Rodmonga Potapova
Faculty of Technical Sciences, University of Novi Sad, 6, Trg Dositeja Obradovića, 21000, Novi Sad, Serbia
Vlado Delic

About this paper

Cite this paper

Pakoci, E., Jakovljević, N., Popović, B., Mišković, D., Pekar, D. (2014). Speaker Detection Using Phoneme Specific Hidden Markov Models. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_51

DOI: https://doi.org/10.1007/978-3-319-11581-8_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11580-1
Online ISBN: 978-3-319-11581-8
eBook Packages: Computer Science, Computer Science (R0)
