Ideas for Clustering of Similar Models of a Speaker in an Online Speaker Diarization System

Marie Kunešová & Vlasta Radová

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9302)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

During online speaker diarization, a single speaker may come to be represented by several different models. This situation degrades diarization results, because the system treats every change of model as a change of speaker. In this article we describe a method for detecting the situation and propose several ways of resolving it. Experiments show that the most suitable option is to treat multiple GMMs as belonging to a single speaker, i.e. to update all of them with the same data every time one of them is assigned a new segment. In that case, the Diarization Error Rate improved by 30.69% relative to the baseline system.
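The preferred option in the abstract (several models treated as one speaker, all updated with the same data) can be illustrated with a minimal sketch. This is not the authors' implementation: `ToyModel`, `SpeakerGroups`, `link`, and `assign_segment` are hypothetical names, the running mean is only a stand-in for real GMM adaptation, and the step that detects that two models belong to the same speaker is assumed to exist elsewhere and to call `link`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Stand-in for a speaker GMM (assumption): a running mean of 1-D features."""
    mean: float = 0.0
    count: int = 0

    def update(self, frames):
        # Incremental mean update; a real system would adapt GMM parameters here.
        for x in frames:
            self.count += 1
            self.mean += (x - self.mean) / self.count


@dataclass
class SpeakerGroups:
    """Maps model ids to speaker ids so several models can share one speaker."""
    models: dict = field(default_factory=dict)      # model id -> ToyModel
    speaker_of: dict = field(default_factory=dict)  # model id -> speaker id

    def add_model(self, model_id):
        self.models[model_id] = ToyModel()
        self.speaker_of[model_id] = model_id  # each model starts as its own speaker

    def link(self, a, b):
        """Record that models a and b represent the same speaker."""
        old, new = self.speaker_of[b], self.speaker_of[a]
        for m, s in self.speaker_of.items():
            if s == old:
                self.speaker_of[m] = new

    def assign_segment(self, model_id, frames):
        """Update every model of the winning model's speaker with the same data."""
        speaker = self.speaker_of[model_id]
        for m, s in self.speaker_of.items():
            if s == speaker:
                self.models[m].update(frames)
        return speaker  # the diarization label is the speaker, not the model


groups = SpeakerGroups()
for name in ("m0", "m1", "m2"):
    groups.add_model(name)
groups.link("m0", "m1")  # hypothetical detector decided m0 and m1 are one speaker
label = groups.assign_segment("m1", [1.0, 3.0])
```

Because the output label is the speaker id rather than the model id, a switch between `m0` and `m1` no longer counts as a speaker change, and both models stay in step since they receive the same segment.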

Author information

Authors and Affiliations

Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia, Univerzitní 8, 306 14, Plzeň, Czech Republic
Marie Kunešová & Vlasta Radová
Faculty of Applied Sciences, New Technologies for the Information Society, University of West Bohemia, Univerzitní 8, 306 14, Plzeň, Czech Republic
Marie Kunešová & Vlasta Radová

Corresponding author

Correspondence to Marie Kunešová.

Editor information

Editors and Affiliations

University of West Bohemia, Pilsen, Czech Republic
Pavel Král
University of West Bohemia, Pilsen, Czech Republic
Václav Matoušek

About this paper

Cite this paper

Kunešová, M., Radová, V. (2015). Ideas for Clustering of Similar Models of a Speaker in an Online Speaker Diarization System. In: Král, P., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2015. Lecture Notes in Computer Science, vol 9302. Springer, Cham. https://doi.org/10.1007/978-3-319-24033-6_26

DOI: https://doi.org/10.1007/978-3-319-24033-6_26
Published: 11 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24032-9
Online ISBN: 978-3-319-24033-6
eBook Packages: Computer Science, Computer Science (R0)
