DOI: 10.5555/3324320.3324339

ARASID: Artificial Reverberation-Adjusted Indoor Speaker Identification Dealing with Variable Distances

Published: 15 March 2019

Abstract

Indoor speaker identification systems have been researched for a long time and are widely used in acoustic monitoring systems for human interaction. Much prior work has focused on improving accuracy under realistic conditions, including noise and varying distances from the microphone. However, these works either demand significant extra effort, such as measuring room types and dimensions or collecting many samples per speaker, or require expensive hardware, such as microphone arrays, and complex deployment settings. In this paper, we introduce a complete speaker identification solution that uses an artificial reverberation generator with different parameters to adjust the original close-distance speech samples, so that each speaker has several distinct artificial voice samples. Samples recorded in different environments are not required, because the artificial samples closely approximate them. Two kinds of models, GMM-UBM and i-vector, are evaluated. The models are trained separately on all samples, and a test utterance is scored against all of them in parallel. A score-fusion approach with two thresholds, a minimum value and a minimum difference, is applied to the scores to produce the final result. Several standard acoustic pre-processing routines, including a voice activity detection algorithm and an overlapped-speech remover, are also included to make the system fully deployable. Finally, to assess the improvement from the reverberation adjustment, we evaluate our system on two published speech databases, one with 251 speakers and the other with four kinds of emotions, and we also perform an in-lab speaking experiment. The evaluation results show that our system achieves more than 90% accuracy in identifying speakers within 6 meters when the emotion is neutral, and a 10% improvement over no reverberation adjustment when speakers express non-neutral emotions.
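To make the two central mechanisms concrete, here are two minimal Python sketches. They are illustrations under stated assumptions, not the authors' implementation: the function names, RT60 settings, and threshold values are hypothetical. The first sketch approximates the reverberation adjustment by convolving a close-distance recording with synthetic room impulse responses of different decay times, yielding one artificial enrollment variant per setting.

```python
# Illustrative sketch (not the paper's generator): derive artificially
# reverberated variants of a close-distance recording by convolving it with
# synthetic room impulse responses of different reverberation times (RT60).
import numpy as np
from scipy.signal import fftconvolve

def synthetic_rir(rt60, fs=16000, length_s=0.5):
    """Exponentially decaying noise tail as a crude room impulse response;
    the envelope falls by ~60 dB at t = rt60 seconds."""
    t = np.arange(int(length_s * fs)) / fs
    rir = np.random.randn(t.size) * np.exp(-6.9 * t / rt60)
    rir[0] = 1.0                          # direct-path component
    return rir / np.max(np.abs(rir))

def reverberation_variants(speech, fs=16000, rt60s=(0.2, 0.4, 0.6, 0.8)):
    """One reverberation-adjusted copy of `speech` per RT60 setting, standing
    in for recordings made at different distances and in different rooms."""
    return [fftconvolve(speech, synthetic_rir(r, fs))[:len(speech)]
            for r in rt60s]
```

The second sketch renders the two-threshold score fusion as a simple decision rule over the per-speaker scores produced by the models run in parallel: the top speaker is accepted only if its score clears a minimum value and beats the runner-up by a minimum difference.

```python
# Illustrative two-threshold fusion rule; min_score and min_margin are
# placeholder values, not the paper's tuned thresholds.
def identify(scores, min_score=0.5, min_margin=0.1):
    """`scores` maps speaker IDs to fused scores (e.g. the best score among a
    speaker's reverberation-adjusted models). Returns an ID or None."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_id, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else float("-inf")
    if best < min_score or best - runner_up < min_margin:
        return None                       # too weak or too ambiguous
    return best_id

print(identify({"spk1": 0.82, "spk2": 0.55}))  # -> 'spk1'
print(identify({"spk1": 0.58, "spk2": 0.55}))  # -> None (margin too small)
```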


Cited By

  • (2021) Emotion Recognition Robust to Indoor Environmental Distortions and Non-targeted Emotions Using Out-of-distribution Detection. ACM Transactions on Computing for Healthcare, 3(2):1–22. https://doi.org/10.1145/3492300. Online publication date: 20 December 2021.

Published In

EWSN '19: Proceedings of the 2019 International Conference on Embedded Wireless Systems and Networks
February 2019, 436 pages
ISBN: 9780994988638

Sponsors

  • EWSN: International Conference on Embedded Wireless Systems and Networks

Publisher

Junction Publishing, United States

Publication History

Published: 15 March 2019

Author Tags

  1. distance
  2. reverberation
  3. speaker identification

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate: 81 of 195 submissions, 42%
