Abstract
This paper proposes a framework for optimizing the feature set in an HMM-based, text-dependent speaker verification system in which the alignment task is distinguished from the scoring task. The optimization searches, among a set of potential features, for the feature subset that gives the minimal experimental Equal Error Rate. We have studied and compared various heuristics for finding the optimal subset. We have also extended this optimization principle to the search for an optimal weighting of the different axes of the acoustic space; this weighting was found using a genetic algorithm.
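As an illustration of the subset search, the sketch below pairs an empirical Equal Error Rate estimate with greedy forward selection. Everything in it is an assumption made for illustration: the abstract does not say which heuristics were compared, so forward selection stands in as one common choice, and the toy distance-based score replaces the HMM scoring of the actual system. The names `equal_error_rate`, `forward_selection`, `make_evaluator`, and `demo` are hypothetical.

```python
import numpy as np


def equal_error_rate(client_scores, impostor_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold. The function
    returns the mean of the false acceptance and false rejection rates at
    the threshold where the two rates are closest.
    """
    client = np.asarray(client_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(np.concatenate([client, impostor])):
        frr = np.mean(client < t)      # true speakers rejected
        far = np.mean(impostor >= t)   # impostors accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)


def forward_selection(n_features, eer_of_subset):
    """Greedy search: add the feature that lowers the EER most, stop when none does."""
    selected, best_eer = [], float("inf")
    remaining = set(range(n_features))
    while remaining:
        candidate, candidate_eer = None, best_eer
        for f in sorted(remaining):
            e = eer_of_subset(selected + [f])
            if e < candidate_eer:
                candidate, candidate_eer = f, e
        if candidate is None:          # no remaining feature improves the EER
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_eer = candidate_eer
    return selected, best_eer


def make_evaluator(client_diff, impostor_diff):
    """Toy scoring (assumption): score = -squared distance on the chosen axes."""
    def eer_of_subset(subset):
        idx = list(subset)
        client_scores = -np.sum(client_diff[:, idx] ** 2, axis=1)
        impostor_scores = -np.sum(impostor_diff[:, idx] ** 2, axis=1)
        return equal_error_rate(client_scores, impostor_scores)
    return eer_of_subset


def demo(seed=0, n_trials=200, n_features=6):
    """Synthetic data: only axes 0 and 1 separate clients from impostors."""
    rng = np.random.default_rng(seed)
    client_diff = rng.normal(0.0, 1.0, size=(n_trials, n_features))
    impostor_diff = rng.normal(0.0, 1.0, size=(n_trials, n_features))
    impostor_diff[:, :2] += 3.0
    return forward_selection(n_features, make_evaluator(client_diff, impostor_diff))


if __name__ == "__main__":
    subset, eer = demo()
    print("selected features:", subset, "EER:", eer)
```

The search is written against a callback, `eer_of_subset`, so the same loop could wrap any verification back end that returns an experimental EER for a candidate subset.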
The proposed framework was applied to cepstral coefficients and their first and second derivatives in order to find a speaker- and text-independent optimal feature set. Experiments were conducted on a large-scale telephone database. The results indicate that selecting an appropriate feature set, or an appropriate weighting of the features, can significantly improve verification performance, especially when little training data is available. In practice, high-order cepstral coefficients and the first derivatives of all cepstral coefficients were found to be the most useful for speaker verification.
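The weighting variant can be sketched in the same spirit: a small real-coded genetic algorithm that searches for non-negative per-axis weights minimizing an empirical EER. This is a generic illustration under stated assumptions, not the authors' configuration. The operators (tournament selection, uniform crossover, Gaussian mutation, elitism), the toy weighted-distance score, and the names `weighted_eer` and `genetic_weight_search` are all hypothetical choices.

```python
import numpy as np


def equal_error_rate(client_scores, impostor_scores):
    """Approximate EER; a trial is accepted when its score is >= the threshold."""
    client = np.asarray(client_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(np.concatenate([client, impostor])):
        frr = np.mean(client < t)      # true speakers rejected
        far = np.mean(impostor >= t)   # impostors accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return float(eer)


def weighted_eer(weights, client_diff, impostor_diff):
    """Toy scoring (assumption): score = -weighted squared distance per trial."""
    client_scores = -((client_diff ** 2) @ weights)
    impostor_scores = -((impostor_diff ** 2) @ weights)
    return equal_error_rate(client_scores, impostor_scores)


def genetic_weight_search(client_diff, impostor_diff, pop_size=20,
                          generations=15, mutation_scale=0.2, seed=0):
    """Minimize the EER over non-negative axis weights with a simple GA."""
    rng = np.random.default_rng(seed)
    dim = client_diff.shape[1]
    population = rng.uniform(0.0, 1.0, size=(pop_size, dim))
    population[0] = 1.0                # include the unweighted baseline

    def fitness(w):
        return weighted_eer(w, client_diff, impostor_diff)

    def tournament(scores):
        # Pick two individuals at random and keep the one with the lower EER.
        a, b = rng.integers(0, pop_size, size=2)
        return population[a] if scores[a] <= scores[b] else population[b]

    scores = np.array([fitness(w) for w in population])
    for _ in range(generations):
        # Elitism: the best individual survives unchanged.
        children = [population[int(np.argmin(scores))].copy()]
        while len(children) < pop_size:
            p1, p2 = tournament(scores), tournament(scores)
            mask = rng.random(dim) < 0.5                  # uniform crossover
            child = np.where(mask, p1, p2)
            child = child + rng.normal(0.0, mutation_scale, size=dim)
            children.append(np.clip(child, 0.0, None))    # keep weights >= 0
        population = np.array(children)
        scores = np.array([fitness(w) for w in population])
    best = int(np.argmin(scores))
    return population[best], float(scores[best])


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    client = rng.normal(0.0, 1.0, size=(100, 4))
    impostor = rng.normal(0.0, 1.0, size=(100, 4))
    impostor[:, :2] += 2.0             # only axes 0 and 1 are informative
    weights, eer = genetic_weight_search(client, impostor)
    print("weights:", weights, "EER:", eer)
```

Because the all-ones vector is placed in the initial population and the best individual is always carried over, the returned EER on the tuning data is never worse than the unweighted baseline. That guarantee holds only on the data used for the search; a held-out set would be needed to judge generalization.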
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Charlet, D., Jouvet, D. (1997). Optimizing feature set for speaker verification. In: Bigün, J., Chollet, G., Borgefors, G. (eds) Audio- and Video-based Biometric Person Authentication. AVBPA 1997. Lecture Notes in Computer Science, vol 1206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0015997
DOI: https://doi.org/10.1007/BFb0015997
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62660-2
Online ISBN: 978-3-540-68425-1
eBook Packages: Springer Book Archive