Abstract
This work proposes a new phoneme recognition system. Its decisions are based on tongue position and lip roundedness. The speech features are Wavelet Packet Transform coefficients, with sub-bands selected according to the Mel scale. A Support Vector Machine (SVM) is used as the classifier within a Hierarchical Committee Machine structure. The database used for recognition was a set of oral vowel phonemes of the Portuguese language. Experimental results show success rates of 97.50% for the user-dependent case and 91.01% for the user-independent case. The proposal increased the success rate by 3.5% relative to the “one vs. all” decision strategy.
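To make the idea of Mel-based sub-band selection concrete, the following is a minimal sketch and not the authors' implementation. It assumes the widely used approximation mel = 2595 · log10(1 + f/700); the abstract does not state which Mel formula the paper uses. The function names, the 4000 Hz upper limit, and the band count of 8 are hypothetical choices for illustration only. The sketch computes band edges that are equally spaced on the Mel scale, which is the kind of target a wavelet packet tree would then approximate with its sub-bands.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """Convert frequency in Hz to Mel (assumed common approximation)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_max_hz: float, n_bands: int) -> List[float]:
    """Return n_bands + 1 edges in Hz, equally spaced on the Mel scale."""
    mel_max = hz_to_mel(f_max_hz)
    return [mel_to_hz(mel_max * i / n_bands) for i in range(n_bands + 1)]


if __name__ == "__main__":
    # Hypothetical setup: 8 bands up to 4000 Hz (8 kHz sampling rate).
    edges = mel_band_edges(4000.0, 8)
    for low, high in zip(edges, edges[1:]):
        print(f"{low:8.1f} Hz to {high:8.1f} Hz (width {high - low:7.1f} Hz)")
```

The bands are narrow at low frequencies and widen toward high frequencies, so a wavelet packet decomposition that follows them splits the low-frequency region more deeply than the high-frequency region.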
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de A. Bresolin, A., Neto, A.D.D., Alsina, P.J. (2006). A New Hierarchical Decision Structure Using Wavelet Packet and SVM for Brazilian Phonemes Recognition. In: King, I., Wang, J., Chan, LW., Wang, D. (eds) Neural Information Processing. ICONIP 2006. Lecture Notes in Computer Science, vol 4233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11893257_18
DOI: https://doi.org/10.1007/11893257_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46481-5
Online ISBN: 978-3-540-46482-2
eBook Packages: Computer Science, Computer Science (R0)