Abstract
This paper proposes applying measures based on nonlinear dynamics to characterize emotional speech. Measures such as mutual information, correlation dimension, correlation entropy, Shannon entropy, Lempel–Ziv complexity and the Hurst exponent are extracted from the samples of an emotional speech database. Summary statistics (mean, standard deviation, skewness and kurtosis) are then computed over the extracted measures. Experiments were conducted on the Berlin emotional speech database for a three-class problem (neutral, fear and anger). Feature selection is performed, and a methodology is proposed to find the best features. To evaluate the discriminative ability of the selected features, a neural network classifier is used. The global success rate is 93.78 ± 3.18 %.
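The feature construction described in the abstract (a measure extracted from the speech signal, then summarized by mean, standard deviation, skewness and kurtosis) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the short-time framing, the frame length and hop, the histogram-based entropy estimator, the number of bins, and all function names are hypothetical choices that the abstract does not specify. Only Shannon entropy is shown; the other measures would slot into the same pattern.

```python
import math
import random


def shannon_entropy(frame, n_bins=16):
    """Histogram-based Shannon entropy (in bits) of one frame of samples.

    The histogram estimator and n_bins are assumptions for illustration.
    """
    lo, hi = min(frame), max(frame)
    if hi == lo:
        return 0.0  # a constant frame has a single occupied bin
    counts = [0] * n_bins
    for x in frame:
        # Map the sample to a bin index; the maximum value goes in the last bin.
        idx = min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    n = len(frame)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames (assumed short-time analysis)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def summary_statistics(values):
    """Mean, standard deviation, skewness and (non-excess) kurtosis.

    Uses population (biased) central moments.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    std = math.sqrt(m2)
    if m2 == 0:
        # Degenerate case: all values equal, higher moments are undefined.
        return {"mean": mean, "std": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    return {
        "mean": mean,
        "std": std,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / m2 ** 2,
    }


def utterance_features(signal, frame_len=400, hop=200):
    """Per-utterance feature vector: statistics of frame-wise entropy."""
    frames = frame_signal(signal, frame_len, hop)
    entropies = [shannon_entropy(f) for f in frames]
    return summary_statistics(entropies)


if __name__ == "__main__":
    # Synthetic stand-in for a speech recording (not real data).
    rng = random.Random(0)
    fake_signal = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    print(utterance_features(fake_signal))
```

Each measure would contribute four such statistics per utterance, and the resulting vector is what a feature selection step and a classifier would then operate on.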
Acknowledgments
This work was funded by the Spanish government MCINN TEC2009-14123-C04 research project and by a research training grant from the ACIISI of the Canary Autonomous Government (Spain), co-financed at a rate of 85 % by the European Social Fund (ESF). This work was also supported by CODI at Universidad de Antioquia, project MC11-1-03.
Cite this article
Henríquez Rodríguez, P., Alonso Hernández, J.B., Ferrer Ballester, M.A. et al. Global Selection of Features for Nonlinear Dynamics Characterization of Emotional Speech. Cogn Comput 5, 517–525 (2013). https://doi.org/10.1007/s12559-012-9157-0