Abstract
In this paper, a process for selecting a classifier based on the properties of a dataset is designed, since it is impractical to experiment with the data on a large number of candidate classifiers. Speech emotion recognition is considered as a case study. Different combinations of spectral and prosodic features relevant to emotions are explored, and the best subset of the chosen features is recommended for each classifier based on the properties of the chosen dataset. Various statistical tests are used to estimate these properties, and the nature of the dataset then guides the selection of a relevant classifier. To make the comparison more precise, three other clustering and classification techniques, namely K-means clustering, vector quantization, and artificial neural networks, are used for experimentation, and their results are compared with those of the selected classifier. Prosodic features such as pitch, intensity, jitter, and shimmer, and spectral features such as mel-frequency cepstral coefficients (MFCCs) and formants are considered in this work. Statistical parameters of prosody, namely the minimum, maximum, mean (\(\mu\)), and standard deviation (\(\sigma\)), are extracted from speech and combined with the basic spectral (MFCC) features to obtain better performance. Five basic emotions, namely anger, fear, happiness, neutral, and sadness, are considered. To analyse the performance of different datasets on different classifiers, content- and speaker-independent emotional data collected from Telugu movies is used, and mean opinion scores from fifty users are collected to label this data. To generalize the conclusions, the benchmark IIT Kharagpur emotional speech database is also used.
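As a rough illustration of the feature set described above, the sketch below builds one utterance-level vector from the minimum, maximum, mean, and standard deviation of pitch and intensity, concatenated with mean MFCCs. This is a minimal sketch in Python assuming librosa is available; the paper does not publish its extraction code, the frame and pitch-range settings here are placeholders, and jitter and shimmer (typically computed with a tool such as Praat) are omitted.

```python
# Illustrative sketch only; settings are assumptions, not values from the paper.
import numpy as np
import librosa

def emotion_feature_vector(wav_path, sr=16000, n_mfcc=13):
    """Combine prosodic statistics with mean MFCCs into one feature vector."""
    y, sr = librosa.load(wav_path, sr=sr)

    # Prosody: fundamental frequency (pitch) track and frame-level intensity.
    f0, voiced_flag, _ = librosa.pyin(y, fmin=50, fmax=400, sr=sr)
    f0 = f0[~np.isnan(f0)]                 # keep voiced frames only
    intensity = librosa.feature.rms(y=y)[0]

    # Statistical parameters of prosody: min, max, mean, standard deviation.
    def stats(x):
        return [x.min(), x.max(), x.mean(), x.std()]

    prosodic = stats(f0) + stats(intensity)

    # Spectral part: MFCCs averaged over all frames.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

    return np.array(prosodic + mfcc.tolist())
```

Stacking such vectors for all utterances gives the feature matrix on which dataset properties can be estimated.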
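The statistical tests used to estimate dataset properties can likewise be pictured as below. This is a hedged sketch: the exact battery of tests is described in the paper body rather than the abstract, so the Shapiro-Wilk test and a Lilliefors-style Kolmogorov-Smirnov check (via scipy.stats, with parameters estimated from the sample) stand in here as representative normality tests over a feature matrix.

```python
# Representative normality checks on a feature matrix X (rows = utterances).
import numpy as np
from scipy import stats

def normality_report(X, alpha=0.05):
    """Test each feature dimension for normality and report the fraction of
    dimensions that look Gaussian, one cue for choosing a classifier."""
    gaussian = 0
    for j in range(X.shape[1]):
        col = X[:, j]
        _, p_sw = stats.shapiro(col)
        # KS test against a normal fitted to the sample (Lilliefors-style;
        # the true Lilliefors test uses adjusted critical values).
        _, p_ks = stats.kstest(col, 'norm', args=(col.mean(), col.std(ddof=1)))
        if min(p_sw, p_ks) > alpha:
            gaussian += 1
    return gaussian / X.shape[1]

X = np.random.randn(200, 21)   # toy feature matrix for illustration
print(f"{normality_report(X):.0%} of feature dimensions pass normality tests")
```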