Abstract
This paper investigates speech emotion recognition systems and how their accuracy can be improved by exploring machine learning algorithms and hybrid solutions. The accuracy of machine learning algorithms and speech noise reduction techniques is investigated on an embedded system. Research suggests that improvements could be made to feature selection from speech signals and to the pattern recognition algorithms used for emotion recognition. The system deployed to perform the experiments is EmotionPi, using the Raspberry Pi 3 B+. Pattern recognition is investigated using the K-Nearest Neighbour (K-NN), Support Vector Machine (SVM), Random Forest Classifier (RFC), Multi-Layer Perceptron (MLP) and Convolutional Neural Network (CNN) algorithms. Experiments are conducted to determine the accuracy of the speech emotion system using a speech database and our own recorded dataset. We propose a hybrid solution that increases the accuracy of the emotion recognition results. The test results show that the system needs to be trained on real cases rather than on speech databases, as this makes it more accurate in detecting the user's emotion.
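As a rough illustration of what measuring classifier accuracy involves, the sketch below implements a tiny K-NN classifier in plain Python and scores it on held-out samples. It is not the authors' EmotionPi pipeline: the two-dimensional feature vectors, the labels "angry" and "calm", and the function names `knn_predict` and `accuracy` are all assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by Euclidean distance to the query vector.
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


# Hypothetical toy features (for instance normalised pitch and energy).
# These values are invented for illustration and are not from the paper.
train_x = [(0.9, 0.8), (0.8, 0.9), (0.85, 0.75), (0.2, 0.1), (0.1, 0.2), (0.15, 0.25)]
train_y = ["angry", "angry", "angry", "calm", "calm", "calm"]
test_x = [(0.95, 0.9), (0.05, 0.1), (0.8, 0.7), (0.3, 0.2)]
test_y = ["angry", "calm", "angry", "calm"]

print(accuracy(train_x, train_y, test_x, test_y, k=3))
```

In a real experiment the toy tuples would be replaced by features extracted from recorded speech, and the same accuracy measure could be computed for each of the other classifiers to compare them.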
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Deusi, J.S., Popa, E.I. (2019). An Investigation of the Accuracy of Real Time Speech Emotion Recognition. In: Bramer, M., Petridis, M. (eds) Artificial Intelligence XXXVI. SGAI 2019. Lecture Notes in Computer Science, vol 11927. Springer, Cham. https://doi.org/10.1007/978-3-030-34885-4_26
DOI: https://doi.org/10.1007/978-3-030-34885-4_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34884-7
Online ISBN: 978-3-030-34885-4
eBook Packages: Computer Science (R0)