Abstract
Human emotions are readily identified through facial expressions, body movements, and gestures. Speech also carries many emotional cues, including variations in pitch, tone, intensity, and rhythm. In recent years, the increasing demand for human–computer interaction has spurred the development of speech recognition methods. Traditional speech emotion detection methods, which rely on features such as pitch, intensity, and spectral characteristics, are less effective at recognizing emotions. To address these issues, this paper proposes a novel method, the Dual Kernel Support Vector based Levy Dung Beetle (DKSV-LDB) algorithm, to accurately identify emotions such as happiness, anger, and sadness from speech patterns. The model combines a dual kernel Support Vector Machine (SVM) with the Dung Beetle Optimization algorithm, enhanced by the Levy flight strategy. Experiments were conducted on three datasets: CREMA-D, TESS, and EMO-DB (German). The proposed DKSV-LDB method is evaluated using accuracy, precision, recall, F-measure, and specificity, and the results are compared with existing methods. The DKSV-LDB method achieved accuracy, precision, recall, F-measure, and specificity of 98.57%, 97.91%, 97.86%, 97.84%, and 97.78%, respectively. The experimental results illustrate the performance of the developed DKSV-LDB technique for speech emotion identification.
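The five evaluation measures named in the abstract can all be derived from a multi-class confusion matrix. The following is a minimal sketch, not the authors' evaluation code: it assumes macro averaging over emotion classes (the abstract does not state which averaging was used), and the function names `confusion_matrix` and `macro_metrics` as well as the toy labels are hypothetical.

```python
from typing import Dict, List, Sequence


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: List[str]) -> List[List[int]]:
    """Count predictions; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def _ratio(num: float, den: float) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def macro_metrics(matrix: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, F-measure, specificity.

    Macro averaging is an assumption made for this sketch.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))
    sums = {"precision": 0.0, "recall": 0.0, "f_measure": 0.0, "specificity": 0.0}
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                       # true class k, predicted other
        fp = sum(matrix[r][k] for r in range(n)) - tp  # predicted k, true class other
        tn = total - tp - fn - fp
        precision = _ratio(tp, tp + fp)
        recall = _ratio(tp, tp + fn)
        sums["precision"] += precision
        sums["recall"] += recall
        sums["f_measure"] += _ratio(2 * precision * recall, precision + recall)
        sums["specificity"] += _ratio(tn, tn + fp)
    result = {name: value / n for name, value in sums.items()}
    result["accuracy"] = _ratio(correct, total)
    return result


# Toy example with three hypothetical emotion labels.
labels = ["angry", "happy", "sad"]
y_true = ["angry", "angry", "happy", "happy", "sad", "sad"]
y_pred = ["angry", "happy", "happy", "happy", "sad", "angry"]
metrics = macro_metrics(confusion_matrix(y_true, y_pred, labels))
for name in ("accuracy", "precision", "recall", "f_measure", "specificity"):
    print(f"{name}: {metrics[name]:.4f}")
```

Specificity is computed per class in a one-vs-rest fashion, so a class that is rarely predicted by mistake scores close to 1 even when its recall is low. This is one reason the five measures are usually reported together.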
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Funding
This research was funded by the Jinhua Science and Technology Bureau (grant number 2023-4-058) and the Jinhua Advanced Research Institute (grant numbers G202409 and G202412).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Human and Animal Rights
This article does not contain any studies with human or animal subjects performed by any of the authors.
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Han, T., Zhang, Z., Ren, M. et al. A Novel Dual Kernel Support Vector-Based Levy Dung Beetle Algorithm for Accurate Speech Emotion Detection. Circuits Syst Signal Process 43, 7249–7284 (2024). https://doi.org/10.1007/s00034-024-02791-2
DOI: https://doi.org/10.1007/s00034-024-02791-2