Abstract
Music's proven ability to convey emotion has driven growing interest in the development of new algorithms for music emotion recognition (MER). In this work, we present an automatic system for the emotional classification of music based on a neural network. The work builds on a previous implementation of a dimensional emotion prediction system, in which a multilayer perceptron (MLP) was trained on the freely available MediaEval database. Although those earlier results were good in terms of prediction metrics, they were not good enough to support a classification by quadrant based on the valence and arousal values predicted by the network, mainly because of the class imbalance in the dataset. To achieve better classification results, a pre-processing phase was implemented to stratify and balance the dataset. Three classifiers were compared: a linear support vector machine (SVM), a random forest, and an MLP. The best results were obtained with the MLP, which reaches an averaged F-measure of 50% in a four-quadrant classification scheme. Two binary classification approaches are also presented: a one-vs-rest (OvR) scheme over the four quadrants, and separate binary classifiers for valence and arousal. The OvR approach achieves an average F-measure of 69%, while the binary classifiers reach F-measures of 73% and 69% for valence and arousal, respectively. Finally, a dynamic classification analysis with different time windows was performed using the temporal annotations of the MediaEval database; the results show that the four-quadrant classification F-measures remain practically constant regardless of the window length. The work also discusses limitations arising from the characteristics of the dataset, including its size, class balance, the quality of the annotations, and the audio features available.
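The quadrant labels discussed above follow Russell's circumplex model: the sign of the predicted valence and arousal values places an excerpt in one of four quadrants. A minimal sketch of that mapping (illustrative only, not the authors' code; it assumes valence and arousal are centred at zero, as in scaled MediaEval annotations):

```python
# Illustrative sketch: mapping a predicted (valence, arousal) pair onto
# Russell's four quadrants. Assumes both values are centred at 0.

def quadrant(valence: float, arousal: float) -> int:
    """Return the Russell quadrant (1-4) for a (valence, arousal) pair.

    Q1: +valence/+arousal (happy/excited), Q2: -valence/+arousal (angry/tense),
    Q3: -valence/-arousal (sad/depressed), Q4: +valence/-arousal (calm/relaxed).
    """
    if valence >= 0:
        return 1 if arousal >= 0 else 4
    return 2 if arousal >= 0 else 3


if __name__ == "__main__":
    # e.g. an energetic, happy excerpt vs. a sad, calm one
    print(quadrant(0.6, 0.4))    # -> 1
    print(quadrant(-0.3, -0.5))  # -> 3
```

The same mapping is what turns the earlier dimensional regression outputs into a four-class problem, which is why class imbalance in the valence/arousal annotations translates directly into imbalance across quadrants.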
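The classifier comparison described in the abstract (linear SVM vs. random forest vs. MLP, evaluated with an averaged F-measure after stratifying the dataset) can be sketched with scikit-learn. This is an assumption-laden illustration, not the paper's pipeline: the synthetic features stand in for the MediaEval audio descriptors, and the hyperparameters are defaults.

```python
# Illustrative sketch (not the paper's exact pipeline): comparing a linear
# SVM, a random forest, and an MLP on a four-quadrant task, using a
# stratified train/test split and a macro-averaged F-measure.
# Synthetic random features stand in for real MediaEval audio descriptors.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))     # stand-in for audio feature vectors
y = rng.integers(1, 5, size=400)   # quadrant labels 1-4

# A stratified split preserves the quadrant proportions in both
# partitions, one guard against the class-imbalance problem.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

models = {
    "linear SVM": LinearSVC(),
    "random forest": RandomForestClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    f1 = f1_score(y_te, model.predict(X_te), average="macro")
    print(f"{name}: macro F1 = {f1:.2f}")
```

With real, balanced features the relative ordering reported in the paper (MLP best) would be the quantity of interest; on random data all three hover near chance.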
Funding
This work has been partly financed by the Spanish Ministry of Science, Innovation and Universities (MCIU), the Spanish National Research Agency (AEI), and the EU (FEDER) through contracts RTI2018-096986-B-C31 and TIN2015-72241-EXP, and by the Government of Aragon and the European Union through the FEDER 2014-2020 "Construyendo Europa desde Aragón" action (Group T25_17D).
Cite this article
Medina, Y.O., Beltrán, J.R. & Baldassarri, S. Emotional classification of music using neural networks with the MediaEval dataset. Pers Ubiquit Comput 26, 1237–1249 (2022). https://doi.org/10.1007/s00779-020-01393-4