Abstract
Nonverbal emotional vocalizations play a crucial role in conveying emotions during human interactions. Validated corpora of these vocalizations have facilitated emotion-related research and found wide-ranging applications. However, existing corpora have lacked representation from diverse cultural backgrounds, which may limit the generalizability of the resulting theories. The present paper introduces the Chinese Nonverbal Emotional Vocalization (CNEV) corpus, the first nonverbal emotional vocalization corpus recorded and validated entirely by Mandarin speakers from China. The CNEV corpus contains 2415 vocalizations across five emotion categories: happiness, sadness, fear, anger, and neutrality. It also includes a database of subjective evaluation data on emotion category, valence, arousal, and speaker gender, as well as the acoustic features of the vocalizations. Key conclusions from the statistical analysis of the perceptual evaluations and the acoustic features include the following: (1) the CNEV corpus exhibits adequate reliability and high validity; (2) perceptual evaluations reveal a tendency for individuals to associate anger with male voices and fear with female voices; (3) acoustic analysis indicates that males are more effective at expressing anger, whereas females excel at expressing fear; and (4) the observed perceptual patterns align with the acoustic results, suggesting that the perceptual differences may stem not only from subjective factors in the perceivers but also from objective expressive differences in the vocalizations themselves. For academic research purposes, the CNEV corpus and database are freely available for download at https://osf.io/6gy4v/.
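For readers planning to work with the validation database, the sketch below shows one way the per-category recognition rates and mean valence/arousal ratings described above could be summarized after downloading the files from https://osf.io/6gy4v/. The file name and column names used here (cnev_ratings.csv, intended_emotion, chosen_emotion, valence, arousal) are hypothetical placeholders rather than the corpus's actual schema and should be adapted to the distributed files.

```python
import pandas as pd

# Hypothetical schema: one row per (rater, stimulus) judgment.
# Adjust the file name and column names to the actual CNEV database files.
ratings = pd.read_csv("cnev_ratings.csv")

# Recognition accuracy: proportion of trials in which the chosen emotion
# matches the intended emotion, per intended category.
ratings["correct"] = ratings["chosen_emotion"] == ratings["intended_emotion"]
accuracy = ratings.groupby("intended_emotion")["correct"].mean()

# Mean valence and arousal ratings per intended category.
affect = ratings.groupby("intended_emotion")[["valence", "arousal"]].mean()

print(accuracy)
print(affect)
```

The same grouping approach extends naturally to the speaker-gender judgments included in the database.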
Data availability
The data and materials for all experiments in this study are available online at https://osf.io/6gy4v/.
Code availability
The code used for data analysis in this study is available from the corresponding author upon reasonable request.
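Although the study's own analysis code is available only on request, comparable acoustic descriptors can be computed with open-source tooling. As a minimal sketch, and not the authors' pipeline, the example below extracts the eGeMAPS functional features for a single audio file with the opensmile Python package; the file name is a placeholder.

```python
import opensmile

# eGeMAPS functionals: one row of summary acoustic descriptors per file.
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.eGeMAPSv02,
    feature_level=opensmile.FeatureLevel.Functionals,
)

# Returns a pandas DataFrame indexed by (file, start, end).
features = smile.process_file("example_vocalization.wav")
print(features.T)
```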
Funding
This research was supported by a Humanities and Social Science Project of the Ministry of Education of China (24YJA190006) and by the Department of Education of Liaoning Province (LJ212410165053). The funders had no role in study design, data collection and analysis, the decision to publish, or preparation of the manuscript.
Author information
Contributions
Zhongqing Jiang, Yanling Long, and Xi’e Zhang designed this study. Yanling Long and Xi’e Zhang collected the data. Yanling Long and Zhongqing Jiang analyzed the data. Yanling Long, Zhongqing Jiang, Yangtao Liu, and Xue Bai wrote the article.
Ethics declarations
Ethics approval
This research was reviewed and approved by the Ethics Committee of Liaoning Normal University. The committee confirmed that the research met ethical guidelines, and approval was granted for publication (Ethical Review Number: LL2023064).
Consent to participate
Informed consent was obtained from all individual participants included in the study.
Consent for publication
All authors consent to the publication of this manuscript.
Conflicts of interest/Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jiang, Z., Long, Y., Zhang, X. et al. CNEV: A corpus of Chinese nonverbal emotional vocalizations with a database of emotion category, valence, arousal, and gender. Behav Res 57, 62 (2025). https://doi.org/10.3758/s13428-024-02595-x