Short paper · DOI: 10.1145/3395035.3425253

Speech Emotion Recognition among Couples using the Peak-End Rule and Transfer Learning

Published: 27 December 2020

Abstract

Extensive couples' literature shows that how couples feel after a conflict is predicted by certain emotional aspects of that conversation. Understanding the emotions of couples leads to a better understanding of partners' mental well-being and consequently their relationships. Hence, automatic emotion recognition among couples could potentially guide interventions to help couples improve their emotional well-being and their relationships. It has been shown that people's global emotional judgment after an experience is strongly influenced by the emotional extremes and ending of that experience, known as the peak-end rule. In this work, we leveraged this theory and used machine learning to investigate which audio segments can be used to best predict the end-of-conversation emotions of couples. We used speech data collected from 101 Dutch-speaking couples in Belgium who engaged in 10-minute-long conversations in the lab. We extracted acoustic features from (1) the audio segments with the most extreme positive and negative ratings, and (2) the ending of the audio. We used transfer learning in which we extracted these acoustic features with a pre-trained convolutional neural network (YAMNet). We then used these features to train machine learning models (support vector machines) to predict the end-of-conversation valence ratings (positive vs. negative) of each partner. The results of this work could inform how to best recognize the emotions of couples after conversation sessions and, eventually, lead to a better understanding of couples' relationships either in therapy or in everyday life.
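As a rough illustration of the pipeline the abstract describes, the sketch below cuts the peak, trough, and ending segments from a conversation, embeds each with the pre-trained YAMNet model from TensorFlow Hub, and trains a support vector machine on the pooled embeddings. It is a minimal sketch under stated assumptions, not the paper's implementation: the 25-second segment length, the per-second valence trace, the mean-pooling of frame embeddings, the concatenated feature vector, and the RBF-kernel SVM are all illustrative choices, and the data below are random stand-ins.

```python
# Minimal sketch of the abstract's pipeline: peak/end segment selection,
# YAMNet embeddings (transfer learning), and an SVM valence classifier.
# Assumptions not taken from the paper: 16 kHz mono audio in [-1, 1],
# a per-second valence-rating trace, 25 s segments, mean-pooled embeddings,
# and a default RBF-kernel SVM.
import numpy as np
import tensorflow_hub as hub
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

SR = 16000  # YAMNet expects 16 kHz mono float32 waveforms
yamnet = hub.load("https://tfhub.dev/google/yamnet/1")  # pre-trained audio CNN


def peak_and_end(waveform, ratings, seg_s=25.0):
    """Cut the highest-rated, lowest-rated, and final windows of one recording.

    `ratings` is a hypothetical per-second continuous valence trace aligned
    with `waveform`; the paper's exact segment definition is not shown here.
    """
    n = int(seg_s * SR)

    def window(sec):
        start = min(max(int(sec * SR), 0), len(waveform) - n)
        return waveform[start:start + n]

    return window(np.argmax(ratings)), window(np.argmin(ratings)), waveform[-n:]


def embed(segment):
    """Mean-pool YAMNet's 1024-d frame embeddings over one segment."""
    _scores, embeddings, _spectrogram = yamnet(segment)
    return embeddings.numpy().mean(axis=0)


# Hypothetical stand-in data: one 10-minute conversation per partner and a
# binary end-of-conversation valence label (1 = positive, 0 = negative).
rng = np.random.default_rng(0)
conversations = [rng.uniform(-1, 1, SR * 600).astype(np.float32) for _ in range(8)]
ratings = [rng.uniform(-1, 1, 600) for _ in range(8)]
labels = np.array([1, 0, 1, 1, 0, 0, 1, 0])

# One feature vector per conversation: concatenated peak, trough, and end embeddings.
X = np.stack([np.concatenate([embed(s) for s in peak_and_end(w, r)])
              for w, r in zip(conversations, ratings)])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print("CV accuracy:", cross_val_score(clf, X, labels, cv=4).mean())
```

Mean-pooling YAMNet's roughly one-second frame embeddings into a fixed-length vector per segment is one common way to feed variable-length audio to an SVM; the paper may pool or segment differently.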





    Published In

    ICMI '20 Companion: Companion Publication of the 2020 International Conference on Multimodal Interaction
    October 2020
    548 pages
    ISBN:9781450380027
    DOI:10.1145/3395035


    Publisher

    Association for Computing Machinery, New York, NY, United States



    Author Tags

    1. affective computing
    2. convolutional neural network
    3. couples
    4. peak-end rule
    5. speech emotion recognition
    6. speech processing
    7. support vector machine
    8. transfer learning

    Qualifiers

    • Short-paper

    Conference

    ICMI '20: International Conference on Multimodal Interaction
    October 25-29, 2020
    Virtual Event, Netherlands

    Acceptance Rates

    Overall Acceptance Rate: 453 of 1,080 submissions (42%)

    Article Metrics

    • Downloads (last 12 months): 27
    • Downloads (last 6 weeks): 2

    Reflects downloads up to 11 Dec 2024

    Cited By
    • (2024) RobinNet: A Multimodal Speech Emotion Recognition System With Speaker Recognition for Social Interactions. IEEE Transactions on Computational Social Systems 11(1), 478-487. DOI: 10.1109/TCSS.2022.3228649. Online publication date: Feb-2024.
    • (2023) Facing Emotions: Between- and Within-Sessions Changes in Facial Expression During Psychological Treatment for Depression. Clinical Psychological Science 12(5), 840-854. DOI: 10.1177/21677026231195793. Online publication date: 25-Sep-2023.
    • (2023) Parkinson's Speech Detection Using YAMNet. 2023 2nd International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), 1-5. DOI: 10.1109/ICAECA56562.2023.10200704. Online publication date: 16-Jun-2023.
    • (2023) Multimodal Transfer Learning for Oral Presentation Assessment. IEEE Access 11, 84013-84026. DOI: 10.1109/ACCESS.2023.3295832. Online publication date: 2023.
    • (2023) Deep transfer learning for automatic speech recognition: Towards better generalization. Knowledge-Based Systems 277, 110851. DOI: 10.1016/j.knosys.2023.110851. Online publication date: Oct-2023.
    • (2022) Robat-e-Beheshti: A Persian Wake Word Detection Dataset for Robotic Purposes. 2022 12th International Conference on Computer and Knowledge Engineering (ICCKE), 434-439. DOI: 10.1109/ICCKE57176.2022.9960092. Online publication date: 17-Nov-2022.
    • (2021) "You made me feel this way": Investigating Partners' Influence in Predicting Emotions in Couples' Conflict Interactions using Speech Data. Companion Publication of the 2021 International Conference on Multimodal Interaction, 390-394. DOI: 10.1145/3461615.3485424. Online publication date: 18-Oct-2021.
    • (2021) Digital Health Interventions. Connected Business, 71-95. DOI: 10.1007/978-3-030-76897-3_4. Online publication date: 12-Aug-2021.
    • (2020) Towards a wearable system for assessing couples' dyadic interactions in daily life. Adjunct Proceedings of the 2020 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2020 ACM International Symposium on Wearable Computers, 208-211. DOI: 10.1145/3410530.3414331. Online publication date: 10-Sep-2020.
