Computer Science > Computation and Language

arXiv:2204.00400 (cs)

[Submitted on 1 Apr 2022 (v1), last revised 26 Jul 2022 (this version, v2)]

Title: Probing Speech Emotion Recognition Transformers for Linguistic Knowledge

Authors: Andreas Triantafyllopoulos, Johannes Wagner, Hagen Wierstorf, Maximilian Schmitt, Uwe Reichel, Florian Eyben, Felix Burkhardt, Björn W. Schuller

Abstract: Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets. These models are typically pre-trained in a self-supervised manner with the goal of improving automatic speech recognition performance, and thus of understanding linguistic information. In this work, we investigate the extent to which this information is exploited during SER fine-tuning. Using a reproducible methodology based on open-source tools, we synthesise prosodically neutral speech utterances while varying the sentiment of the text. Valence predictions of the transformer model are very reactive to positive and negative sentiment content, as well as negations, but not to intensifiers or reducers, while none of those linguistic features affect arousal or dominance. These findings show that transformers can successfully leverage linguistic information to improve their valence predictions, and that linguistic analysis should be included in their testing.

Comments:	Accepted in INTERSPEECH 2022
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2204.00400 [cs.CL]
	(or arXiv:2204.00400v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2204.00400
Journal reference:	Proc. Interspeech 2022, 146-150
Related DOI:	https://doi.org/10.21437/Interspeech.2022-10371

Submission history

From: Andreas Triantafyllopoulos
[v1] Fri, 1 Apr 2022 12:47:45 UTC (310 KB)
[v2] Tue, 26 Jul 2022 10:06:08 UTC (310 KB)
