research-article

Adversarial alignment and graph fusion via information bottleneck for multimodal emotion recognition in conversations

Published: 18 October 2024

Abstract

With the rapid development of social media and human–computer interaction, multimodal emotion recognition in conversations (MERC) has received widespread research attention. The MERC task is to extract and fuse complementary semantic information from different modalities in order to classify the speaker's emotion. However, existing feature fusion methods usually map the features of each modality directly into the same feature space for information fusion, which cannot eliminate the heterogeneity between modalities and makes it more difficult to learn the subsequent emotion class boundaries. In addition, existing graph contrastive learning methods obtain consistent feature representations by maximizing the mutual information between multiple views, which may lead to overfitting. To tackle these problems, we propose a novel Adversarial Alignment and Graph Fusion via Information Bottleneck (AGF-IB) method for multimodal emotion recognition in conversations. Firstly, we feed the video, audio, and text features into multi-layer perceptrons (MLPs) to map them into separate feature spaces. Secondly, we build a generator and a discriminator for each of the three modalities and use adversarial representation learning to enable information interaction between modalities and eliminate inter-modal heterogeneity. Thirdly, we introduce graph contrastive representation learning to capture intra-modal and inter-modal complementary semantic information and to learn intra-class and inter-class boundary information of the emotion categories. Furthermore, instead of maximizing the mutual information (MI) between multiple views, we use information bottleneck theory to minimize the MI between views. Specifically, we construct a graph structure for each of the three modalities and perform contrastive representation learning on nodes with different emotions within the same modality and on nodes with the same emotion across different modalities, which improves the representation ability of the nodes. Finally, we use an MLP to classify the speaker's emotion. Extensive experiments show that AGF-IB improves emotion recognition accuracy on the IEMOCAP and MELD datasets. Furthermore, since AGF-IB is a general multimodal fusion and contrastive learning method, it can be applied to other multimodal tasks, e.g., humor detection, in a plug-and-play manner.
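
The adversarial alignment step described above can be illustrated with a short sketch. The following is a minimal PyTorch-style sketch, not the authors' implementation: it assumes pre-extracted utterance features for text, audio, and video, maps each modality into a shared space with a modality-specific MLP (the "generator"), and trains a modality discriminator adversarially so that the aligned features stop carrying modality identity. All module names, feature dimensions, and the use of a single shared discriminator with a uniform-target generator loss are illustrative assumptions.

```python
# Minimal sketch (assumed details, not the paper's code): adversarial alignment of
# text / audio / video utterance features into a shared space.
import torch
import torch.nn as nn

class ModalityEncoder(nn.Module):
    """Modality-specific MLP ('generator') mapping raw features to a shared space."""
    def __init__(self, in_dim: int, shared_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, shared_dim))

    def forward(self, x):
        return self.net(x)

class ModalityDiscriminator(nn.Module):
    """Predicts which modality an aligned feature came from (3-way classification)."""
    def __init__(self, shared_dim: int = 128, num_modalities: int = 3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(shared_dim, 64), nn.ReLU(), nn.Linear(64, num_modalities))

    def forward(self, z):
        return self.net(z)

# Assumed input dimensions for pre-extracted text / audio / video features.
enc_t, enc_a, enc_v = ModalityEncoder(768), ModalityEncoder(100), ModalityEncoder(512)
disc = ModalityDiscriminator()
ce = nn.CrossEntropyLoss()

opt_enc = torch.optim.Adam(
    list(enc_t.parameters()) + list(enc_a.parameters()) + list(enc_v.parameters()), lr=1e-4)
opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-4)

x_t, x_a, x_v = torch.randn(32, 768), torch.randn(32, 100), torch.randn(32, 512)  # one toy batch

for step in range(5):
    z_t, z_a, z_v = enc_t(x_t), enc_a(x_a), enc_v(x_v)
    z = torch.cat([z_t, z_a, z_v], dim=0)
    labels = torch.cat([torch.full((32,), m, dtype=torch.long) for m in range(3)])  # true modality ids

    # 1) Discriminator step: learn to tell the modalities apart.
    d_loss = ce(disc(z.detach()), labels)
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # 2) Encoder ('generator') step: fool the discriminator so the shared space no
    #    longer carries modality identity, reducing inter-modal heterogeneity.
    #    Soft uniform targets are one common choice (needs PyTorch >= 1.10).
    logits = disc(z)
    uniform = torch.full_like(logits, 1.0 / 3)
    g_loss = nn.functional.cross_entropy(logits, uniform)
    opt_enc.zero_grad(); g_loss.backward(); opt_enc.step()
```

In a full MERC pipeline, the aligned features would then feed the graph construction, contrastive learning, and final MLP classifier described in the abstract; those stages are omitted here.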

Highlights

A multimodal emotion recognition architecture based on adversarial alignment and graph fusion is proposed.
A cross-modal feature alignment method with adversarial learning is designed to eliminate inter-modal heterogeneity.
A graph contrastive learning method via information bottleneck is proposed to enhance multimodal semantic association (a loss sketch follows this list).
Our method can be applied to other multimodal tasks in a plug-and-play manner, e.g., humor detection.
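
As a rough illustration of the third highlight, the sketch below shows one way a graph contrastive term can be combined with an information-bottleneck-style regularizer: node embeddings sharing an emotion label (within and across modalities) are pulled together, while a variational KL term compresses each node's representation rather than maximizing cross-view mutual information. This is an assumed, simplified formulation rather than the paper's exact objective; all function names, shapes, and the trade-off weight are illustrative.

```python
# Minimal sketch (assumed formulation, not the paper's exact objective):
# supervised cross-modal contrastive term + a variational IB-style compression term.
import torch
import torch.nn.functional as F

def supervised_contrastive(z: torch.Tensor, emotions: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Pull together node embeddings that share an emotion label, push apart the rest.

    z: (N, D) node embeddings from the per-modality graphs (all modalities stacked).
    emotions: (N,) integer emotion labels.
    """
    z = F.normalize(z, dim=-1)
    sim = z @ z.t() / tau                       # (N, N) scaled cosine similarities
    mask = emotions.unsqueeze(0) == emotions.unsqueeze(1)
    mask.fill_diagonal_(False)                  # exclude self-pairs from the positives
    logits = sim - torch.eye(len(z)) * 1e9      # remove self-similarity from the denominator
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_count = mask.sum(dim=1).clamp(min=1)
    return -(log_prob * mask).sum(dim=1).div(pos_count).mean()

def ib_compression(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """KL(q(z|x) || N(0, I)): a standard variational surrogate that limits how much
    information the node representation keeps about its input view."""
    return 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=-1).mean()

# Toy usage with assumed shapes: 3 modalities x 16 utterances, 128-d embeddings.
mu, logvar = torch.randn(48, 128), torch.zeros(48, 128)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()    # reparameterized node embeddings
emotions = torch.randint(0, 6, (16,)).repeat(3)          # same labels across the 3 views
beta = 1e-3                                              # illustrative trade-off weight
loss = supervised_contrastive(z, emotions) + beta * ib_compression(mu, logvar)
```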


Published In

Information Fusion, Volume 112, Issue C, December 2024, 818 pages

Publisher

Elsevier Science Publishers B.V., Netherlands


Author Tags

1. Adversarial representation learning
2. Feature fusion
3. Graph contrastive representation learning
4. Multimodal emotion recognition in conversations
5. Information bottleneck
