DOI: 10.1145/3240508.3240528
Research Article

Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data

Published: 15 October 2018

Abstract

Emotion recognition poses three challenges. First, human emotional states are difficult to recognize from a single modality alone. Second, manually annotating emotional data is expensive. Third, emotional data often suffer from missing modalities due to unforeseeable sensor malfunctions or configuration issues. In this paper, we address all three problems under a novel multi-view deep generative framework. Specifically, we propose to model the statistical relationships of multi-modality emotional data using multiple modality-specific generative networks with a shared latent space. By imposing a Gaussian mixture assumption on the posterior approximation of the shared latent variables, our framework can learn a joint deep representation from multiple modalities and evaluate the importance of each modality simultaneously. To address the scarcity of labeled data, we extend our multi-view model to the semi-supervised setting by casting semi-supervised classification as a specialized missing-data imputation task. To address the missing-modality problem, we further extend our semi-supervised multi-view model to handle incomplete data, where a missing view is treated as a latent variable and integrated out during inference. In this way, the overall framework can exploit all available data, labeled and unlabeled, complete and incomplete, to improve its generalization ability. Experiments on two real multi-modal emotion datasets demonstrate the superiority of our framework.
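To make the architecture described in the abstract concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' implementation): one encoder/decoder pair per modality, a shared latent space, and an approximate posterior formed as a Gaussian mixture over the per-modality encoders, whose learned mixture weights serve as modality-importance scores. All layer sizes, the sampling scheme, and the loss are illustrative assumptions.

```python
# Hypothetical sketch only -- layer sizes, sampling scheme, and losses are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Encodes one modality into the mean/log-variance of a Gaussian q(z|x_m)."""
    def __init__(self, in_dim, z_dim, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

class MultiViewVAE(nn.Module):
    """Modality-specific generative networks sharing one latent space; the
    posterior over z is a Gaussian mixture across modalities."""
    def __init__(self, view_dims, z_dim=32):
        super().__init__()
        self.encoders = nn.ModuleList(ModalityEncoder(d, z_dim) for d in view_dims)
        self.decoders = nn.ModuleList(
            nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, d))
            for d in view_dims)
        # Softmax of these logits gives the mixture weights, which act as
        # learned per-modality importance scores.
        self.mix_logits = nn.Parameter(torch.zeros(len(view_dims)))

    def forward(self, views):
        weights = F.softmax(self.mix_logits, dim=0)
        mus, logvars = zip(*(enc(v) for enc, v in zip(self.encoders, views)))
        # Draw z from one mixture component chosen by the weights
        # (simple ancestral sampling; the paper's inference scheme differs).
        k = int(torch.multinomial(weights, 1))
        std = torch.exp(0.5 * logvars[k])
        z = mus[k] + std * torch.randn_like(std)   # reparameterization trick
        # Reconstruct every modality from the shared latent code.
        recon = sum(F.mse_loss(dec(z), v)
                    for dec, v in zip(self.decoders, views))
        # Weighted sum of component KLs to N(0, I): an upper bound on the
        # KL of the mixture posterior, by convexity of the KL divergence.
        kl = sum(w * (-0.5 * (1 + lv - mu.pow(2) - lv.exp()).sum(1).mean())
                 for w, mu, lv in zip(weights, mus, logvars))
        return recon + kl
```

Under these assumptions, `MultiViewVAE([d_eeg, d_eye])` would jointly autoencode, say, EEG and eye-movement feature vectors. A missing view could be crudely handled here by zeroing its mixture weight and renormalizing the rest; the paper instead treats the missing view as a latent variable and integrates it out during inference.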

Supplementary Material

ZIP File (fp0119.zip)
Supplementary Material of Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data





Published In

MM '18: Proceedings of the 26th ACM international conference on Multimedia
October 2018
2167 pages
ISBN:9781450356657
DOI:10.1145/3240508
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. deep generative model
  2. incomplete data
  3. multi-modal emotion recognition
  4. multi-view semi-supervised learning

Qualifiers

  • Research-article


Conference

MM '18: ACM Multimedia Conference
October 22–26, 2018
Seoul, Republic of Korea

Acceptance Rates

MM '18 paper acceptance rate: 209 of 757 submissions (28%)
Overall acceptance rate: 2,098 of 8,321 submissions (25%)


Article Metrics

  • Downloads (last 12 months): 88
  • Downloads (last 6 weeks): 11
Reflects downloads up to 11 Dec 2024


Cited By

  • (2024) Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Incomplete Data Scenarios. Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing, 116–124. DOI: 10.1145/3689092.3689411. Online publication date: 28-Oct-2024.
  • (2024) Research Progress of EEG-Based Emotion Recognition: A Survey. ACM Computing Surveys 56(11), 1–49. DOI: 10.1145/3666002. Online publication date: 8-Jul-2024.
  • (2024) Modality-collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications 20(5), 1–23. DOI: 10.1145/3640343. Online publication date: 11-Jan-2024.
  • (2024) Multi-View Multi-Label Fine-Grained Emotion Decoding From Human Brain Activity. IEEE Transactions on Neural Networks and Learning Systems 35(7), 9026–9040. DOI: 10.1109/TNNLS.2022.3217767. Online publication date: Jul-2024.
  • (2024) MVCNet: Multiview Contrastive Network for Unsupervised Representation Learning for 3-D CT Lesions. IEEE Transactions on Neural Networks and Learning Systems 35(6), 7376–7390. DOI: 10.1109/TNNLS.2022.3203412. Online publication date: Jun-2024.
  • (2024) EEG-Based Multimodal Emotion Recognition: A Machine Learning Perspective. IEEE Transactions on Instrumentation and Measurement 73, 1–29. DOI: 10.1109/TIM.2024.3369130. Online publication date: 2024.
  • (2024) SSPRA: A Robust Approach to Continuous Authentication Amidst Real-World Adversarial Challenges. IEEE Transactions on Biometrics, Behavior, and Identity Science 6(2), 245–260. DOI: 10.1109/TBIOM.2024.3369590. Online publication date: Apr-2024.
  • (2024) Contrastive Learning Based Modality-Invariant Feature Acquisition for Robust Multimodal Emotion Recognition With Missing Modalities. IEEE Transactions on Affective Computing 15(4), 1856–1873. DOI: 10.1109/TAFFC.2024.3378570. Online publication date: Oct-2024.
  • (2024) From the Lab to the Wild: Affect Modeling Via Privileged Information. IEEE Transactions on Affective Computing 15(2), 380–392. DOI: 10.1109/TAFFC.2023.3265072. Online publication date: Apr-2024.
  • (2024) Centaur: Robust Multimodal Fusion for Human Activity Recognition. IEEE Sensors Journal 24(11), 18578–18591. DOI: 10.1109/JSEN.2024.3388893. Online publication date: 1-Jun-2024.
