DOI: 10.1145/3240508.3240528
Research Article

Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data

Published: 15 October 2018

Abstract

Emotion recognition poses three challenges. First, human emotional states are difficult to recognize from a single modality alone. Second, manually annotating emotional data is expensive. Third, emotional data often suffer from missing modalities due to unforeseeable sensor malfunctions or configuration issues. In this paper, we address all three problems under a novel multi-view deep generative framework. Specifically, we propose to model the statistical relationships of multi-modality emotional data using multiple modality-specific generative networks with a shared latent space. By imposing a Gaussian mixture assumption on the posterior approximation of the shared latent variables, our framework can learn a joint deep representation from multiple modalities and evaluate the importance of each modality simultaneously. To address the scarcity of labeled data, we extend our multi-view model to the semi-supervised setting by casting semi-supervised classification as a specialized missing-data imputation task. To address the missing-modality problem, we further extend our semi-supervised multi-view model to handle incomplete data, where a missing view is treated as a latent variable and integrated out during inference. In this way, the overall framework can exploit all available data, labeled and unlabeled, complete and incomplete, to improve its generalization ability. Experiments on two real multi-modal emotion datasets demonstrate the superiority of our framework.
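To make the architecture described in the abstract concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' implementation): one encoder/decoder pair per modality, a shared latent space, and an approximate posterior formed as a Gaussian mixture over the per-modality encoders, whose learned mixture weights serve as modality-importance scores. All layer sizes, the sampling scheme, and the loss are illustrative assumptions.

```python
# Hypothetical sketch only -- layer sizes, sampling scheme, and losses are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Encodes one modality into the mean/log-variance of a Gaussian q(z|x_m)."""
    def __init__(self, in_dim, z_dim, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

class MultiViewVAE(nn.Module):
    """Modality-specific generative networks sharing one latent space; the
    posterior over z is a Gaussian mixture across modalities."""
    def __init__(self, view_dims, z_dim=32):
        super().__init__()
        self.encoders = nn.ModuleList(ModalityEncoder(d, z_dim) for d in view_dims)
        self.decoders = nn.ModuleList(
            nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, d))
            for d in view_dims)
        # Softmax of these logits gives the mixture weights, which act as
        # learned per-modality importance scores.
        self.mix_logits = nn.Parameter(torch.zeros(len(view_dims)))

    def forward(self, views):
        weights = F.softmax(self.mix_logits, dim=0)
        mus, logvars = zip(*(enc(v) for enc, v in zip(self.encoders, views)))
        # Draw z from one mixture component chosen by the weights
        # (simple ancestral sampling; the paper's inference scheme differs).
        k = int(torch.multinomial(weights, 1))
        std = torch.exp(0.5 * logvars[k])
        z = mus[k] + std * torch.randn_like(std)   # reparameterization trick
        # Reconstruct every modality from the shared latent code.
        recon = sum(F.mse_loss(dec(z), v)
                    for dec, v in zip(self.decoders, views))
        # Weighted sum of component KLs to N(0, I): an upper bound on the
        # KL of the mixture posterior, by convexity of the KL divergence.
        kl = sum(w * (-0.5 * (1 + lv - mu.pow(2) - lv.exp()).sum(1).mean())
                 for w, mu, lv in zip(weights, mus, logvars))
        return recon + kl
```

Under these assumptions, `MultiViewVAE([d_eeg, d_eye])` would jointly autoencode, say, EEG and eye-movement feature vectors. A missing view could be crudely handled here by zeroing its mixture weight and renormalizing the rest; the paper instead treats the missing view as a latent variable and integrates it out during inference.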

Supplementary Material

ZIP File (fp0119.zip)
Supplementary Material of Semi-supervised Deep Generative Modelling of Incomplete Multi-Modality Emotional Data





Published In

MM '18: Proceedings of the 26th ACM international conference on Multimedia
October 2018
2167 pages
ISBN:9781450356657
DOI:10.1145/3240508
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. deep generative model
  2. incomplete data
  3. multi-modal emotion recognition
  4. multi-view semi-supervised learning

Qualifiers

  • Research-article


Conference

MM '18: ACM Multimedia Conference
October 22–26, 2018
Seoul, Republic of Korea

Acceptance Rates

MM '18 paper acceptance rate: 209 of 757 submissions (28%)
Overall acceptance rate: 2,098 of 8,321 submissions (25%)


Article Metrics

  • Downloads (last 12 months): 88
  • Downloads (last 6 weeks): 11
Reflects downloads up to 11 Dec 2024


Cited By

  • (2024) Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Incomplete Data Scenarios. Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing, 116–124. DOI: 10.1145/3689092.3689411. Online publication date: 28-Oct-2024.
  • (2024) Research Progress of EEG-Based Emotion Recognition: A Survey. ACM Computing Surveys 56(11), 1–49. DOI: 10.1145/3666002. Online publication date: 8-Jul-2024.
  • (2024) Modality-collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition. ACM Transactions on Multimedia Computing, Communications, and Applications 20(5), 1–23. DOI: 10.1145/3640343. Online publication date: 11-Jan-2024.
  • (2024) Multi-View Multi-Label Fine-Grained Emotion Decoding From Human Brain Activity. IEEE Transactions on Neural Networks and Learning Systems 35(7), 9026–9040. DOI: 10.1109/TNNLS.2022.3217767. Online publication date: Jul-2024.
  • (2024) MVCNet: Multiview Contrastive Network for Unsupervised Representation Learning for 3-D CT Lesions. IEEE Transactions on Neural Networks and Learning Systems 35(6), 7376–7390. DOI: 10.1109/TNNLS.2022.3203412. Online publication date: Jun-2024.
  • (2024) EEG-Based Multimodal Emotion Recognition: A Machine Learning Perspective. IEEE Transactions on Instrumentation and Measurement 73, 1–29. DOI: 10.1109/TIM.2024.3369130. Online publication date: 2024.
  • (2024) SSPRA: A Robust Approach to Continuous Authentication Amidst Real-World Adversarial Challenges. IEEE Transactions on Biometrics, Behavior, and Identity Science 6(2), 245–260. DOI: 10.1109/TBIOM.2024.3369590. Online publication date: Apr-2024.
  • (2024) Contrastive Learning Based Modality-Invariant Feature Acquisition for Robust Multimodal Emotion Recognition With Missing Modalities. IEEE Transactions on Affective Computing 15(4), 1856–1873. DOI: 10.1109/TAFFC.2024.3378570. Online publication date: Oct-2024.
  • (2024) From the Lab to the Wild: Affect Modeling Via Privileged Information. IEEE Transactions on Affective Computing 15(2), 380–392. DOI: 10.1109/TAFFC.2023.3265072. Online publication date: Apr-2024.
  • (2024) Centaur: Robust Multimodal Fusion for Human Activity Recognition. IEEE Sensors Journal 24(11), 18578–18591. DOI: 10.1109/JSEN.2024.3388893. Online publication date: 1-Jun-2024.
