Abstract
This work presents a methodology for synthesizing complex facial expression images from a learned representation without specifying emotion labels as input. The proposed methodology consists of three main modules: a basic emotion recognition model, a linear regression model, and a generative model. The recognition model extracts expression-related features, which form the basis for generating complex facial expressions. The linear regression model transforms the expression features into the latent space of the generative model, which then synthesizes the images. Two benchmark facial expression datasets, the Extended Cohn-Kanade (CK+) and the Japanese Female Facial Expression (JAFFE) datasets, are used in the experiments. The results show that the proposed methodology produces compound facial expression images of comparatively high visual quality. For quantitative assessment, the basic emotion recognition model predicts an emotion from the generated compound facial expression images with accuracies of 67.51% and 62.87% on the two datasets, respectively.
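To make the three-module pipeline concrete, the sketch below illustrates the flow in Python. It is a minimal sketch, not the authors' implementation: the feature and latent dimensions, the stub feature extractor and generator, and the least-squares fit standing in for the paper's linear regression are all illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of the three-module pipeline. All names and
# dimensions below are illustrative assumptions, not from the paper.
rng = np.random.default_rng(0)

FEATURE_DIM = 128  # assumed size of the expression-related feature vector
LATENT_DIM = 100   # assumed size of the generator's latent space

def extract_expression_features(image: np.ndarray) -> np.ndarray:
    """Stand-in for the basic emotion recognition model's feature extractor."""
    # A real implementation would run a trained CNN and return
    # intermediate-layer activations; here we return a random vector.
    return rng.standard_normal(FEATURE_DIM)

# Linear regression module: fit a map from expression features to
# generator latent codes by least squares on paired training data.
X = rng.standard_normal((500, FEATURE_DIM))       # expression features
Z = rng.standard_normal((500, LATENT_DIM))        # matching latent codes
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
W, *_ = np.linalg.lstsq(X_aug, Z, rcond=None)     # (FEATURE_DIM+1, LATENT_DIM)

def features_to_latent(features: np.ndarray) -> np.ndarray:
    """Map expression features into the generator's latent space."""
    return np.append(features, 1.0) @ W

def generate_image(latent: np.ndarray) -> np.ndarray:
    """Stand-in for the trained GAN generator."""
    # A real implementation would feed the latent code to the generator
    # network; here we return a dummy 64x64 grayscale image.
    return rng.random((64, 64))

# End-to-end: blend the features of two basic emotions into a compound
# expression, then synthesize an image without any explicit emotion label.
happy = extract_expression_features(np.zeros((64, 64)))
surprise = extract_expression_features(np.zeros((64, 64)))
compound = 0.5 * happy + 0.5 * surprise
image = generate_image(features_to_latent(compound))
print(image.shape)  # (64, 64)
```

The point the sketch tries to capture is that no emotion label enters the generation step: a compound expression is formed by blending learned features of basic emotions, and the regression carries that blend into the generator's latent space.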
Data Availability
The data generated and/or analysed during the current study are available from the corresponding author on reasonable request.
Funding
This work was supported by Japan Advanced Institute of Science and Technology Research Grants (Houga).
Ethics declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Shwe Sin Khine Win, Prarinya Siritanawan and Kazunori Kotani contributed equally to this work.
Appendix: Additional results
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Win, S.S.K., Siritanawan, P. & Kotani, K. Compound facial expressions image generation for complex emotions. Multimed Tools Appl 82, 11549–11588 (2023). https://doi.org/10.1007/s11042-022-14289-7