Abstract
A micro-expression is a type of facial expression that reveals a person's deepest, genuinely felt emotions. Despite substantial progress, micro-expression recognition remains challenging because micro-expressions are low in intensity and short in duration. In this paper, we investigate micro-expression recognition using deep learning techniques and present RES-CapsNet, an improved capsule network that uses Res2Net as the backbone to extract multi-level and multi-scale features. RES-CapsNet also adds a squeeze-and-excitation (SE) block to the primary capsule layer (PrimaryCaps). With the SE block, informative micro-expression features are emphasized and uninformative ones are suppressed. In addition, between the first convolutional layer and PrimaryCaps, we introduce an efficient channel attention (ECA) module, which adds only a few parameters while markedly improving performance. The proposed pipeline first extracts the apex frame from each micro-expression sequence to capture the most distinct facial muscle movement, and then feeds the pre-processed images into RES-CapsNet for feature extraction and classification. We evaluate RES-CapsNet with the Leave-One-Subject-Out (LOSO) cross-validation strategy on three widely used spontaneous micro-expression databases (CASME II, SMIC, and SAMM). Extensive experiments show that RES-CapsNet captures fine details of micro-expressions effectively and outperforms the baseline CapsuleNet.
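The LOSO protocol mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function name `loso_splits` and the toy subject labels are hypothetical, and the sketch only assumes that every sample carries the identifier of the subject it was recorded from. In each fold, all samples from one subject form the test set and the samples from all remaining subjects form the training set.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    subject_ids: a sequence with one subject label per sample (hypothetical input).
    """
    # Group sample indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, subject in enumerate(subject_ids):
        by_subject[subject].append(idx)

    # Hold out each subject in turn; train on everyone else.
    for subject in sorted(by_subject):
        test_idx = list(by_subject[subject])
        train_idx = sorted(
            i
            for other, indices in by_subject.items()
            if other != subject
            for i in indices
        )
        yield subject, train_idx, test_idx


# Toy example (assumed data): six samples recorded from three subjects.
subjects = ["s01", "s01", "s02", "s03", "s03", "s03"]
folds = list(loso_splits(subjects))

for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because no subject ever appears in both the training and the test set of a fold, the reported scores reflect generalization to unseen people rather than to unseen clips of familiar faces.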
Data availability
The datasets used in our paper (CASME II, SAMM, and SMIC) are publicly available.
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant 62276118.
Funding
National Natural Science Foundation of China, Grant 62276118.
Author information
Contributions
XS: Supervision, Writing—review and editing, Investigation. JL: Writing—original draft, Software, Methodology. LS: Conceptualization, Validation. SH: Data curation, Resources, Validation.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Shu, X., Li, J., Shi, L. et al. RES-CapsNet: an improved capsule network for micro-expression recognition. Multimedia Systems 29, 1593–1601 (2023). https://doi.org/10.1007/s00530-023-01068-z
DOI: https://doi.org/10.1007/s00530-023-01068-z