
Using attention LSGB network for facial expression recognition

  • Theoretical Advances
Pattern Analysis and Applications


Abstract

The varied sources of the available in-the-wild datasets and the noisy information in their images make it very challenging to discriminate subtle distinctions between combinations of regional expressions in facial expression recognition (FER). Although deep learning-based approaches have made substantial progress in FER in recent years, small-scale datasets result in over-fitting during training. To address this, we propose a novel LSGB method that focuses accurately on discriminative attention regions and pretrains the model on ImageNet to alleviate over-fitting. Specifically, the local relation (LR) module combines a key map, multiple partial maps and a position map in a more efficient manner to construct higher-level entities from the compositional relationships of local pixel pairs. Region features are aggregated into a compact global weighted representation, with the weights obtained by feeding the original and regional images to the sequential layer of a self-attention module. Finally, extensive experiments are conducted to verify the effectiveness of our proposal. The experimental results on three popular benchmarks demonstrate the superiority of our network, with 88.8% on FERplus, 58.68% on AffectNet and 94.9% on JAFFE.
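As a rough illustration of the weighted aggregation of region features mentioned in the abstract, the following sketch pools several region feature vectors into one global vector using attention weights. It is not the authors' implementation: the softmax normalisation, the function names `softmax` and `aggregate_regions`, and the toy inputs are all assumptions made for illustration, and in the actual network the scores would come from the self-attention module rather than being supplied by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_regions(region_features, attention_scores):
    """Pool region feature vectors into one attention-weighted vector.

    region_features: list of equal-length feature vectors, one per region
        (hypothetical, e.g. the original image plus several face crops).
    attention_scores: one unnormalised scalar score per region
        (hypothetical stand-in for the self-attention output).
    Returns the pooled vector and the normalised weights.
    """
    if len(region_features) != len(attention_scores):
        raise ValueError("need exactly one score per region")
    weights = softmax(attention_scores)  # assumed normalisation
    dim = len(region_features[0])
    pooled = [0.0] * dim
    for weight, feature in zip(weights, region_features):
        for i in range(dim):
            pooled[i] += weight * feature[i]
    return pooled, weights


# Toy example: three regions with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.0, 0.0]
pooled, weights = aggregate_regions(features, scores)
```

In this toy run the first region receives the largest weight because it has the highest score, so the pooled vector leans towards its features; with equal scores the result reduces to a plain average of the regions.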


Availability of data and material

Some or all data, models, or code generated or used during the study are available from the corresponding author by request.

Author information

Corresponding author

Correspondence to Deyu Lin.

Ethics declarations

Conflict of interest

All authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Su, C., Wei, J., Lin, D. et al. Using attention LSGB network for facial expression recognition. Pattern Anal Applic 26, 543–553 (2023). https://doi.org/10.1007/s10044-022-01124-w

  • DOI: https://doi.org/10.1007/s10044-022-01124-w
