DOI: 10.1145/3581783.3612546
research-article
Open access

Variance-Aware Bi-Attention Expression Transformer for Open-Set Facial Expression Recognition in the Wild

Published: 27 October 2023

Abstract

Despite the great accomplishments of facial expression recognition (FER) models in closed-set scenarios, they still lack open-world robustness when it comes to handling unknown samples. To address the demands of operating in an open environment, open-set FER models should improve their performance in rejecting unknown samples while maintaining their efficiency in recognizing known expressions. With this goal in mind, we propose an open-set FER framework named Variance-Aware Bi-Attention Expression Transformer (VBExT), which enhances conventional closed-set FER models with open-world robustness for unknown samples. Specifically, to make full use of the expression representation capabilities of learned features, we introduce a bi-attention feature augmentation mechanism that learns the important regions and integrates the hierarchical features extracted by the emotional CNN backbone. We also propose a variance-aware distribution modeling method that adapts to the diverse distribution of different expression classes in the open environment, thereby enhancing the detection ability of unknown expressions. Additionally, we have constructed a Fine-Grained Light Facial Expression dataset that includes 30 different light brightnesses to better validate the efficiency of VBExT. Extensive experiments and ablation studies show that VBExT significantly improves the performance of open-set FER and achieves state-of-the-art results on CFEE (lab, basic), RAF-DB (wild, basic+compound), and FGL-FE (multiple light brightnesses, basic).
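
The abstract does not spell out how the variance-aware distribution modeling is formulated. As a rough illustration of the general idea only, and not the authors' method, the sketch below fits a mean and a per-dimension variance to the features of each known class, then rejects a sample as unknown when its variance-normalized distance to every class exceeds a threshold. The toy feature vectors, the `threshold` value, and all function names are hypothetical.

```python
from statistics import fmean, pvariance

UNKNOWN = "unknown"

def fit_class_stats(features_by_class, eps=1e-6):
    """Return {label: (mean_vector, variance_vector)} from known-class features."""
    stats = {}
    for label, vectors in features_by_class.items():
        dims = list(zip(*vectors))  # transpose: one tuple per feature dimension
        means = [fmean(d) for d in dims]
        # eps keeps every variance strictly positive so the division below is safe
        variances = [pvariance(d) + eps for d in dims]
        stats[label] = (means, variances)
    return stats

def normalized_distance(x, means, variances):
    """Squared deviation scaled by each dimension's class variance, averaged over dimensions."""
    return sum((xi - m) ** 2 / v for xi, m, v in zip(x, means, variances)) / len(x)

def predict_open_set(x, stats, threshold=9.0):
    """Return the closest known label, or UNKNOWN if no class is close enough.

    threshold is an arbitrary illustrative value, not taken from the paper.
    """
    label, dist = min(
        ((lbl, normalized_distance(x, m, v)) for lbl, (m, v) in stats.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else UNKNOWN

# Toy 2-D features: "happy" is a tight cluster, "surprise" a spread-out one.
train = {
    "happy": [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.0]],
    "surprise": [[5.0, 5.0], [7.0, 3.0], [3.0, 7.0], [5.0, 5.0]],
}
stats = fit_class_stats(train)

print(predict_open_set([1.05, 0.95], stats))  # near the tight cluster
print(predict_open_set([6.5, 3.5], stats))    # inside the spread-out cluster
print(predict_open_set([20.0, 20.0], stats))  # far from both clusters
```

Because the distance is scaled by each class's own variance, a tight class rejects samples at a small absolute distance while a spread-out class tolerates a larger one. A single Euclidean radius shared by all classes could not do both.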

Cited By

  • (2024) Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting. Proceedings of the 32nd ACM International Conference on Multimedia, 5722-5731. DOI: 10.1145/3664647.3681583. Online publication date: 28-Oct-2024

      Published In

      MM '23: Proceedings of the 31st ACM International Conference on Multimedia
      October 2023
      9913 pages
      ISBN: 9798400701085
      DOI: 10.1145/3581783
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. deep learning
      2. facial expression recognition
      3. open-set recognition

      Qualifiers

      • Research-article

      Conference

      MM '23
      MM '23: The 31st ACM International Conference on Multimedia
      October 29 - November 3, 2023
      Ottawa ON, Canada

      Acceptance Rates

      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 380
      • Downloads (last 6 weeks): 51
      Reflects downloads up to 10 Dec 2024
