Abstract
Facial action units (AUs) play an essential role in human facial behavior analysis. Despite progress in frame-level AU analysis, the discrete classification results provided by previous work are not explicit enough for the analysis required by many real-world applications. Moreover, because an AU is a dynamic process, sequence-level analysis that maintains a global view has so far been largely ignored in the literature. To fill this gap, we propose a multi-label AU proposal generation task for sequence-level facial action analysis. To tackle the task, we design AUPro, which takes a video clip as input and directly generates proposals for each AU category. Extensive experiments on two commonly used AU benchmark datasets, BP4D and DISFA, show the superiority of our proposed method.
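To make the difference between frame-level labels and sequence-level proposals concrete, the following minimal sketch groups per-frame multi-label AU annotations into temporal segments, one list per AU category, and scores the overlap of two segments with temporal IoU. It is an illustration of the task's input and output only, not the AUPro model. The helper names (`labels_to_segments`, `temporal_iou`), the half-open `(start, end)` frame convention, and the toy labels for AU6 and AU12 are assumptions introduced here; the abstract does not specify the evaluation protocol.

```python
from typing import Dict, List, Tuple

# Hypothetical convention: a segment is (start_frame, end_frame), end exclusive.
Segment = Tuple[int, int]


def labels_to_segments(
    frame_labels: List[List[int]], au_names: List[str]
) -> Dict[str, List[Segment]]:
    """Group consecutive active frames of each AU into (start, end) segments.

    frame_labels[t][k] is 1 if AU au_names[k] is active at frame t, else 0.
    """
    segments: Dict[str, List[Segment]] = {name: [] for name in au_names}
    for k, name in enumerate(au_names):
        start = None  # frame index where the current active run began
        for t, row in enumerate(frame_labels):
            active = row[k] == 1
            if active and start is None:
                start = t
            elif not active and start is not None:
                segments[name].append((start, t))
                start = None
        if start is not None:  # run still open at the end of the clip
            segments[name].append((start, len(frame_labels)))
    return segments


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two half-open temporal segments."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


# Toy clip of 8 frames with two AUs (illustrative values only).
toy_labels = [
    [0, 0], [1, 0], [1, 1], [1, 1],
    [0, 1], [0, 0], [1, 0], [1, 0],
]
toy_segments = labels_to_segments(toy_labels, ["AU6", "AU12"])
```

In this toy clip, AU6 yields two segments and AU12 yields one, which shows why the task is multi-label: segments of different AUs may overlap in time and are produced independently for each category.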
Y. Chen and J. Zhang: Equal Contribution.
Acknowledgements
This work is supported in part by the PKU-NTU Joint Research Institute (JRI), sponsored by a donation from the Ng Teng Fong Charitable Foundation.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Chen, Y., Zhang, J., Chen, D., Wang, T., Wang, Y., Liang, Y. (2021). AUPro: Multi-label Facial Action Unit Proposal Generation for Sequence-Level Analysis. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13110. Springer, Cham. https://doi.org/10.1007/978-3-030-92238-2_8
DOI: https://doi.org/10.1007/978-3-030-92238-2_8
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92237-5
Online ISBN: 978-3-030-92238-2
eBook Packages: Computer Science (R0)