Abstract
Active perception (or active vision) exploits the ability of robots to interact with their environment, for example by moving in space, in order to increase the quantity or quality of the information obtained through their sensors and thus improve their performance in various perception tasks. Active face recognition remains largely understudied in the recent literature. To address this gap, we propose an active approach that utilizes facial views produced by photorealistic facial image rendering. Essentially, the robot that performs the recognition selects the best among a number of candidate movements around the person of interest by simulating their results through view synthesis. To do so, the method feeds the robot’s face recognizer with a real-world facial image acquired at the current position, generates synthesized views that differ by \(\pm \theta ^\circ \) from the current view, and decides, based on the confidence of the recognizer, whether to stay in place or move to the position that corresponds to one of the two synthesized views in order to acquire a new real image with its sensor. Experimental results on three datasets verify the superior performance of the proposed method compared to the respective “static” approach, to approaches based on the same face recognizer that involve synthetic face frontalization and synthesized views, to random-direction robot movement, to robot movement towards a frontal location based on view angle estimation, and to a state-of-the-art active method. Results from a proof-of-concept simulation in a robotic simulator are also provided.
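The stay-or-move decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the exact decision rule, so the sketch assumes the robot simply picks the candidate view with the highest recognizer confidence and stays in place on ties. The names `choose_movement`, `synthesize_view`, `recognizer_confidence`, `toy_synthesize` and `toy_confidence` are hypothetical, and the toy helpers stand in for a real view synthesizer and face recognizer by representing an "image" as a yaw angle in degrees.

```python
from typing import Callable, Dict, TypeVar

Image = TypeVar("Image")


def choose_movement(
    real_image: Image,
    theta_deg: float,
    synthesize_view: Callable[[Image, float], Image],
    recognizer_confidence: Callable[[Image], float],
) -> float:
    """Return the signed rotation in degrees to perform; 0.0 means stay in place.

    Assumed rule (not taken from the paper): choose the candidate whose
    view yields the highest recognizer confidence.
    """
    # Confidence on the real image acquired at the current position.
    scores: Dict[float, float] = {0.0: recognizer_confidence(real_image)}
    # Confidence on the two synthesized views at +theta and -theta.
    for offset in (theta_deg, -theta_deg):
        scores[offset] = recognizer_confidence(synthesize_view(real_image, offset))
    # max() returns the first maximal key in insertion order,
    # so ties favor staying in place (0.0 was inserted first).
    return max(scores, key=lambda offset: scores[offset])


# Hypothetical stand-ins: an "image" is just its yaw angle in degrees.
def toy_synthesize(yaw_deg: float, offset_deg: float) -> float:
    """Pretend to render the face rotated by offset_deg."""
    return yaw_deg + offset_deg


def toy_confidence(yaw_deg: float) -> float:
    """Pretend confidence that decays linearly away from the frontal view."""
    return 1.0 - abs(yaw_deg) / 90.0


if __name__ == "__main__":
    # Face seen at 40 degrees yaw, candidate moves of +/- 20 degrees.
    print(choose_movement(40.0, 20.0, toy_synthesize, toy_confidence))
```

In a real system the selected offset would be translated into a robot movement around the person, after which a new real image is acquired and the procedure can be repeated.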
Acknowledgements
The research leading to these results has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 871449 (OpenDR). This publication reflects only the authors’ views. The European Union is not liable for any use that may be made of the information contained therein.
About this article
Cite this article
Kakaletsis, E., Nikolaidis, N. Using synthesized facial views for active face recognition. Machine Vision and Applications 34, 62 (2023). https://doi.org/10.1007/s00138-023-01412-3
DOI: https://doi.org/10.1007/s00138-023-01412-3