
3D Face Reconstruction with Dense Landmarks

  • Conference paper
  • First Online:
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Landmarks often play a key role in face analysis, but many aspects of identity or expression cannot be represented by sparse landmarks alone. Thus, in order to reconstruct faces more accurately, landmarks are often combined with additional signals like depth images or techniques like differentiable rendering. Can we keep things simple by just using more landmarks? In answer, we present the first method that accurately predicts 10× as many landmarks as usual, covering the whole head, including the eyes and teeth. This is accomplished using synthetic training data, which guarantees perfect landmark annotations. By fitting a morphable model to these dense landmarks, we achieve state-of-the-art results for monocular 3D face reconstruction in the wild. We show that dense landmarks are an ideal signal for integrating face shape information across frames by demonstrating accurate and expressive facial performance capture in both monocular and multi-view scenarios. Finally, our method is highly efficient: we can predict dense landmarks and fit our 3D face model at over 150 FPS on a single CPU thread. Please see our website: https://microsoft.github.io/DenseLandmarks/.
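
To make the model-fitting step concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of how a linear 3D morphable model could be fit to predicted dense 2D landmarks using an uncertainty-weighted reprojection loss in PyTorch. The landmark count, basis sizes, weak-perspective camera, regularisation weights, and optimiser are all illustrative assumptions.

```python
# Minimal sketch (illustrative only): fitting a linear 3D morphable model
# to dense 2D landmarks by minimising a sigma-weighted reprojection loss.
# Dimensions, variable names, and the camera model are assumptions.
import torch

N_LANDMARKS = 703        # assumed dense landmark count (~10x a sparse 68-point set)
N_ID, N_EXPR = 256, 100  # assumed identity / expression basis sizes

# Placeholder morphable-model tensors; a real model would be loaded from disk.
mean_verts = torch.zeros(N_LANDMARKS, 3)
id_basis   = torch.randn(N_ID,   N_LANDMARKS, 3) * 0.01
expr_basis = torch.randn(N_EXPR, N_LANDMARKS, 3) * 0.01

def reconstruct(beta, psi):
    """Linear 3DMM: mean shape + identity offsets + expression offsets."""
    return (mean_verts
            + torch.einsum('i,ivc->vc', beta, id_basis)
            + torch.einsum('e,evc->vc', psi, expr_basis))

def project(verts, scale, trans):
    """Weak-perspective projection onto the image plane (an assumption)."""
    return scale * verts[:, :2] + trans

def fit(landmarks_2d, sigmas, steps=200):
    """Optimise model and camera parameters against predicted dense landmarks.

    landmarks_2d: (N_LANDMARKS, 2) predicted 2D positions
    sigmas:       (N_LANDMARKS,)   predicted per-landmark uncertainties
    """
    beta  = torch.zeros(N_ID,   requires_grad=True)
    psi   = torch.zeros(N_EXPR, requires_grad=True)
    scale = torch.ones(1,  requires_grad=True)
    trans = torch.zeros(2, requires_grad=True)
    opt = torch.optim.Adam([beta, psi, scale, trans], lr=0.05)

    for _ in range(steps):
        opt.zero_grad()
        pred_2d = project(reconstruct(beta, psi), scale, trans)
        # Down-weight uncertain landmarks; lightly regularise the latent codes.
        data_term = ((pred_2d - landmarks_2d) ** 2).sum(-1) / (2 * sigmas ** 2)
        reg_term  = 1e-3 * (beta ** 2).sum() + 1e-3 * (psi ** 2).sum()
        loss = data_term.mean() + reg_term
        loss.backward()
        opt.step()
    return beta.detach(), psi.detach(), scale.detach(), trans.detach()
```

In practice, a fitter of this kind would be run per frame, or jointly over multiple frames and cameras, with the landmark predictor supplying both positions and per-landmark uncertainties.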



Acknowledgements

Thanks to Chirag Raman and Jamie Shotton for their contributions, and Jiaolong Yang and Timo Bolkart for help with evaluation.

Author information


Corresponding author

Correspondence to Erroll Wood.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 18084 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wood, E. et al. (2022). 3D Face Reconstruction with Dense Landmarks. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13673. Springer, Cham. https://doi.org/10.1007/978-3-031-19778-9_10


  • DOI: https://doi.org/10.1007/978-3-031-19778-9_10

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19777-2

  • Online ISBN: 978-3-031-19778-9

  • eBook Packages: Computer Science, Computer Science (R0)
