
Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360

Published: 30 October 2024

Abstract

Creating a 360° parametric model of a human head is a very challenging task. While recent advancements have demonstrated the efficacy of leveraging synthetic data for building such parametric head models, their performance remains inadequate in crucial areas such as expression-driven animation, hairstyle editing, and text-based modifications. In this paper, we build a dataset of artist-designed high-fidelity human heads and propose to create a novel 360° renderable parametric head model from it. Our scheme decouples facial motion/shape from facial appearance, representing them with a classic parametric 3D mesh model and an attached neural texture, respectively. We further propose a training method for decomposing hairstyle and facial appearance, allowing free swapping of the hairstyle. A novel inversion fitting method is presented that operates on a single input image with high generalization and fidelity. To the best of our knowledge, our model is the first parametric 3D full-head model that achieves 360° free-view synthesis, image-based fitting, appearance editing, and animation within a single model. Experiments show that facial motions and appearances are well disentangled in the parametric space, leading to state-of-the-art rendering and animation quality. The code and SynHead100 dataset are released at https://nju-3dv.github.io/projects/Head360.
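The decoupling described above — a classic parametric mesh for motion/shape, a separate neural texture for appearance, and a hairstyle component that can be swapped independently — can be illustrated with a minimal sketch. This is not the paper's implementation: all class names, dimensions, and the linear blendshape/texture mappings are illustrative assumptions, standing in for the learned components of the actual model.

```python
import numpy as np

class ParametricHead:
    """Toy sketch of a decoupled head representation: shape/expression
    parameters deform a template mesh, while separate face and hairstyle
    codes drive a neural-texture-style feature map. All sizes are
    illustrative, not those of the actual Head360 model."""

    def __init__(self, n_verts=100, n_shape=10, n_expr=5,
                 tex_res=16, tex_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.template = rng.normal(size=(n_verts, 3))              # mean mesh
        self.shape_basis = rng.normal(size=(n_shape, n_verts, 3)) * 0.01
        self.expr_basis = rng.normal(size=(n_expr, n_verts, 3)) * 0.01
        # Appearance branch: texture features produced from two codes,
        # so hairstyle can be swapped without touching geometry or identity.
        self.W_face = rng.normal(size=(8, tex_res * tex_res * tex_dim)) * 0.1
        self.W_hair = rng.normal(size=(4, tex_res * tex_res * tex_dim)) * 0.1
        self.tex_shape = (tex_res, tex_res, tex_dim)

    def geometry(self, shape, expr):
        """Blendshape-style mesh: template plus shape/expression offsets."""
        return (self.template
                + np.tensordot(shape, self.shape_basis, axes=1)
                + np.tensordot(expr, self.expr_basis, axes=1))

    def neural_texture(self, face_code, hair_code):
        """Appearance is independent of geometry; swapping hair_code
        swaps the hairstyle while the face appearance stays fixed."""
        flat = face_code @ self.W_face + hair_code @ self.W_hair
        return flat.reshape(self.tex_shape)

head = ParametricHead()
verts = head.geometry(shape=np.zeros(10), expr=np.zeros(5))
tex_a = head.neural_texture(np.ones(8), np.zeros(4))
tex_b = head.neural_texture(np.ones(8), np.ones(4))  # same face, new hair
```

In this sketch, animation corresponds to varying `expr` while leaving the appearance codes untouched, and hairstyle editing corresponds to replacing `hair_code` while `face_code` and the mesh parameters stay fixed — the disentanglement the abstract describes, reduced to two independent linear maps.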


Published In

Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LVI
Sep 2024, 582 pages
ISBN: 978-3-031-72991-1
DOI: 10.1007/978-3-031-72992-8
Editors: Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol

Publisher

Springer-Verlag, Berlin, Heidelberg
