research-article

Approach to 3D face reconstruction through local deep feature alignment

Published: 03 January 2019

Abstract

The authors propose an end‐to‐end deep learning method for reconstructing three‐dimensional (3D) face models from face images. In the training stage, the proposed local deep feature alignment (LDFA) algorithm extracts feature representations from the 3D sample faces and their corresponding 2D sample images, and an explicit mapping from the 2D features to their 3D counterparts is estimated for each local neighbourhood. A feed‐forward deep neural network is then learned for each neighbourhood, with its parameters initialised from those obtained in the locality‐aware learning process and from the explicit mapping. In the testing stage, a given face image is fed to the deep neural network corresponding to its nearest sample image, and that network outputs the 3D face model. Extensive experiments were conducted on both non‐face and face data sets. The LDFA algorithm performs better than several popular unsupervised feature extraction algorithms, and the 3D reconstruction results of the proposed method also outperform those of the comparison methods.
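The per-neighbourhood mapping and the nearest-sample routing at test time can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the 2D and 3D feature vectors are already available as NumPy arrays, and it swaps in a ridge-regularised linear map for each neighbourhood in place of the LDFA features and the feed-forward deep networks described above. The function names `fit_local_maps` and `predict_3d`, the neighbourhood size `k`, and the `ridge` parameter are all hypothetical.

```python
import numpy as np


def fit_local_maps(feats_2d, feats_3d, k=5, ridge=1e-6):
    """Fit one linear 2D-to-3D feature map per training sample's neighbourhood.

    feats_2d: (n, d2) array of 2D-image features (assumed given).
    feats_3d: (n, d3) array of matching 3D-face features (assumed given).
    Returns a list of (W, b) pairs, one per training sample.
    """
    n, d2 = feats_2d.shape
    maps = []
    for i in range(n):
        # Neighbourhood = the k nearest training samples in 2D feature space
        # (this includes sample i itself, at distance zero).
        dists = np.linalg.norm(feats_2d - feats_2d[i], axis=1)
        idx = np.argsort(dists)[:k]
        X = np.hstack([feats_2d[idx], np.ones((len(idx), 1))])  # bias column
        Y = feats_3d[idx]
        # Closed-form ridge regression: solve (X^T X + ridge * I) Wb = X^T Y.
        A = X.T @ X + ridge * np.eye(d2 + 1)
        Wb = np.linalg.solve(A, X.T @ Y)
        maps.append((Wb[:-1], Wb[-1]))
    return maps


def predict_3d(query_2d, feats_2d, maps):
    """Route the query to the local map of its nearest training sample."""
    nearest = int(np.argmin(np.linalg.norm(feats_2d - query_2d, axis=1)))
    W, b = maps[nearest]
    return query_2d @ W + b, nearest


if __name__ == "__main__":
    # Synthetic stand-in data; real features would come from face samples.
    rng = np.random.default_rng(0)
    feats_2d = rng.normal(size=(30, 3))
    feats_3d = feats_2d @ rng.normal(size=(3, 4)) + rng.normal(size=4)
    local_maps = fit_local_maps(feats_2d, feats_3d)
    pred, nearest = predict_3d(feats_2d[0] + 0.01, feats_2d, local_maps)
    print("routed to sample", nearest, "predicted 3D feature shape", pred.shape)
```

In the method described in the abstract, each linear map would instead serve, together with the locality-aware parameters, to initialise a neural network that is then trained per neighbourhood; the sketch keeps only the routing structure.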

