FaceMD: convolutional neural network-based spatiotemporal fusion facial manipulation detection

Mohammed Aloraini (ORCID: orcid.org/0000-0002-1655-8098)


Abstract

Digital videos have become essential to broadcast news that reaches audiences around the world, so it is important to ensure the reliability of these broadcast videos. Unfortunately, digital videos can be manipulated by replacing a person’s face or expressions with those of another person without leaving visible traces. Detecting this facial manipulation is challenging because few digital forensic techniques exist for verifying the originality of video content. In this paper, we propose a novel approach, dubbed FaceMD, that fuses three streams of convolutional neural networks to detect facial manipulation. FaceMD incorporates spatiotemporal information by fusing video frames, motion residuals, and 3D gradients to improve facial manipulation detection accuracy. We combine these three streams using different fusion methods and fusion locations to make the best use of this spatiotemporal information and thereby increase detection performance. The experimental results show that FaceMD achieves state-of-the-art accuracy on two different facial manipulation data sets.
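As a rough illustration of the three inputs named in the abstract, the sketch below derives motion residuals and 3D gradients from a grayscale clip and combines per-stream scores by late fusion. The abstract does not define these quantities, so everything here is an assumption: residuals are taken as absolute differences between consecutive frames, 3D gradients as the magnitude of central differences along time, height, and width, and fusion as a weighted average of scores. The function names `motion_residuals`, `gradient_magnitude_3d`, and `fuse_scores` are hypothetical and do not come from the paper.

```python
import numpy as np


def motion_residuals(frames: np.ndarray) -> np.ndarray:
    """Assumed definition: absolute difference between consecutive frames.

    frames has shape (T, H, W); the first residual is left at zero.
    """
    frames = frames.astype(np.float32)
    residuals = np.zeros_like(frames)
    residuals[1:] = np.abs(frames[1:] - frames[:-1])
    return residuals


def gradient_magnitude_3d(frames: np.ndarray) -> np.ndarray:
    """Assumed definition: magnitude of the (t, y, x) gradient of the clip."""
    frames = frames.astype(np.float32)
    # np.gradient returns one array per axis, using central differences
    # in the interior and one-sided differences at the boundaries.
    grad_t, grad_y, grad_x = np.gradient(frames)
    return np.sqrt(grad_t ** 2 + grad_y ** 2 + grad_x ** 2)


def fuse_scores(scores, weights=None) -> float:
    """Late fusion (one possible choice): weighted average of stream scores."""
    scores = np.asarray(scores, dtype=np.float64)
    if weights is None:
        weights = np.full(len(scores), 1.0 / len(scores))
    return float(np.dot(weights, scores))


# Toy clip: 4 frames of 8x8 pixels that switch from 0 to 1 at frame 2.
clip = np.zeros((4, 8, 8), dtype=np.float32)
clip[2:] = 1.0

residual_stream = motion_residuals(clip)
gradient_stream = gradient_magnitude_3d(clip)

# Placeholder per-stream "manipulated" probabilities; in a real system
# each would come from a CNN applied to one of the three inputs.
fused = fuse_scores([0.9, 0.6, 0.3])
```

In this toy clip the residual is nonzero only at the frame where the content changes, while the gradient magnitude spreads over the two frames around the change. Averaging scores is only one option; the abstract states that several fusion methods and fusion locations were compared, which this sketch does not attempt to reproduce.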



Acknowledgements

The researcher would like to thank the Deanship of Scientific Research, Qassim University, for funding the publication of this project.

Author information

Authors and Affiliations

Department of Electrical Engineering, College of Engineering, Qassim University, Unaizah, 56452, Saudi Arabia
Mohammed Aloraini


Corresponding author

Correspondence to Mohammed Aloraini.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 846 KB)


About this article

Cite this article

Aloraini, M. FaceMD: convolutional neural network-based spatiotemporal fusion facial manipulation detection. SIViP 17, 247–255 (2023). https://doi.org/10.1007/s11760-022-02227-x


Received: 18 December 2021
Revised: 18 February 2022
Accepted: 28 March 2022
Published: 21 April 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s11760-022-02227-x
