DOI: 10.1145/3691573.3691617
short-paper

Enhancing Robustness in Audio Deepfake Detection for VR Applications using data augmentation and Mixup

Published: 30 September 2024

Abstract

The rapid advancement of virtual reality (VR) technology has heightened the need for robust and reliable deepfake audio detection to ensure the authenticity and integrity of virtual interactions. Although current state-of-the-art models show promising results, they are often overconfident, which can lead to poor generalization and reduced effectiveness against novel or slightly altered deepfake attacks. In this work, we investigate data augmentation and Mixup techniques as ways to increase the diversity of training data and improve the generalization of deepfake audio detection models. Mixup creates new training examples by combining pairs of existing examples, promoting smoother and more robust decision boundaries, while data augmentation creates new training examples by altering a sample with a given probability. Our results demonstrate that applying these techniques to the Wav2vec 2.0 model significantly improves its generalization ability, leading to more reliable deepfake detection in VR environments.
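
The two ideas described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state how the mixing coefficient is drawn or which augmentations are used, so the Beta(alpha, alpha) draw (the usual choice in the Mixup literature), the additive Gaussian noise, and the names `mixup_batch`, `maybe_augment`, `alpha`, `p`, and `noise_std` are all assumptions made for illustration.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Blend each example with a randomly chosen partner from the same batch."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)        # assumed: one mixing coefficient per batch
    perm = rng.permutation(len(x))      # partner index for every example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix, lam

def maybe_augment(wave, p=0.5, noise_std=0.005, rng=None):
    """With probability p, return a noisy copy of the waveform; otherwise return it unchanged."""
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() < p:
        # assumed augmentation: additive Gaussian noise
        return wave + rng.normal(0.0, noise_std, size=wave.shape)
    return wave

rng = np.random.default_rng(0)
waves = rng.standard_normal((4, 16000))   # 4 hypothetical one-second clips at 16 kHz
labels = np.eye(2)[[0, 1, 1, 0]]          # one-hot: column 0 = bona fide, column 1 = spoof
waves = np.stack([maybe_augment(w, p=0.5, rng=rng) for w in waves])
x_mix, y_mix, lam = mixup_batch(waves, labels, alpha=0.2, rng=rng)
```

Because the labels are mixed with the same coefficient as the inputs, each row of `y_mix` is a soft target that still sums to one. Training on such soft targets is the mechanism usually credited with reducing overconfidence.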

Published In

SVR '24: Proceedings of the 26th Symposium on Virtual and Augmented Reality
September 2024
346 pages
ISBN: 9798400709791
DOI: 10.1145/3691573
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Audio Classification
  2. Deepfake Detection
  3. Feature Abstraction
  4. Machine Learning
  5. Mixup

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Funding Sources

  • Embrapii

Conference

SVR 2024
SVR 2024: Symposium on Virtual and Augmented Reality
September 30 - October 3, 2024
Manaus, Brazil
