Research Article | Open Access
DOI: 10.1145/3689932.3694765

Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training

Published: 22 November 2024

Abstract

Deep learning models continue to improve in accuracy, yet they remain vulnerable to adversarial attacks, which often cause adversarial examples to be misclassified. Adversarial training mitigates this problem by improving a model's robust accuracy on adversarial examples, but it typically degrades the model's standard accuracy on clean samples. Secure deep learning models must balance robustness and accuracy, yet achieving this balance remains challenging and its underlying causes are not fully understood. This paper proposes Adversarial Feature Alignment (AFA), a pre-training method that addresses these problems by locating the trade-off in the model's feature space and fine-tuning the model to attain accuracy on standard and adversarial examples simultaneously. Our analysis reveals an intriguing insight: misalignment within the feature space often leads to misclassification, whether a sample is benign or adversarial. AFA mitigates this risk with a novel optimization algorithm, based on contrastive learning, that alleviates potential feature misalignment. Our evaluations demonstrate AFA's superior performance: it delivers state-of-the-art robust accuracy while limiting the drop in clean accuracy to 1.86% on CIFAR-10 and 8.91% on CIFAR-100, relative to cross-entropy training. We further show that jointly optimizing AFA and TRADES, combined with data augmentation using a recent diffusion model, achieves state-of-the-art accuracy and robustness. Through AFA, we expect to enhance the security of adversarially trained deep learning models while preserving their accuracy.
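To make the abstract's approach concrete, here is a minimal PyTorch sketch of the kind of objective it describes: a supervised-contrastive loss that pulls clean and adversarial features of the same class together and pushes other classes apart, optionally combined with a TRADES-style KL term for joint optimization. This is a sketch under stated assumptions, not the authors' implementation: the function names, the temperature `tau`, the weights `lam` and `beta`, and a model whose forward pass returns both logits and penultimate-layer features are all illustrative.

```python
# Illustrative sketch only -- not the authors' released code. Assumes a
# model whose forward pass returns (logits, penultimate_features).
import torch
import torch.nn.functional as F

def afa_alignment_loss(feats_clean, feats_adv, labels, tau=0.1):
    """SupCon-style loss over the joint clean+adversarial batch: features
    sharing a label are positives, all other features are negatives."""
    z = F.normalize(torch.cat([feats_clean, feats_adv], dim=0), dim=1)
    y = torch.cat([labels, labels], dim=0)
    sim = z @ z.t() / tau                                # cosine similarities
    self_mask = torch.eye(len(y), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))      # exclude self-pairs
    pos_mask = (y.unsqueeze(0) == y.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Mean log-likelihood of each anchor's positives, negated; every anchor
    # has at least one positive (its clean/adversarial twin).
    mean_pos = log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_mask.sum(1)
    return -mean_pos.mean()

def joint_loss(model, x, x_adv, y, lam=1.0, beta=6.0):
    """Assumed joint objective: clean cross-entropy + TRADES-style KL
    between clean and adversarial predictions + feature alignment."""
    logits_clean, feats_clean = model(x)
    logits_adv, feats_adv = model(x_adv)
    ce = F.cross_entropy(logits_clean, y)
    kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                  F.softmax(logits_clean, dim=1), reduction='batchmean')
    return ce + beta * kl + lam * afa_alignment_loss(feats_clean, feats_adv, y)
```

In practice, the adversarial inputs `x_adv` would be regenerated each batch (for example, with PGD against the current model), and the alignment term alone could serve as the pre-training objective before fine-tuning, in the spirit of the method described above.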




    Published In

    AISec '24: Proceedings of the 2024 Workshop on Artificial Intelligence and Security
    November 2024
    225 pages
    ISBN:9798400712289
    DOI:10.1145/3689932
This work is licensed under a Creative Commons Attribution 4.0 International License.


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 22 November 2024


    Author Tags

    1. adversarial attack
    2. adversarial robustness
    3. adversarial training
    4. deep learning
    5. robustness-accuracy tradeoff

    Qualifiers

    • Research-article

    Funding Sources

    • Institute of Information & Communications Technology Planning & Evaluation (IITP)

    Conference

    CCS '24

    Acceptance Rates

    Overall acceptance rate: 94 of 231 submissions (41%)

