Abstract
Fast adversarial training (FAT) effectively improves the efficiency of standard adversarial training (SAT). However, early FAT methods encounter catastrophic overfitting, i.e., the robust accuracy against adversarial attacks suddenly and dramatically drops. Although several FAT variants attempt to prevent this overfitting, they do so at considerable additional computational cost. In this paper, we examine the difference between the training processes of SAT and FAT and observe that, for FAT, the attack success rate of the adversarial examples (AEs) gradually deteriorates in the late training stage, which leads to overfitting. These AEs are generated by the fast gradient sign method (FGSM) with a zero or random initialization. Based on this observation, and after investigating several initialization strategies, we propose a prior-guided FGSM initialization that avoids overfitting by improving the quality of the AEs throughout training. The initialization is formed by reusing historically generated AEs and incurs no additional computational cost. We further provide a theoretical analysis of the proposed initialization. Building on it, we also propose a simple yet effective regularizer: the currently generated perturbation should not deviate too much from the prior-guided initialization. The regularizer thus uses both historical and current adversarial perturbations to guide model learning. Evaluations on four datasets demonstrate that the proposed method prevents catastrophic overfitting and outperforms state-of-the-art FAT methods. The code is released at .
X. Jia: Work done during an internship at Tencent AI Lab.
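To make the training step concrete, below is a minimal PyTorch sketch of the idea described in the abstract, assuming a per-batch perturbation buffer is kept across epochs. The function name fgsm_pgi_step, the step sizes eps and alpha, the squared-difference form of the regularizer, and the weight lambda_reg are illustrative assumptions rather than the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def fgsm_pgi_step(model, x, y, delta_prev, eps=8/255, alpha=8/255, lambda_reg=10.0):
    """One FAT step with a prior-guided FGSM initialization (illustrative sketch).

    delta_prev holds the adversarial perturbation stored for this batch in a
    previous epoch; it replaces the usual zero/random FGSM initialization.
    """
    # 1. Initialize the perturbation from the historical prior.
    delta = delta_prev.clone().detach().requires_grad_(True)

    # 2. Single FGSM step starting from the prior-guided initialization.
    loss = F.cross_entropy(model(x + delta), y)
    grad = torch.autograd.grad(loss, delta)[0]
    delta_new = torch.clamp(delta + alpha * grad.sign(), -eps, eps)
    delta_new = (torch.clamp(x + delta_new, 0.0, 1.0) - x).detach()

    # 3. Train on the new AE; the regularizer keeps its prediction close to the
    #    prediction on the example built from the prior initialization.
    logits_new = model(x + delta_new)
    logits_prior = model(x + delta_prev.detach())
    reg = F.mse_loss(logits_new, logits_prior)  # squared-difference term (assumed form)
    total_loss = F.cross_entropy(logits_new, y) + lambda_reg * reg

    return total_loss, delta_new  # store delta_new as the prior for the next epoch
```

In use, the returned delta_new for each batch would overwrite the stored prior, so the initialization always reflects the most recently generated perturbation at no extra gradient cost.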
Acknowledgement
Supported by the National Key R&D Program of China under Grant 2018AAA0102503, the National Natural Science Foundation of China (Nos. U2001202, U1936208, 62006217), the Beijing Natural Science Foundation (No. M22006), the Shenzhen Science and Technology Program under Grant No. RCYX20210609103057050, and the Tencent AI Lab Rhino-Bird Focused Research Program under Grant No. JR202123.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jia, X. et al. (2022). Prior-Guided Adversarial Initialization for Fast Adversarial Training. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13664. Springer, Cham. https://doi.org/10.1007/978-3-031-19772-7_33
DOI: https://doi.org/10.1007/978-3-031-19772-7_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19771-0
Online ISBN: 978-3-031-19772-7