Computer Science > Machine Learning

arXiv:2302.10722 (cs)

[Submitted on 21 Feb 2023 (v1), last revised 6 Dec 2023 (this version, v2)]

Title: Characterizing the Optimal 0-1 Loss for Multi-class Classification with a Test-time Attacker

Authors: Sihui Dai, Wenxin Ding, Arjun Nitin Bhagoji, Daniel Cullina, Ben Y. Zhao, Haitao Zheng, Prateek Mittal

Abstract: Finding classifiers robust to adversarial examples is critical for their safe deployment. Determining the robustness of the best possible classifier under a given threat model for a given data distribution and comparing it to that achieved by state-of-the-art training methods is thus an important diagnostic tool. In this paper, we find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset. We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints. We further define other variants of the attacker-classifier game that determine the range of the optimal loss more efficiently than the full-fledged hypergraph construction. Our evaluation shows, for the first time, an analysis of the gap to optimal robustness for classifiers in the multi-class setting on benchmark datasets.
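
Illustrative sketch (not taken from the paper): the abstract describes building a conflict hypergraph from the data and the adversarial constraints. The Python below shows only the simplest piece of that idea, restricted to pairwise conflicts. It assumes an L2 attacker with budget eps and uniform weights over examples; the function names conflict_edges and matching_lower_bound are hypothetical. Two examples with different labels conflict when their eps-balls intersect, because the attacker can then move both to a common point, and no classifier, even a randomized one, can be correct on both. Any set of disjoint conflicting pairs therefore forces at least one expected error per pair, which gives a simple lower bound on the optimal 0-1 loss. The full construction in the paper also uses higher-degree hyperedges and is not reproduced here.

```python
from itertools import combinations
from math import dist


def conflict_edges(points, labels, eps):
    """Return index pairs (i, j) with different labels whose L2 eps-balls intersect.

    Assumption: the attacker may move each point by at most eps in L2 norm,
    so two balls intersect exactly when the centers are at most 2 * eps apart.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(points)), 2)
        if labels[i] != labels[j] and dist(points[i], points[j]) <= 2 * eps
    ]


def matching_lower_bound(n, edges):
    """Lower-bound the optimal 0-1 loss from a greedy matching of conflict edges.

    Each matched pair contributes at least one expected error, and matched
    pairs are disjoint, so the loss is at least (matching size) / n.
    """
    matched = set()
    size = 0
    for i, j in edges:
        if i not in matched and j not in matched:
            matched.update((i, j))
            size += 1
    return size / n


# Toy data: four 2-D points from three classes.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
labels = [0, 1, 2, 0]
edges = conflict_edges(points, labels, eps=0.6)
bound = matching_lower_bound(len(points), edges)
```

The greedy matching is chosen for brevity; a maximum matching or a linear program over the conflict structure would give a bound at least as tight, at higher computational cost.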

Comments:	NeurIPS 2023 Spotlight
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Cite as:	arXiv:2302.10722 [cs.LG]
	(or arXiv:2302.10722v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2302.10722

Submission history

From: Sihui Dai
[v1] Tue, 21 Feb 2023 15:17:13 UTC (738 KB)
[v2] Wed, 6 Dec 2023 19:33:31 UTC (531 KB)
