[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1014052.1014066acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
Article

Adversarial classification

Published: 22 August 2004 Publication History

Abstract

Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detection, intrusion detection, fraud detection, surveillance and counter-terrorism, this is far from the case: the data is actively manipulated by an adversary seeking to make the classifier produce false negatives. In these domains, the performance of a classifier can degrade rapidly after it is deployed, as the adversary learns to defeat it. Currently the only solution to this is repeated, manual, ad hoc reconstruction of the classifier. In this paper we develop a formal framework and algorithms for this problem. We view classification as a game between the classifier and the adversary, and produce a classifier that is optimal given the adversary's optimal strategy. Experiments in a spam detection domain show that this approach can greatly outperform a classifier learned in the standard way, and (within the parameters of the problem) automatically adapt the classifier to the adversary's evolving manipulations.

References

[1]
P. Domingos. MetaCost: A general method for making classifiers cost-sensitive. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 155--164, San Diego, CA, 1999. ACM Press.
[2]
P. Domingos and M. Pazzani. On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29:103--130, 1997.
[3]
R. B. Doorenbos, O. Etzioni, and D. S. Weld. A scalable comparison-shopping agent for the World-Wide Web. In Proceedings of the First International Conference on Autonomous Agents, pages 39--48, Marina del Rey, CA, 1997. ACM Press.
[4]
T. Fawcett. "In vivo" spam filtering: A challenge problem for KDD. SIGKDD Explorations, 5(2):140--148, 2003.
[5]
T. Fawcett and F. Provost. Adaptive fraud detection. Data Mining and Knowledge Discovery, 1(3):291--316, 1997.
[6]
D. Fudenberg and D. Levine. The Theory of Learning in Games. MIT Press, Cambridge, MA, 1999.
[7]
D. Fudenberg and J. Tirole. Game Theory. MIT Press, Cambridge, MA, 1991.
[8]
M. R. Garey and D. S. Johnson. Computers and Intractability. Freeman, New York, NY, 1979.
[9]
L. Guernsey. Retailers rise in Google rankings as rivals cry foul. New York Times, November 20, 2003.
[10]
G. Hulten, L. Spencer, and P. Domingos. Mining time-changing data streams. In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 97--106, San Francisco, CA, 2001. ACM Press.
[11]
D. Jensen, M. Rattigan, and H. Blau. Information awareness: A prospective technical assessment. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 378--387, Washington, DC, 2003. ACM Press.
[12]
M. Kearns. Computational game theory. Tutorial, Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA, 2002. http://www.cis.upenn.edu/~mkearns/nips02tutorial/.
[13]
R. Kohavi and G. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1-2):273--324, 1997.
[14]
B. Krebs. Online piracy spurs high-tech arms race. Washington Post, June 26, 2003.
[15]
M. L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the Eleventh International Conference on Machine Learning, pages 157--163, New Brunswick, NJ, 1994. Morgan Kaufmann.
[16]
B. Lloyd. Been gazumped by Google? Trying to make sense of the "Florida" update. Search Engine Guide, November 25, 2003.
[17]
M. V. Mahoney and P. K. Chan. Learning nonstationary models of normal network traffic for detecting novel attacks. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 376--385, Edmonton, Canada, 2002. ACM Press.
[18]
A. McCallum and K. Nigam. A comparison of event models for Naive Bayes text classification. In Proceedings of the AAAI-98 Workshop on Learning for Text Categorization, Madison, WI, 1998. AAAI Press.
[19]
F. Nielsen. Email data, 2003. http://www.imm.dtu.dk/~rem/datasets/.
[20]
F. Provost and T. Fawcett. Robust classification for imprecise environments. Machine Learning, 42:203--231, 2001.
[21]
J. Rennie. Ifile spam classifier, 2003. http://www.nongnu.org/ifile/.
[22]
P. Robertson and J. M. Brady. Adaptive image analysis for aerial surveillance. IEEE Intelligent Systems, 14(3):30--36, 1999.
[23]
M. Sahami, S. Dumais, D. Heckerman, and E. Horvitz. A Bayesian approach to filtering junk e-mail. In Proceedings of the AAAI-98 Workshop on Learning for Text Categorization, Madison, WI, 1998. AAAI Press.
[24]
G. Sakkis, I. Androutsopoulos, G. Paliouras, V. Karkaletsis, C.D. Spyropoulos, and P. Stamatopoulos. A memory-based approach to anti-spam filtering for mailing lists. In Information Retrieval, volume 6, pages 49--73. Kluwer, 2003.
[25]
T. Senator. Ongoing management and application of discovered knowledge in a large regulatory organization: A case study of the use and impact of NASD regulation's advanced detection system (ADS). In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 44--53, Boston, MA, 2000. ACM Press.
[26]
J. M. Smith. Evolution and the Theory of Games. Cambridge University Press, Cambridge, UK, 1982.
[27]
P. Turney. Cost-sensitive learning bibliography. Online bibliography, NRC Institute for Information Technology, Ottawa, Canada, 1998. http://members.rogers.com/peter.turney/bibliographies/cost-sensitive.html.
[28]
WordNet 2.0: A lexical database for the English language, 2003. http://www.cogsci.princeton.edu/~wn/.

Cited By

View all
  • (2025)Game-Theoretic Neyman-Pearson Detection to Combat Strategic EvasionIEEE Transactions on Information Forensics and Security10.1109/TIFS.2024.351583420(516-530)Online publication date: 2025
  • (2025)Adversarial robust image processing in medical digital twinInformation Fusion10.1016/j.inffus.2024.102728115(102728)Online publication date: Mar-2025
  • (2025)Difference of Convex programming in adversarial SVMJournal of Computational and Applied Mathematics10.1016/j.cam.2024.116201454(116201)Online publication date: Jan-2025
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
August 2004
874 pages
ISBN:1581138881
DOI:10.1145/1014052
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 August 2004

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. cost-sensitive learning
  2. game theory
  3. integer linear programming
  4. naive Bayes
  5. spam detection

Qualifiers

  • Article

Conference

KDD04

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Upcoming Conference

KDD '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)207
  • Downloads (Last 6 weeks)29
Reflects downloads up to 01 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Game-Theoretic Neyman-Pearson Detection to Combat Strategic EvasionIEEE Transactions on Information Forensics and Security10.1109/TIFS.2024.351583420(516-530)Online publication date: 2025
  • (2025)Adversarial robust image processing in medical digital twinInformation Fusion10.1016/j.inffus.2024.102728115(102728)Online publication date: Mar-2025
  • (2025)Difference of Convex programming in adversarial SVMJournal of Computational and Applied Mathematics10.1016/j.cam.2024.116201454(116201)Online publication date: Jan-2025
  • (2024)A Survey of Machine Learning and Cryptography AlgorithmsInnovative Machine Learning Applications for Cryptography10.4018/979-8-3693-1642-9.ch006(105-118)Online publication date: 12-Apr-2024
  • (2024)Adversarial Examples on XAI-Enabled DT for Smart Healthcare SystemsSensors10.3390/s2421689124:21(6891)Online publication date: 27-Oct-2024
  • (2024)Protecting Classifiers from AttacksStatistical Science10.1214/24-STS92239:3Online publication date: 1-Aug-2024
  • (2024)From Transparency to Accountability and Back: A Discussion of Access and Evidence in AI AuditingProceedings of the 4th ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization10.1145/3689904.3694711(1-14)Online publication date: 29-Oct-2024
  • (2024)Resilient Machine Learning: Advancement, Barriers, and Opportunities in the Nuclear IndustryACM Computing Surveys10.1145/364860856:9(1-29)Online publication date: 24-Apr-2024
  • (2024)Abstraction and Refinement: Towards Scalable and Exact Verification of Neural NetworksACM Transactions on Software Engineering and Methodology10.1145/364438733:5(1-35)Online publication date: 3-Jun-2024
  • (2024)RAAD: Reinforced Adversarial Anomaly DetectorProceedings of the 39th ACM/SIGAPP Symposium on Applied Computing10.1145/3605098.3635920(883-891)Online publication date: 8-Apr-2024
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media