research-article

VeriBin: A Malware Authorship Verification Approach for APT Tracking through Explainable and Functionality-Debiasing Adversarial Representation Learning

Authors:

Sarah LabrosseAuthors Info & Claims

ACM Transactions on Privacy and Security, Volume 27, Issue 3

Article No.: 26, Pages 1 - 37

https://doi.org/10.1145/3669901

Published: 16 August 2024 Publication History

Get Access

Abstract

Malware attacks are posing a significant threat to national security, cooperate network, and public endpoint security. Identifying the Advanced Persistent Threat (APT) groups behind the attacks and grouping their activities into attack campaigns help security investigators trace their activities thus providing better security protections against future attacks. Existing Cyber Threat Intelligent (CTI) components mainly focus on malware family identification and behavior characterization, which cannot solve the APT tracking problem: while APT tracking needs one to link malware binaries of multiple families to a single threat actor, these behavior or function-based techniques are tightened up to a specific attack technique and would fail on connecting different families. Binary Authorship Attribution (AA) solutions could discriminate against threat actors based on their stylometric traits. However, AA solutions assume that the author of a binary is within a fixed candidate author set. However, real-world malware binaries may be created by a new unknown threat actor.

To address this research gap, we propose VeriBin for the Binary Authorship Verification (BAV) problem. VeriBin is a novel adversarial neural network that extracts functionality-agnostic style representations from assembly code for the AV task. The extracted style representations can be visualized and are explainable with VeriBin’s multi-head attention mechanism. We benchmark VeriBin with state-of-the-art coding style representations on a standard dataset and a recent malware-APT dataset. Given two anonymous binaries of out-of-sample authors, VeriBin can accurately determine whether they belong to the same author or not. VeriBin is resilient to compiler optimizations and robust against malware family variants.

References

[1]

Mohammed Abuhamad, Tamer AbuHmed, Aziz Mohaisen, and DaeHun Nyang. 2018. Large-scale and language-oblivious code authorship identification. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, CCS 2018. 101–114.

Abstract

References

Index Terms

Recommendations

A Malware Detection Framework Based on Forensic and Unsupervised Machine Learning Methodologies

Adversarial Attack on Microarchitectural Events based Malware Detectors

APT Detector: Detect and Identify APT Malware Based on Deep Learning Framework

Comments

Information

Published In

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Login options

Full Access

View options

PDF

eReader

Full Text

Share

Share this Publication link

Share on social media

Affiliations