A Comparison of Neural Network Architectures for Malware Classification Based on Noriben Operation Sequences

Rajchada Chanajitt¹²,
Bernhard Pfahringer¹² &
Heitor Murilo Gomes¹²

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13529)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Behavior-based machine learning plays a vital role in malware classification, as it can potentially overcome the limitations of signature-based methods. This paper explores the use of dynamic call sequences extracted by the open-source Noriben tool, which performs dynamic analysis in a virtualized environment. Call sequences of up to 5000 operations are generated for a total of 2000 benign and malware samples. Seven malware families are recognized: ransomware, trojan, backdoor, rootkit, virus, miner, and other. An empirical comparison analyzes five different classifiers: fully connected neural networks, GRU and LSTM, Transformer, and two combination approaches. The best-performing approach overall is a concatenation of a GRU with a Transformer architecture, which achieves accuracy and F1-score values of up to 97%.
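To make the input representation concrete, the following is a minimal sketch of how operation sequences could be turned into fixed-length integer sequences before being fed to a neural network. It is not taken from the paper: the operation names, the reserved ids `PAD_ID` and `UNK_ID`, the right-padding choice, and the helper names `build_vocab` and `encode` are all assumptions made for illustration. Only the maximum length of 5000 comes from the abstract.

```python
from typing import Dict, List

PAD_ID = 0      # assumed id reserved for padding
UNK_ID = 1      # assumed id for operations missing from the vocabulary
MAX_LEN = 5000  # maximum sequence length stated in the abstract


def build_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Map each distinct operation name to an integer id (2, 3, ...)."""
    vocab: Dict[str, int] = {}
    for seq in sequences:
        for op in seq:
            if op not in vocab:
                vocab[op] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode(seq: List[str], vocab: Dict[str, int],
           max_len: int = MAX_LEN) -> List[int]:
    """Truncate to max_len, map names to ids, then right-pad with PAD_ID."""
    ids = [vocab.get(op, UNK_ID) for op in seq[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical operation traces; real sandbox reports contain far more events.
traces = [
    ["CreateFile", "WriteFile", "RegSetValue"],
    ["CreateFile", "Process Create"],
]
vocab = build_vocab(traces)
# A small max_len is used here only to keep the example easy to inspect.
encoded = [encode(t, vocab, max_len=4) for t in traces]
```

With the default `max_len`, every sample becomes a list of exactly 5000 ids, which is the kind of fixed-shape input that embedding, recurrent, and attention layers typically expect.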

Notes

1. Downloads from VirusShare [8] and VirusSign [18].
2. Downloads from FileHorse [7].

References

1. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., et al.: TensorFlow: large-scale machine learning on heterogeneous distributed systems (2016)
2. Alibaba: Alitianchi contest (2021). https://tianchi.aliyun.com/competition/introduction.htm?spm=5176.11409106.5678.1.4354684cI0fYC1?raceId=231668s
3. Athiwaratkun, B., Stokes, J.W.: Malware classification with LSTM and GRU language models and a character-level CNN. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2482–2486 (2017). https://doi.org/10.1109/ICASSP.2017.7952603
4. Baskin, B.: Noriben malware analysis sandbox (2015). https://github.com/Rurik/Noriben
5. Chen, J., Guo, S., Ma, X., Li, H., Guo, J., Chen, M., Pan, Z.: SLAM: a malware detection method based on sliding local attention mechanism. Secur. Commun. Networks 2020, 6724513:1–6724513:11 (2020)
6. Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using RNN encoder-decoder for statistical machine translation (2014)
7. FileHorse: FileHorse, June 2020. https://fileHorse.com
8. Forensics, C.: VirusShare, June 2020. https://virusshare.com/
9. Goldberg, Y., Levy, O.: word2vec explained: deriving Mikolov et al.'s negative-sampling word-embedding method (2014). arXiv:1402.3722
10. Kolosnjaji, B., Zarras, A., Webster, G., Eckert, C.: Deep learning for classification of malware system call sequences. In: Kang, B.H., Bai, Q. (eds.) AI 2016. LNCS (LNAI), vol. 9992, pp. 137–149. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50127-7_11
11. Maxwell, K.: Maltrieve: a tool to retrieve malware directly from the source for security researchers (2015). https://github.com/krmaxwell/maltrieve
12. O'Malley, T., Bursztein, E., Long, J., Chollet, F., Jin, H., Invernizzi, L., et al.: Keras Tuner (2019). https://github.com/keras-team/keras-tuner
13. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. JMLR 12, 2825–2830 (2011)
14. Pektas, A., Acarman, T.: Malware classification based on API calls and behaviour analysis. IET Inf. Secur. 12, 107–117 (2018)
15. Qian, Q., Tang, M.: Dynamic API call sequence visualization for malware classification. IET Inf. Secur. 13, October 2018
16. Saxe, J., Berlin, K.: eXpose: a character-level convolutional neural network with embeddings for detecting malicious URLs, file paths and registry keys (2017)
17. Vaswani, A., et al.: Attention is all you need (2017)
18. VirusSign: VirusSign, June 2020. https://samples.virussign.com/samples/
19. VirusTotal: VirusTotal, June 2020. http://www.virustotal.com

Author information

Authors and Affiliations

Department of Computer Science, University of Waikato, Hamilton, New Zealand
Rajchada Chanajitt, Bernhard Pfahringer & Heitor Murilo Gomes

Corresponding author

Correspondence to Rajchada Chanajitt.

Editor information

Editors and Affiliations

University of the West of England, Bristol, UK
Elias Pimenidis
Lancaster University, Lancaster, UK
Plamen Angelov
Digital Innovation, Teesside University, Middlesbrough, UK
Chrisina Jayne
Democritus University of Thrace, Xanthi, Greece
Antonios Papaleonidas
The University of the West of England, Bristol, UK
Mehmet Aydin

About this paper

Cite this paper

Chanajitt, R., Pfahringer, B., Gomes, H.M. (2022). A Comparison of Neural Network Architectures for Malware Classification Based on Noriben Operation Sequences. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13529. Springer, Cham. https://doi.org/10.1007/978-3-031-15919-0_36

DOI: https://doi.org/10.1007/978-3-031-15919-0_36
Published: 07 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15918-3
Online ISBN: 978-3-031-15919-0
eBook Packages: Computer Science, Computer Science (R0)
