Abstract
Behavior-based machine learning plays a vital role in malware classification, as it can potentially overcome the limitations of signature-based methods. This paper explores the use of dynamic call sequences extracted by the open-source Noriben tool, which performs dynamic analysis in a virtualized environment. Call sequences of up to 5000 operations are generated for a total of 2000 benign and malware samples. Seven malware families are recognized: ransomware, trojan, backdoor, rootkit, virus, miner, and other. An empirical comparison analyzes five different classifiers: fully connected neural networks, GRU and LSTM, Transformer, and two combination approaches. The best-performing approach overall is a concatenation of a GRU with a Transformer architecture, which achieves accuracy and F1-score values of up to 97%.
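As an illustration of how operation sequences of up to 5000 steps might be prepared as input for sequence models such as a GRU or a Transformer, the following minimal sketch maps each operation name to an integer id and truncates or pads every sequence to a fixed length. The abstract does not describe the preprocessing, so the vocabulary scheme, the reserved padding and unknown ids, the helper names `build_vocab` and `encode`, and the example operation names are all assumptions made for illustration.

```python
from typing import Dict, List

MAX_LEN = 5000  # upper bound on sequence length stated in the abstract
PAD_ID = 0      # assumption: id 0 is reserved for padding
UNK_ID = 1      # assumption: id 1 marks operations missing from the vocabulary


def build_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to each distinct operation, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for seq in sequences:
        for op in seq:
            if op not in vocab:
                vocab[op] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode(seq: List[str], vocab: Dict[str, int], max_len: int = MAX_LEN) -> List[int]:
    """Map operations to ids, truncate to max_len, then right-pad with PAD_ID."""
    ids = [vocab.get(op, UNK_ID) for op in seq[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical operation traces, for illustration only.
traces = [
    ["CreateFile", "RegSetValue", "CreateFile"],
    ["Process Create", "CreateFile"],
]
vocab = build_vocab(traces)
# A small max_len keeps the example readable; the paper's bound is 5000.
encoded = [encode(t, vocab, max_len=4) for t in traces]
```

Fixed-length integer sequences of this kind are a common input format for an embedding layer followed by recurrent or attention-based layers; whether the paper uses this exact encoding is not stated in the abstract.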
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chanajitt, R., Pfahringer, B., Gomes, H.M. (2022). A Comparison of Neural Network Architectures for Malware Classification Based on Noriben Operation Sequences. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13529. Springer, Cham. https://doi.org/10.1007/978-3-031-15919-0_36
DOI: https://doi.org/10.1007/978-3-031-15919-0_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15918-3
Online ISBN: 978-3-031-15919-0
eBook Packages: Computer Science, Computer Science (R0)