Deep Learning-Based Malware Detection Using PE Headers

Arnas Nakrošis⁸,⁹,
Ingrida Lagzdinytė-Budnikė⁸,
Agnė Paulauskaitė-Tarasevičienė⁸,
Giedrius Paulikas⁸ &
…
Paulius Dapkus⁹

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1665)

Included in the following conference series:

International Conference on Information and Software Technologies


Abstract

Due to recent advancements in technology, developers of intrusive software are finding increasingly sophisticated ways to hide malicious code in software environments. As a result, viruses concealed in this way are difficult to identify during the analysis and detection phases. For this reason, considerable research and development effort has been devoted to methodologies and techniques that can identify a wide range of malware without compromising the execution environment. To propose new methods, researchers are investigating not only the structure of malware detection algorithms but also the properties that can be extracted from files. Extracted features allow malware to be detected even when virus creation tools change.

The authors of this study propose a data structure consisting of 486 attributes that describe the most important file characteristics. The proposed structure was used to train neural networks to detect viruses. The data set was built from over 400,000 infected and benign files. Various machine learning algorithms based on unsupervised (k-means, self-organizing maps) and supervised (VGG-16, convolutional neural networks, ResNet) learning were tested to determine how useful each is for detecting malicious software.
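To make the idea of header-derived attributes concrete, the following is a minimal sketch of how a few numeric features could be read from the COFF file header of a PE image using only the Python standard library. The abstract does not list the 486 attributes, so the fields chosen here, the function names `extract_coff_features` and `build_minimal_pe_stub`, and the synthetic sample are illustrative assumptions and not the authors' feature set.

```python
import struct

IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002  # COFF Characteristics flag
IMAGE_FILE_DLL = 0x2000               # COFF Characteristics flag


def extract_coff_features(data: bytes) -> dict:
    """Return a few COFF file-header fields of a PE image as numeric features.

    Illustrative only: the paper's actual 486 attributes are not given
    in the abstract.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a DOS/PE image: missing MZ magic")
    # e_lfanew (offset 0x3C) stores the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, n_sections, timestamp, _symtab_ptr, _n_symbols,
     opt_header_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "time_date_stamp": timestamp,
        "size_of_optional_header": opt_header_size,
        "is_executable": int(bool(characteristics & IMAGE_FILE_EXECUTABLE_IMAGE)),
        "is_dll": int(bool(characteristics & IMAGE_FILE_DLL)),
    }


def build_minimal_pe_stub() -> bytes:
    """Build a synthetic header-only buffer for demonstration (hypothetical data)."""
    dos_header = bytearray(0x40)
    dos_header[0:2] = b"MZ"
    struct.pack_into("<I", dos_header, 0x3C, 0x40)  # PE signature right after
    coff_header = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 0xF0, 0x0022)
    return bytes(dos_header) + b"PE\0\0" + coff_header


features = extract_coff_features(build_minimal_pe_stub())
print(features)
```

In a real pipeline, each file would be mapped to a fixed-length numeric vector of this kind before being passed to a clustering algorithm or classifier.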

Based on the experimental results, the authors created and proposed a neural network architecture consisting of Dense and Dropout layers with L2 regularization that detects 8 types of malware with 98% accuracy. A key strength of the study is that the experiments were carried out on a large number of files. The proposed neural network architecture recognizes malware with at least the same accuracy as solutions offered by other authors and can be used in practice to protect workstations against malicious files.
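As a rough illustration of the building blocks named above, the sketch below implements a forward pass through fully connected (Dense) layers with inverted Dropout, together with the L2 penalty that would be added to the training loss. It is written in plain Python for self-containment. The input width of 486 and the 8 outputs follow the abstract; the hidden width, number of layers, dropout rate, and regularization strength are assumptions, since the abstract does not state them, and no training loop is shown.

```python
import math
import random


def relu(z):
    return [max(0.0, v) for v in z]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, biases, activation):
    """Fully connected layer: activation(W x + b), one weight row per output."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, biases)]
    return activation(z)


def dropout(x, rate, rng, training):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def l2_penalty(weight_matrices, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for m in weight_matrices for row in m for w in row)


def init_layer(n_in, n_out, rng):
    scale = math.sqrt(2.0 / n_in)  # He-style initialization (an assumption)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers, rate, rng, training=False):
    h = x
    for weights, biases in layers[:-1]:
        h = dropout(dense(h, weights, biases, relu), rate, rng, training)
    weights, biases = layers[-1]
    return dense(h, weights, biases, softmax)


rng = random.Random(0)
N_FEATURES, N_HIDDEN, N_CLASSES = 486, 64, 8  # N_HIDDEN is hypothetical
layers = [init_layer(N_FEATURES, N_HIDDEN, rng),
          init_layer(N_HIDDEN, N_HIDDEN, rng),
          init_layer(N_HIDDEN, N_CLASSES, rng)]
sample = [rng.random() for _ in range(N_FEATURES)]  # stand-in feature vector
probs = forward(sample, layers, rate=0.3, rng=rng)
penalty = l2_penalty([w for w, _ in layers], lam=1e-4)
print(len(probs), round(sum(probs), 6), penalty > 0)
```

Dropout is active only during training, while the L2 term discourages large weights; both are common ways to limit overfitting on tabular feature vectors such as header attributes.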

Author information

Authors and Affiliations

Department of Applied Informatics, Kaunas University of Technology, Studentų St. 50–407, 51368, Kaunas, Lithuania
Arnas Nakrošis, Ingrida Lagzdinytė-Budnikė, Agnė Paulauskaitė-Tarasevičienė & Giedrius Paulikas
National Cyber Security Centre Under the Ministry of National Defense, Gediminas Avenue 40, Vilnius, Lithuania
Arnas Nakrošis & Paulius Dapkus

Corresponding author

Correspondence to Arnas Nakrošis.

Editor information

Editors and Affiliations

Kaunas University of Technology, Kaunas, Lithuania
Audrius Lopata
Kaunas University of Technology, Kaunas, Lithuania
Daina Gudonienė
Kaunas University of Technology, Kaunas, Lithuania
Rita Butkienė

About this paper

Cite this paper

Nakrošis, A., Lagzdinytė-Budnikė, I., Paulauskaitė-Tarasevičienė, A., Paulikas, G., Dapkus, P. (2022). Deep Learning-Based Malware Detection Using PE Headers. In: Lopata, A., Gudonienė, D., Butkienė, R. (eds) Information and Software Technologies. ICIST 2022. Communications in Computer and Information Science, vol 1665. Springer, Cham. https://doi.org/10.1007/978-3-031-16302-9_1

DOI: https://doi.org/10.1007/978-3-031-16302-9_1
Published: 06 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16301-2
Online ISBN: 978-3-031-16302-9
eBook Packages: Computer Science, Computer Science (R0)