Abstract
As technology advances, developers of intrusive software find increasingly sophisticated ways to hide malicious code in software environments. This makes it difficult to identify viruses in infected data during the analysis and detection phases of malware handling. For this reason, considerable attention has been devoted to researching and developing methodologies and techniques that can identify diverse malware without compromising the execution environment. To propose new methods, researchers investigate not only the structure of malware detection algorithms but also the properties that can be extracted from files. Extracted features allow malware to be detected even when virus creation tools change.
The authors of this study propose a data structure consisting of 486 attributes that describe the most important file characteristics. The proposed structure was used to train neural networks to detect viruses. A set of over 400,000 infected and benign files was used to build the data set. Various machine learning algorithms based on unsupervised (k-means, self-organizing maps) and supervised (VGG-16, convolutional neural networks, ResNet) learning were tested to determine how useful they are for detecting malicious software.
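To illustrate what extracting file attributes from a PE header can look like, the following minimal sketch reads the fixed-layout COFF file header with Python's standard `struct` module. The abstract does not list the 486 attributes, so the seven fields below are only an assumed example of header-derived features, and the function names `coff_header_features` and `make_minimal_pe` are hypothetical helpers introduced here, not the authors' code.

```python
import struct

# Field names of the 20-byte COFF file header, in on-disk order.
COFF_FIELDS = (
    "machine",
    "number_of_sections",
    "time_date_stamp",
    "pointer_to_symbol_table",
    "number_of_symbols",
    "size_of_optional_header",
    "characteristics",
)


def coff_header_features(data: bytes) -> dict:
    """Return the COFF header fields of a PE file as a name -> int mapping."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    values = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return dict(zip(COFF_FIELDS, values))


def make_minimal_pe(machine=0x8664, sections=3, characteristics=0x0022) -> bytes:
    """Build a synthetic header-only buffer for demonstration (not a runnable EXE)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE header starts right after the DOS header
    coff = struct.pack("<HHIIIHH", machine, sections, 0, 0, 0, 0, characteristics)
    return bytes(dos) + b"PE\x00\x00" + coff


features = coff_header_features(make_minimal_pe())
```

In a real pipeline, such numeric fields would be combined with many other header-derived values into one fixed-length vector per file; how the authors assemble their 486 attributes is not specified in the abstract.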
Based on the experiments, the authors propose a neural network architecture consisting of Dense and Dropout layers with L2 regularization that detects 8 types of malware with 98% accuracy. A key strength of the study is that the research was carried out on a large number of files. The proposed architecture recognizes malware with at least the same accuracy as solutions offered by other authors and can be used in practice to protect workstations against malicious files.
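As a rough illustration of how Dense layers, Dropout, and an L2 weight penalty fit together, the framework-free sketch below runs one forward pass from a 486-value feature vector to 8 class probabilities. Only the input size (486) and the number of malware types (8) come from the abstract; the single hidden layer, its width, the dropout rate, the L2 coefficient, and the assumption that the output has exactly 8 classes are illustrative choices, not the authors' architecture.

```python
import math
import random

N_FEATURES = 486   # attributes per file (from the abstract)
N_CLASSES = 8      # malware types (from the abstract); output size is an assumption
HIDDEN = 32        # assumption: hidden width is not given in the abstract
DROP_RATE = 0.5    # assumption
L2_LAMBDA = 1e-4   # assumption


def init_layer(n_in, n_out, rng):
    """He-style random weights and zero biases for one dense layer."""
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, biases):
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def dropout(x, rate, rng, training):
    """Inverted dropout: zero units at random and rescale the survivors."""
    if not training:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def l2_penalty(layers, lam):
    """lam times the sum of squared weights over all layers (biases excluded)."""
    return lam * sum(w * w for weights, _ in layers for row in weights for w in row)


def forward(x, layers, rng, training=False):
    (w1, b1), (w2, b2) = layers
    hidden = dropout(relu(dense(x, w1, b1)), DROP_RATE, rng, training)
    return softmax(dense(hidden, w2, b2))


def loss(x, label, layers, rng, training=True):
    """Cross-entropy for one sample plus the L2 weight penalty."""
    probs = forward(x, layers, rng, training)
    return -math.log(probs[label] + 1e-12) + l2_penalty(layers, L2_LAMBDA)


rng = random.Random(0)
layers = [init_layer(N_FEATURES, HIDDEN, rng), init_layer(HIDDEN, N_CLASSES, rng)]
sample = [rng.random() for _ in range(N_FEATURES)]  # stand-in feature vector
probs = forward(sample, layers, rng)
```

Dropout is active only during training, while the L2 term is added to the loss so that large weights are penalized. Both are standard ways to limit overfitting on tabular feature vectors of this kind.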
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nakrošis, A., Lagzdinytė-Budnikė, I., Paulauskaitė-Tarasevičienė, A., Paulikas, G., Dapkus, P. (2022). Deep Learning-Based Malware Detection Using PE Headers. In: Lopata, A., Gudonienė, D., Butkienė, R. (eds) Information and Software Technologies. ICIST 2022. Communications in Computer and Information Science, vol 1665. Springer, Cham. https://doi.org/10.1007/978-3-031-16302-9_1
DOI: https://doi.org/10.1007/978-3-031-16302-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16301-2
Online ISBN: 978-3-031-16302-9
eBook Packages: Computer Science, Computer Science (R0)