ReinforSec: An Automatic Generator of Synthetic Malware Samples and Denial-of-Service Attacks through Reinforcement Learning
Figure 1. Taking into account the elements of the NIST layered security, CSET provides a model where different algorithms are applied and adapted to ML tasks, generating new opportunities for improvement in terms of prevention, detection, response and recovery, and active defense. Here, GAN stands for Generative Adversarial Network and NLP stands for Natural Language Processing.
Figure 2. Proposed framework for the generation of synthetic binary samples in PE format using RL.
Figure 3. Instrumentation components of the synthetic sample in functional form.
Figure 4. Proposed framework for the generation of synthetic samples of DoS cyber-attacks employing RL.
Abstract
1. Introduction
2. Aim and Motivation—The Problem with Cyber Threat Datasets
3. The Importance of Data in Cybersecurity Tasks
- Domain problem: obtaining samples with new behaviors is a race between early detection and the radius of impact of an unattended incident, such as zero-day incidents. While not all samples in the cybersecurity domain can be aggregated, patterns can be generated based on general behavioral policies.
- Inconsistency problems: these are associated with the domain problem. In an attack or compromise scenario, a data source may hold many or few samples related to the incident, which can lead to inherent problems in the data acquisition process, such as noisy, incomplete, insignificant, high-dimensional and unbalanced information.
- Availability: often, due to privacy concerns, datasets are not available for replication or testing; they may also be costly or may not contain the samples for a desired context. It is also worth noting that public disclosure of detection strategies can be a double-edged sword: on the one hand it refines defensive proposals, but on the other hand it gives the malicious actor information to improve their evasion techniques.
- Attack-related: refers to samples related to malicious intrusions such as scams, malware and web-based attacks.
- Defender artifacts: samples that arise from defense system logs, such as alerts, anomalous patterns and configurations.
- Management and organizational: behavioral data around security policies involving users, malicious actors and threats that impact the organization.
- Network and Internet macro-level data: malicious trace samples over Local Area Network (LAN), Wireless Local Area Network (WLAN) and Internet networks, presented as information about network traffic at different layers of the Open Systems Interconnection (OSI) model.
- Attack-related: data collection, particularly for malware analysis purposes, has become a strenuous process. In [22], it is concluded that the escalation of malicious incursions goes hand in hand with the increase of resources to build controlled analysis of malicious samples. On the one hand, SA-oriented procedures are limited in generating single-use hashes, without considering the dynamics of behavioral change. On the other hand, DA-oriented techniques only work in phases where the binary is being executed and monitored, but do not provide incremental results. Consequently, those based on heuristics are complex to set up when the threat is difficult to detect and, therefore, more in-depth generation is required. Ultimately, tasks based on anomaly detection procedures can be easily fooled when new obfuscation and cryptographic schemes are adopted in the fabrication of the malicious binary.
- Network and Internet macro-level data: there is concern about the latent increase in DoS cyber-attacks, which have eventually become a weapon for hire or sale, available to any user. With this, the variety, intensity and volume of traffic generated during an attack on the network or Internet has considerably changed the landscape of sample acquisition. This is a major challenge when replicating an attack in controlled environments, specifically for an ideal scenario of reflected DoS cyber-attacks [23].
4. Proposed Methodology
4.1. Reinforcement Learning for Creating Synthetic Sample Gyms
4.2. Creation of Synthetic Malware Samples in PE Formats by RL
4.2.1. Data Collection
4.2.2. Learning Space: Actions and States
- Add an obfuscated function to the import table
- Manipulate the common name of the sections in each offset
- Build and increase the spacing of the format sections
- Add bytes to the remaining free space at the end of the sections
- Create a new entry point that immediately goes to the original entry point of each offset
- Remove information about compilation and debugging signatures
- Package the binary and add bytes at the end of the last section of the PE file
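Algorithm 1. Learning process, off-policy
Require: step size $\alpha \in (0, 1]$, small $\varepsilon > 0$
Initialize $Q(s, a)$ arbitrarily for all $s \in \mathcal{S}$, $a \in \mathcal{A}(s)$, and $Q(\text{terminal}, \cdot) = 0$
for each episode do
  Initialize agent $a$ with states $s$ at time $t$
  for each step of the episode do
    Choose $A$ from $S$ using a policy derived from $Q$ (e.g., $\varepsilon$-greedy)
    Take action $A$, observe $R$, $S'$
    $Q(S, A) \leftarrow Q(S, A) + \alpha \left[ R + \gamma \max_{a} Q(S', a) - Q(S, A) \right]$
    $S \leftarrow S'$
  end for
  until $S$ is terminal, hence the PE is fully mutated
end for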
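Algorithm 1 has the shape of tabular off-policy Q-learning. The following is a minimal sketch of that loop on a toy environment, not the authors' implementation. The action labels, the `toy_step` reward and the hyperparameter values are assumptions introduced only for illustration. The states simply count how many abstract mutation steps have been taken, and no binary is touched.

```python
import random

# Hypothetical labels for the seven mutation actions listed in Section 4.2.2.
ACTIONS = [
    "add_import", "rename_section", "pad_sections", "append_section_bytes",
    "new_entry_point", "strip_debug", "pack_and_append",
]
N_STATES = 5                 # toy states: mutation steps taken so far (0..4)
TERMINAL = N_STATES - 1      # reaching it means "fully mutated" in this toy


def toy_step(state, action_idx):
    """Stand-in environment (an assumption): every action advances the state;
    exactly one action per state earns a reward of 1, all others earn 0."""
    reward = 1.0 if action_idx == state % len(ACTIONS) else 0.0
    return state + 1, reward


def q_learning(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Q-table initialised to zero, so Q(terminal, .) = 0 holds throughout.
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            # Behaviour policy: epsilon-greedy with respect to the current Q.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
            s_next, r = toy_step(s, a)
            # Off-policy target: greedy value of the next state.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q_table = q_learning()
greedy_policy = [
    max(range(len(ACTIONS)), key=lambda i: q_table[s][i])
    for s in range(TERMINAL)
]
print([ACTIONS[i] for i in greedy_policy])
```

The update uses the maximum over next-state values regardless of which action the behaviour policy takes next, which is what makes the procedure off-policy.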
4.2.3. Synthetic Header Checking and LIEF Feature Extraction
4.3. Synthetic Sample Generation of DoS Cyber-Attacks Using RL
4.3.1. Data Collection
- CICDDoS2019 [58]: contains benign network traffic and distributed DoS cyber-attacks via Simple Network Management Protocol (SNMP) reflected attacks, NetBIOS reflected and timestamp attacks, Lightweight Directory Access Protocol (LDAP) amplification attacks, Trivial File Transfer Protocol (TFTP) amplification attacks, Network Time Protocol (NTP) amplification attacks, Synchronize (SYN) flooding attacks, WebDDoS Hypertext Transfer Protocol (HTTP) protocol-specific attacks, Microsoft SQL Server (MSSQL) protocol-specific attacks, User Datagram Protocol (UDP) lag flooding attacks, Domain Name System (DNS) flooding attacks, and Simple Service Discovery Protocol (SSDP) reflection attacks.
- Customized set: for application-layer DoS cyber-attacks, the hulk and slowloris protocol-specific flooding tools [59] were deployed. To build the scenario, 12 virtual machines were configured with Windows operating system versions 7 and 10 (half of the pool each), hosted on 4 PCs with 16 GB of RAM, an Intel Core i7 processor and the Ubuntu operating system. The series of attacks was directed at a gateway with permissive rules (any-any), and a spare port was configured on the generic switch to capture traffic on a third computer running Ubuntu, with tcpdump as the logging tool.
4.3.2. Learning Space: Actions and States
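Algorithm 2. Learning process, off-policy, for mutating DoS cyber-attack frames
Require: step size $\alpha \in (0, 1]$, small $\varepsilon > 0$
Initialize $Q(s, a)$ arbitrarily for all $s \in \mathcal{S}$, $a \in \mathcal{A}(s)$, and $Q(\text{terminal}, \cdot) = 0$
for each network frame do
  Sort the start of the protocol conversation
  Build an action space $S$ for each frame
  for each episode do
    Initialize agent $a$ with states $s$ at time $t$
    for each step of the episode do
      Choose $A$ from $S$ using a policy derived from $Q$ (e.g., $\varepsilon$-greedy)
      Take action $A$, observe $R$, $S'$
      $Q(S, A) \leftarrow Q(S, A) + \alpha \left[ R + \gamma \max_{a} Q(S', a) - Q(S, A) \right]$
      $S \leftarrow S'$
    end for
    until $S$ is terminal, hence the DoS cyber-attack frame is fully mutated
  end for
end for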
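Before the learning loop, Algorithm 2 orders the frames by the start of their protocol conversation. One plausible reading of that step is sketched below. The `Frame` record, the direction-independent key and the sample addresses (taken from the documentation range 192.0.2.0/24) are assumptions for illustration and do not come from the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class Frame(NamedTuple):
    """Hypothetical, already-parsed frame record."""
    ts: float      # capture timestamp in seconds
    src: str
    dst: str
    sport: int
    dport: int
    proto: str


def conversation_key(f):
    """Direction-independent key so both halves of a flow share one stream."""
    a, b = (f.src, f.sport), (f.dst, f.dport)
    return (f.proto,) + (a + b if a <= b else b + a)


def sort_conversations(frames):
    """Group frames into conversations and order them by their first frame."""
    streams = defaultdict(list)
    for f in sorted(frames, key=lambda fr: fr.ts):
        streams[conversation_key(f)].append(f)
    return sorted(streams.values(), key=lambda s: s[0].ts)


frames = [
    Frame(2.0, "192.0.2.1", "192.0.2.9", 5000, 80, "tcp"),
    Frame(1.0, "192.0.2.2", "192.0.2.9", 6000, 53, "udp"),
    Frame(2.5, "192.0.2.9", "192.0.2.1", 80, 5000, "tcp"),
]
conversations = sort_conversations(frames)
```

Each resulting conversation could then be handed to the per-frame learning loop. How the action space is built from a frame is left open here because the source does not specify it.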
5. Results and Discussions
- First scope: the samples are subjected to traditional detection tools before being transformed: for synthetic malware samples, VirusTotal [65] is used as a sensor for different antivirus programs; for DoS cyber-attacks samples, the generic detection CloudShark [66] firewall based on behavioral signatures is used.
- Second scope: the already characterized samples are compared with different synthetic generation and balancing techniques and finally evaluated in terms of performance metrics by different state-of-the-art SL algorithms.
- VirusTotal: the sensor works by aggregating different antivirus engines, which together determine the verdict for the submitted sample. The accuracy value is obtained as follows: 40 to 60 independent engines diagnose the object by means of a simple triage; if more than 60% of them assign it as malicious, the object is considered as such, otherwise it is considered clean. Although it is one of the main early malware evaluation mechanisms, it is difficult to summarize the specific category of the synthetic samples. Even so, it could be observed that, in the labels of each engine, 40% presented a signature denominated as Generic, 31% as Malicious, 13% as Trojan, 8% as Riskware, 3% as Adware, and the rest fell into different taxonomies.
- CloudShark: the tool allows loading PCAP files and analyzing the degree of threat contained in the sample. It is worth mentioning that each sample can be composed of different frames, called streams, i.e., the sample contains the entire conversation. To determine whether the conversation presents any anomalous pattern, CloudShark compares it against Snort IDS signatures and returns a categorical value of the threat as low, medium or high.
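The VirusTotal triage rule described above can be sketched as a simple vote over engine verdicts. This assumes the 60% figure is a strict threshold on the share of engines that flag the sample; the function name and the input format are hypothetical and are not part of any VirusTotal API.

```python
def triage(verdicts, threshold=0.60):
    """Return (label, ratio) for a list of per-engine booleans.

    verdicts: True where an engine flags the sample as malicious.
    threshold: share of engines that must be exceeded (assumed strict).
    """
    if not verdicts:
        raise ValueError("at least one engine verdict is required")
    ratio = sum(verdicts) / len(verdicts)
    label = "malicious" if ratio > threshold else "clean"
    return label, ratio


# 31 of 50 engines flag the sample: 62% exceeds the 60% threshold.
label_a, ratio_a = triage([True] * 31 + [False] * 19)
# 30 of 50 engines flag the sample: exactly 60% does not exceed it.
label_b, ratio_b = triage([True] * 30 + [False] * 20)
```

Treating the boundary case as clean is a design choice made here; the source does not say how an exact 60% split is resolved.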
- Shallow algorithms: those that the literature refers to as classical, where learning takes place by means of predefined features and labels, in a continuous forward model. The following algorithms derived from Table 7 were reported: Multi-Layer Perceptron (MLP) [67,74], Decision Trees (DT) [67,74], Logistic Regression (LR) [67,71], Support Vector Machines (SVM) [67,69,71], Random Forest (RF) [67,71], and Gradient Boosting (GB) [73].
- Deep learning algorithms: those based on neural networks, which operate over successive layers, each passing a simplified representation of the information to the next. They are known to work with different information input and output structures. In this sense, the following algorithms were reported: Deep Residual Network (CNN + DRN), also known as ResNet-18 [68], MDGAN (the discriminator itself worked as a classifier) [70] and Feed Forward Neural Network (FFNN) [72].
- For the synthetic malware generation scope:
  - GAN-MLP [67] achieved a precision of 99.46%, exceeding its counterpart in this project (MLP-RL) at 99.12%. The difference is small; however, GAN-MLP can become an unsatisfactory procedure if the number of malware samples increases. This has already been reported in [76], where the complexity is proportional to the number of inputs, making the model unstable and slow, which could produce low-quality samples. Moreover, if the network produced by the GAN is fed as input to an MLP, the number of parameters to be estimated can grow exponentially, yielding a redundant or inefficient model.
  - In the case of Fuzzy-SMOTE + SVM [69], the precision value of 99.02% exceeded this work's RL + SVM, which only achieved 98.70%. It is worth mentioning that the Fuzzy-SMOTE computation can work as long as the goal is to report a preliminary range in the sample balancing. Even so, this algorithm only creates replicates with little mutation, which could, in the worst case, cause overfitting.
  - On the other hand, the shallow algorithms trained with this scope, namely RL + DT, RL + RF and RL + LR, obtained better results in terms of precision, with 99.71%, 99.81% and 99.45%, respectively, than their GAN-oriented [67] counterparts, which scored as follows: GAN + DT with 90.43%, GAN + RF with 98.87% and GAN + LR with 96.71%. Pure or ensemble DT algorithms are known for their ability to understand and interpret problems in a timely manner with little data preparation. This can be enhanced if random splitting methods such as RF are applied. However, the risk of combining trees and GANs is the instability of the model when the number of samples increases considerably, whereas in RL the malware sample mutation process rests on the agent’s policies rather than on the competition between generators and discriminators.
  - Moreover, the shallow algorithms employing MDM [71] and One-hot-encoding obtained lower precision values: MDM + One-Hot-Encoding + SVM with 97.30%, MDM + One-Hot-Encoding + LR with 98.10% and MDM + One-Hot-Encoding + RF with 98.20%, compared with this project's RL + SVM at 98.70%, RL + LR at 99.45% and RL + RF at 99.81%. The reason is that MDM is the theoretical basis of RL, but it is oriented to infer the total probability between states, without considering the instability risks of adding partial observations and off-policy state controls, which the present work handled with Q-learning. In addition, MDM has been shown to have problems in the optimal search for policies in transition states when the number of samples increases; coupled with One-hot-encoding-type characterizations, this would result in identical synthetic malware samples, with little context and poor semantics.
  - When MDM is added to an AWGCN [71], it is possible to obtain a scope that other neural networks cannot reach, especially when working with non-structured data, particularly because of the attention layer provided. However, with large-scale data there is the possibility of suffering from noise, scalability disturbances and adverse discrepancy between the rules of the AWGCN trees, which would require sufficient mini-batch tasks to approach a high-volume environment. In terms of precision, MDM + AWGCN + SVM obtained 99.20%, MDM + AWGCN + LR 99.30% and MDM + AWGCN + RF 98.70%, against this project's RL + SVM at 98.70%, RL + LR at 99.45% and RL + RF at 99.81%; the LR and RF variants therefore fall below this project's results, while the SVM variant does not.
  - By comparing hybrid GAN algorithms such as DCGAN + ResNet-18 [68] and MDGAN [70], it can be observed that different types of malware features, such as pixel representations and sequences of APIs, can be combined and the resulting matrix can then be sampled at low cost, reducing the training time and synthetic production. However, in addition to the already-mentioned disadvantages of GANs, a high volume of inputs can affect the calculation of the probability distribution when combining two or more DL mechanisms. In summary, DCGAN + ResNet-18 and MDGAN, with 90.00% and 95.90%, respectively, obtained lower precision values than those achieved in this project with RL.
- For the scope of DoS cyber-attacks:
  - The scope presented in [72], Statistical learning + FFNN, has interesting potential, since mutations are fully fitted to descriptive statistics of different samples. However, there are not enough indicators of sample quality, volume and functionality. Moreover, an FFNN is known to present many parameter fitting problems, especially since its optimization function is based on gradient optimization. Even so, when testing this project's RL + FFNN, better results were obtained in terms of performance, with 96.41% compared to 88.00% for its counterpart.
  - Although in [73] the GAN is stabilized with the GP penalty algorithm, to minimize boosting and gradient vanishing effects, the WGAN is not immune to high-volume impact during training phases; in fact, the penalty avoids the cost of hyper-parameter computation. Indeed, WGAN presents oscillation and learning convergence problems for new samples, especially those of large content such as DoS cyber-attacks. The SL part of this scope employed the GB algorithm; despite that, no relevant data were available to compare the precision metric of the algorithm. However, it was possible to calculate the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC), which measures the performance of the model at different thresholds. The AUC is measured in the interval $[0, 1]$, where 1 is a perfect model. This indicates that the RL + GB model of this project also obtained a better result than GP-WGANs + GB, with an AUC of .
  - The main limitations of MDM have already been mentioned. In [74], its use with PRISM was explored in order to stabilize states that could result in unpredictable behavior. A problem that could occur in this case is that, being a simulation algorithm, the costs generated would be exhaustive and it could not be guaranteed that the samples would be of sufficient quality. RL avoids this scenario by reducing the unfavorable states, so that the agent learns what is necessary and narrows the search for optimization criteria. Consequently, both MDM + PRISM + MLP and MDM + PRISM + DT obtained lower precision records, with 79.70% and 98.80%, respectively, compared to RL + MLP with a score of 99.81% and RL + DT with 99.94%.
  - Finally, in [60], although the sample development is more technically oriented, there are no data to compare with ML algorithms to evaluate the quality and shape of the synthetic samples.
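The comparisons above rely on precision and, for the GB case, on the AUC. As a reminder of what the two numbers measure, the sketch below computes both from scratch on a tiny made-up example. The labels and scores are illustrative values only and are not results from the paper.

```python
def precision(y_true, y_pred):
    """Precision = TP / (TP + FP); returns 0.0 when nothing is predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels (1 = attack, 0 = benign) and classifier scores.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

example_precision = precision(y_true, y_pred)
example_auc = roc_auc(y_true, scores)
```

Precision depends on the chosen decision threshold (0.5 here), whereas the AUC summarizes ranking quality across all thresholds, which is why the two can disagree about which model is better.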
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Enisa Threat Landscape 2021. 2021. Available online: https://www.enisa.europa.eu/publications/enisa-threat-landscape-2021 (accessed on 19 November 2022).
- Kolias, C.; Kambourakis, G.; Stavrou, A.; Voas, J. DDoS in the IoT: Mirai and other botnets. Computer 2017, 50, 80–84.
- Moore, T. The economics of cybersecurity: Principles and policy options. Int. J. Crit. Infrastruct. 2010, 3, 103–117.
- Leszczyna, R. Review of cybersecurity assessment methods: Applicability perspective. Comput. Secur. 2021, 108, 102376.
- Ford, V.; Siraj, A. Applications of machine learning in cyber security. In Proceedings of the 27th International Conference on Computer Applications in Industry and Engineering, New Orleans, LO, USA, 1–20 August 2015; IEEE Xplore: Kota Kinabalu, Malaysia, 2014; Volume 118.
- Ucci, D.; Aniello, L.; Baldoni, R. Survey of machine learning techniques for malware analysis. Comput. Secur. 2019, 81, 123–147.
- McAfee Labs and Advanced Threat Research. McAfee Labs Threats Report. 2019. Available online: https://www.trellix.com/fr-ca/advanced-research-center/threat-reports.html (accessed on 21 October 2022).
- Yu, B.; Fang, Y.; Yang, Q.; Tang, Y.; Liu, L. A survey of malware behavior description and analysis. Front. Inf. Technol. Electron. 2018, 19, 583–603.
- Khalaf, B.A.; Mostafa, S.A.; Mustapha, A.; Mohammed, M.A.; Abduallah, W.M. Comprehensive review of artificial intelligence and statistical approaches in distributed denial of service attack and defense methods. IEEE Access 2019, 7, 51691–51713.
- Valdovinos, I.A.; Perez-Diaz, J.A.; Choo, K.K.R.; Botero, J.F. Emerging DDoS attack detection and mitigation strategies in software-defined networks: Taxonomy, challenges and future directions. J. Netw. Comput. Appl. 2021, 187, 103093.
- Nikoloudakis, Y.; Kefaloukos, I.; Klados, S.; Panagiotakis, S.; Pallis, E.; Skianis, C.; Markakis, E.K. Towards a Machine Learning Based Situational Awareness Framework for Cybersecurity: An SDN Implementation. Sensors 2021, 21, 4939.
- Handa, A.; Sharma, A.; Shukla, S.K. Machine learning in cybersecurity: A review. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2019, 9, e1306.
- Shaukat, K.; Luo, S.; Varadharajan, V.; Hameed, I.A.; Xu, M. A survey on machine learning techniques for cyber security in the last decade. IEEE Access 2020, 8, 222310–222354.
- Roh, Y.; Heo, G.; Whang, S.E. A survey on data collection for machine learning: A big data-ai integration perspective. IEEE Trans. Knowl. Data Eng. 2019, 33, 1328–1347.
- Paullada, A.; Raji, I.D.; Bender, E.M.; Denton, E.; Hanna, A. Data and its (dis) contents: A survey of dataset development and use in machine learning research. Patterns 2021, 2, 100336.
- Sarker, I.H.; Kayes, A.; Badsha, S.; Alqahtani, H.; Watters, P.; Ng, A. Cybersecurity data science: An overview from machine learning perspective. J. Big Data 2020, 7, 1–29.
- Humayun, M.; Jhanjhi, N.; Talib, M.; Shah, M.H.; Suseendran, G. Cybersecurity for Data Science: Issues, Opportunities, and Challenges. Lect. Notes Netw. Syst. 2021, 248, 435–444.
- Alshaibi, A.; Al-Ani, M.; Al-Azzawi, A.; Konev, A.; Shelupanov, A. The Comparison of Cybersecurity Datasets. Data 2022, 7, 22.
- Dasgupta, D.; Akhtar, Z.; Sen, S. Machine learning in cybersecurity: A comprehensive survey. J. Def. Model. Simul. 2022, 19, 57–106.
- Sarker, I.H. A machine learning based robust prediction model for real-life mobile phone data. Internet Things. 2019, 5, 180–193.
- Zheng, M.; Robbins, H.; Chai, Z.; Thapa, P.; Moore, T. Cybersecurity research datasets: Taxonomy and empirical analysis. In Proceedings of the 11th USENIX Workshop on Cyber Security Experimentation and Test (CSET 18), Baltimore, MD, USA, 13 August 2018.
- Naseer, M.; Rusdi, J.F.; Shanono, N.M.; Salam, S.; Muslim, Z.B.; Abu, N.A.; Abadi, I. Malware Detection: Issues and Challenges. In Proceedings of the 2019 International Conference of Science and Information Technology in Smart Administration (ICSINTeSA), Balikpapan, Indonesia, 16–17 October 2019; IOP Publishing: Bristol, UK, 2021; Volume 1807, p. 012011.
- Alzahrani, R.J.; Alzahrani, A. Security Analysis of DDoS Attacks Using Machine Learning Algorithms in Networks Traffic. Electronics 2021, 10, 2919.
- Sikorsi, A.M. Practical Malware Analysis: A Hands-On Guide to Dissecting Malicious Software; 1st Edition, Kindle Edition; No Starch Press: San Francisco, CA, USA, 2012.
- Nikolenko, S.I. Synthetic Data for Deep Learning; Springer: Berlin/Heidelberg, Germany, 2021; Volume 174.
- Ye, J.; Xue, Y.; Long, L.R.; Antani, S.; Xue, Z.; Cheng, K.C.; Huang, X. Synthetic sample selection via reinforcement learning. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Lima, Peru, 4–8 October 2020; pp. 53–63.
- Polizzotto, M.N.; Finfer, S.; Garcia, F.; Sönnerborg, A.; Zazzi, M.; Böhm, M.; Jorm, L.; Barbieri, S.; Kaiser, R.; I-Hsien Kuo, N. The Health Gym: Synthetic Health-Related Datasets for the Development of Reinforcement Learning Algorithms. arXiv 2022, arXiv:2203.06369.
- Arulkumaran, K.; Deisenroth, M.P.; Brundage, M.; Bharath, A.A. Deep reinforcement learning: A brief survey. IEEE Signal Process. Mag. 2017, 34, 26–38.
- Brockman, G.; Cheung, V.; Pettersson, L.; Schneider, J.; Schulman, J.; Tang, J.; Zaremba, W. Openai gym. arXiv 2016, arXiv:1606.01540.
- Xiang, X.; Foo, S. Recent Advances in Deep Reinforcement Learning Applications for Solving Partially Observable Markov Decision Processes (POMDP) Problems: Part 1—Fundamentals and Applications in Games, Robotics and Natural Language Processing. Mach. Learn. Knowl. Extr. 2021, 3, 554–581.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018.
- Singh, J.; Singh, J. A survey on machine learning-based malware detection in executable files. J. Syst. Archit. 2021, 112, 101861.
- Aboaoja, F.A.; Zainal, A.; Ghaleb, F.A.; Al-rimy, B.A.S.; Eisa, T.A.E.; Elnour, A.A.H. Malware Detection Issues, Challenges, and Future Directions: A Survey. Appl. Sci. 2022, 12, 8482.
- Karl-Bridge-Microsoft. PE Format-Win32 Apps. Available online: https://github.com/Karl-Bridge-Microsoft (accessed on 21 October 2022).
- Zatloukal, F.; Znoj, J. Malware detection based on multiple PE headers identification and optimization for specific types of files. JAEC 2017, 1, 153–161.
- Anderson, H.S.; Kharkar, A.; Filar, B.; Evans, D.; Roth, P. Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning. arXiv 2018, arXiv:1801.08917.
- Salem, A.; Banescu, S.; Pretschner, A. Maat: Automatically analyzing virustotal for accurate labeling and effective malware detection. ACM Trans. Priv. Secur. 2021, 24, 1–35.
- VirusTotal. Virustotal. 2021. Available online: https://www.virustotal.com/gui/home/upload (accessed on 17 September 2022).
- Zhao, Y.; Li, L.; Wang, H.; Cai, H.; Bissyandé, T.F.; Klein, J.; Grundy, J. On the impact of sample duplication in machine-learning-based android malware detection. ACM Trans. Softw. Eng. Methodol. 2021, 30, 1–38.
- Joyce, R.J.; Amlani, D.; Nicholas, C.; Raff, E. MOTIF: A Malware Reference Dataset with Ground Truth Family Labels. Comput. Secur. 2022, 124, 102921.
- Oyama, Y.; Miyashita, T.; Kokubo, H. Identifying useful features for malware detection in the ember dataset. In Proceedings of the 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW), Nagasaki, Japan, 26–29 November 2019; pp. 360–366.
- Amich, A.; Eshete, B. Explanation-guided diagnosis of machine learning evasion attacks. In Proceedings of the International Conference on Security and Privacy in Communication Systems, Washington, WA, USA, 21–23 October 2021; pp. 207–228.
- Castro, R.L.; Schmitt, C.; Rodosek, G.D. Armed: How automatic malware modifications can evade static detection? In Proceedings of the 2019 5th International Conference on Information Management (ICIM), Cambridge, UK, 24–27 March 2019; pp. 20–27.
- Romain, T. LIEF Library to Instrument Executable Formats. Available online: https://lief-project.github.io/ (accessed on 27 September 2022).
- Anderson, H.S.; Roth, P. Ember: An open dataset for training static pe malware machine learning models. arXiv 2018, arXiv:1804.04637.
- Hawkins, D.M. The problem of overfitting. J. Chem. Inf. Model 2004, 44, 1–12.
- Weinberger, K.; Dasgupta, A.; Langford, J.; Smola, A.; Attenberg, J. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 1113–1120.
- Vishnu, N.; Batth, R.S.; Singh, G. Denial of service: Types, techniques, defence mechanisms and safe guards. In Proceedings of the 2019 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE), Dubai, UAE, 11–12 December 2019; pp. 695–700.
- Pokrinchak, M.; Chowdhury, M.M. Distributed Denial of Service: Problems and Solutions. In Proceedings of the 2021 IEEE International Conference on Electro Information Technology (EIT), Mt. Pleasant, MI, USA, 14–15 May 2021; pp. 032–037.
- Bhardwaj, A.; Mangat, V.; Vig, R.; Halder, S.; Conti, M. Distributed denial of service attacks in cloud: State-of-the-art of scientific and commercial solutions. Comput. Sci. Rev. 2021, 39, 100332.
- Shinde, P.; Parvat, T.J. DDoS attack analyzer: Using JPCAP and WinCap. Procedia Comput. Sci. 2016, 79, 781–784.
- Goyal, P.; Goyal, A. Comparative study of two most popular packet sniffing tools-Tcpdump and Wireshark. In Proceedings of the 2017 9th International Conference on Computational Intelligence and Communication Networks (CICN), Cyprus, Turkey, 16–17 September 2017; pp. 77–81.
- Kshirsagar, D.; Kumar, S. A feature reduction based reflected and exploited DDoS attacks detection system. JAIHC 2022, 13, 393–405.
- Arshi, M.; Nasreen, M.; Madhavi, K. A survey of DDoS attacks using machine learning techniques. In Proceedings of the E3S Web of Conferences; EDP Sciences: Les Ulis, France, 2020; Volume 184, p. 01052.
- Zargar, S.T.; Joshi, J.; Tipper, D. A survey of defense mechanisms against distributed denial of service (DDoS) flooding attacks. IEEE Commun. Surv. Tutor. 2013, 15, 2046–2069.
- Gohil, M.; Kumar, S. Evaluation of classification algorithms for distributed denial of service attack detection. In Proceedings of the 2020 IEEE Third International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), Laguna Hills, CA, USA, 9–13 December 2020; pp. 138–141.
- Kaspersky. DDoS Protection White Paper; Kaspersky: Moscow, Russia, 2015.
- Sharafaldin, I.; Lashkari, A.H.; Hakak, S.; Ghorbani, A.A. Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy. In Proceedings of the 2019 International Carnahan Conference on Security Technology (ICCST), Chennai, India, 1–3 October 2019; pp. 1–8.
- Radoyska, P.; Atanasova, M. Free tools for Testing the Security of Web Services in the UTP Network. In Proceedings of the Fifth International Scientific Conference “Telecommunications, Informatics, Energy and Management”, Sofia, Bulgaria, 3–4 October 2020.
- Cordero, C.G.; Vasilomanolakis, E.; Wainakh, A.; Mühlhäuser, M.; Nadjm-Tehrani, S. On generating network traffic datasets with synthetic attacks for intrusion detection. ACM Trans. Priv. Secur. 2021, 24, 1–39.
- Alkasassbeh, M.; Al-Naymat, G.; Hassanat, A.B.; Almseidin, M. Detecting distributed denial of service attacks using data mining techniques. Int. J. Adv. Comput. Sci. Appl. 2016, 7.
- Alothman, B. Raw network traffic data preprocessing and preparation for automatic analysis. In Proceedings of the 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security), Oxford, UK, 3–4 June 2019; pp. 1–5.
- Han, L.q.; Zhang, Y. Pca-based ddos attack detection of sdn environments. In Proceedings of the International conference on Big Data Analytics for Cyber-Physical-Systems, Shanghai, China, 28–29 December 2020; pp. 1413–1419.
- Bro, R.; Smilde, A.K. Principal component analysis. Anal. Methods 2014, 6, 2812–2831.
- Masri, R.; Aldwairi, M. Automated malicious advertisement detection using virustotal, urlvoid, and trendmicro. In Proceedings of the 2017 8th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 4–6 April 2017; pp. 336–341.
- Sanders, C. Practical Packet Analysis, 3E: Using Wireshark to Solve Real-World Network Problems; No Starch Press: San Francisco, CA, USA, 2017.
- Zhu, E.; Zhang, J.; Yan, J.; Chen, K.; Gao, C. N-gram MalGAN: Evading machine learning detection via feature n-gram. Digit. Commun. Netw. 2022, 8, 485–491.
- Lu, Y.; Li, J. Generative adversarial network for improving deep learning based malware classification. In Proceedings of the 2019 Winter Simulation Conference (WSC), National Harbor, MD, USA, 8–11 December 2019; pp. 584–593.
- Xu, Y.; Wu, C.; Zheng, K.; Niu, X.; Yang, Y. Fuzzy–synthetic minority oversampling technique: Oversampling based on fuzzy set theory for Android malware detection in imbalanced datasets. Int. J. Distrib. Sens. Netw. 2017, 13, 1550147717703116.
- Mazaed Alotaibi, F. A Multifaceted Deep Generative Adversarial Networks Model for Mobile Malware Detection. Appl. Sci. 2022, 12, 9403.
- Hsiao, S.W.; Chu, P.Y. Sequence Feature Extraction for Malware Family Analysis via Graph Neural Network. arXiv 2022, arXiv:2208.05476.
- Hekmati, A.; Grippo, E.; Krishnamachari, B. Large-scale Urban IoT Activity Data for DDoS Attack Emulation. In Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems, Coimbra, Portugal, 15–17 November 2021; pp. 560–564.
- Charlier, J.; Singh, A.; Ormazabal, G.; State, R.; Schulzrinne, H. SynGAN: Towards generating synthetic network attacks using GANs. arXiv 2019, arXiv:1908.09899.
- Arnaboldi, L.; Morisset, C. Generating synthetic data for real world detection of DoS attacks in the IoT. In Proceedings of the Software Technologies: Applications and Foundations, Toulouse, France, 25–29 June 2018; pp. 130–145.
- Hernandez-Suarez, A.; Sanchez-Perez, G.; Toscano-Medina, L.K.; Olivares-Mercado, J.; Portillo-Portilo, J.; Avalos, J.G.; García Villalba, L.J. Detecting Cryptojacking Web Threats: An Approach with Autoencoders and Deep Dense Neural Networks. Appl. Sci. 2022, 12, 3234.
- Liu, M.; Mroueh, Y.; Ross, J.; Zhang, W.; Cui, X.; Das, P.; Yang, T. Towards better understanding of adaptive gradient algorithms in generative adversarial nets. arXiv 2019, arXiv:1912.11940.
Algorithm | Advantages | Disadvantages |
---|---|---|
Oversampling | No loss of dataset integrity. | Exact replicates of minority-class samples are created or, alternatively, samples of the majority class are removed. Either way, ML algorithms may lose their ability to generalize, leading to under- or over-fitting. |
Categorical latent Gaussian process | Gaussian processes are flexible, adaptive and easy to manipulate for the generation of new samples. | Phenomena of low or no dispersion can be observed, as they use all samples and features to predict new synthetic samples. In addition, this method may present defective samples when the set has a high dimension. |
Multiple embedding | High-dimensional samples are projected onto lower-dimensional samples, producing a new replication with compositionally rich synthetic content. | Samples from different contexts can be represented as one, removing particular and heterogeneous properties, leading to poor generalization. |
Generative Adversarial Networks (GAN) | Its major advantage is that a GAN can obtain a latent representation of the original samples and build a new, augmented and modified version according to its distribution. | A large number of continuous samples are needed to generate synthetic outputs, which increases the complexity of the model. |
Data augmentation | It generates new points artificially from the existing data, increasing the amount of information in the sample. Its main advantage is that it reduces the effort of data collection and labeling. | It is difficult to provide the necessary augmentation; moreover, if the dataset is biased, the augmented data will be biased as well. |
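
The replication risk listed for oversampling can be seen in a minimal sketch. The helper `random_oversample` and the toy data are illustrative assumptions, not part of the proposed framework; the function simply copies minority-class samples until every class matches the largest one.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw exact copies (with replacement) to fill the gap to the majority size.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

# Toy dataset (hypothetical): four benign vectors and a single malware vector.
samples = [[0.1], [0.2], [0.3], [0.4], [9.0]]
labels = ["benign", "benign", "benign", "benign", "malware"]

balanced_x, balanced_y = random_oversample(samples, labels)
class_counts = Counter(balanced_y)
distinct_malware = {tuple(x) for x, y in zip(balanced_x, balanced_y) if y == "malware"}
```

After balancing, both classes hold four samples, but the malware class still contains only one distinct vector, so a classifier trained on it sees no new behavior.
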
Group | Feature | Description | Type |
---|---|---|---|
1 | General information | Encompasses the file characteristics, such as file size, number of imported and exported functions, debugging section, resources, relocations, signatures and number of symbols. | Object |
2 | PE Header information | Includes the timestamp, target machine and a series of text strings representing the list of read-only data sections. From the optional header, the target subsystem, DLL library imports, the magic number of the file in text format, the major and minor image version, linker versions, system and subsystem versions, code size and headers are depicted. | Object |
3 | Imported functions | The import address table is parsed and the list of imported functions for each library is reported. To create a useful feature for models, a set of 256 unique libraries and a set of 1024 unique functions are used, both as import sequences. | String |
4 | Exported functions | The features include a list of exported functions which are represented within the object by a 128-binary hash. | String |
5 | Section information | This group reports the properties of each section of the PE file, including the name, size, entropy, virtual size and a list of text strings that represent the characteristics of the section. | Object |
6 | Byte histogram | This group covers 256 integer values, which represent the count of each byte contained in the file. | Integer |
7 | Histogram of entropy bytes | To represent the entropy of the file, the histogram approximates the joint probability distribution of the entropy H and the byte values b. | Float |
8 | String information | Statistical information over printable text strings. | Float |
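
Groups 6 and 7 can be illustrated with a short sketch, assuming the file is available as a raw byte string. The function names are hypothetical, and the entropy is computed here over the whole buffer rather than as the histogram of entropy values described in group 7, so this is a simplification.

```python
import math

def byte_histogram(data):
    """Return 256 integer counts, one per possible byte value."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return counts

def shannon_entropy(data):
    """Shannon entropy in bits per byte (0.0 for an empty buffer)."""
    if not data:
        return 0.0
    n = len(data)
    # Only non-zero counts contribute, since p * log2(p) tends to 0 as p tends to 0.
    return -sum((c / n) * math.log2(c / n) for c in byte_histogram(data) if c)

# Toy buffers (not real PE files): a short header-like string, a uniform
# buffer with every byte value once, and a constant buffer.
header_like = b"MZ\x00\x00"
uniform = bytes(range(256))
constant = b"\x00" * 16

hist = byte_histogram(header_like)
```

A buffer that uses every byte value equally often reaches the maximum of 8 bits per byte, while a constant buffer has zero entropy.
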
Properties | Description | Type |
---|---|---|
Network ports | It is important to mention that in DoS and Distributed-DoS cyber-attacks there is a certain degree of randomness in the target ports used by the attacker, mainly in the TCP protocol and some others specific to the application layer. A valid variety of ports allows the realism of a synthetic flow to be checked. | Integer |
Variety of IP addresses | In a DoS cyber-attack, especially a distributed one, there must be a variety of connections from different source IP addresses. | String |
Time to live (TTL) | The lifetime of a network packet varies, depending on the metrics of the different network devices, where the attack fluctuates. | Float |
Maximum Segment Size (MSS) | It is the distribution of the segments in the capture file and is related to the structure and sequence of the attack. | Float |
Window Size | It allows measuring the behavior of packets in relation to the amount of information that a device can receive in a time series. | Integer |
Payloads | In attacks targeting the application layer, large payload volumes can be observed as long requests directed to specific ports. These can be schematized as sequences that can be transformed according to their content and volume. | String |
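
A minimal sketch of how the variety of source addresses and target ports in a synthetic flow might be checked. The packet dictionaries, their keys (`src`, `dport`) and the helper `flow_diversity` are assumptions made for illustration; the addresses are taken from ranges reserved for documentation.

```python
import ipaddress

def flow_diversity(packets):
    """Summarize how varied the sources and target ports of a flow are."""
    sources, ports = set(), set()
    for pkt in packets:
        # ip_address raises ValueError on a malformed address.
        sources.add(ipaddress.ip_address(pkt["src"]))
        if not 0 <= pkt["dport"] <= 65535:
            raise ValueError(f"invalid port: {pkt['dport']}")
        ports.add(pkt["dport"])
    return {
        "unique_sources": len(sources),
        "unique_ports": len(ports),
        "source_ratio": len(sources) / len(packets),
    }

# Hypothetical synthetic flow with four packets.
synthetic_flow = [
    {"src": "192.0.2.10", "dport": 80},
    {"src": "192.0.2.11", "dport": 443},
    {"src": "198.51.100.7", "dport": 80},
    {"src": "192.0.2.10", "dport": 8080},
]
summary = flow_diversity(synthetic_flow)
```

A low `source_ratio` on a flow that is meant to be distributed would suggest that the synthetic sample lacks the expected variety of origins.
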
Feature | Property | Type |
---|---|---|
Source Address | Variety of IP addresses | String |
Origin Protocol | Origin protocol number | String |
Destination Protocol | Destination protocol number | String |
Destination Address | Variety of IP addresses | String |
Packet ID | TTL | String |
Source Node | Variety of IP addresses | String |
Destination Node | Variety of IP addresses | String |
Packet Size | MSS | String |
Sequential Number | Window Size | String |
Number of Packets | Window Size | String |
Number of bytes | Window Size | String |
Packet in | TTL | String |
Packet out | TTL | String |
Packet Transmission | TTL | String |
Packet delay note | TTL | String |
Packet Rate | Window Size | String |
Byte rate | Window Size | String |
Pkt Avg Size | Window Size | String |
Utilization | Payloads | String |
Packet Delay | MSS | String |
Packet send time | MSS | String |
Packet reserved time | MSS | String |
The first packet Sent | TTL | String |
Last packet reserved | TTL | String |
Feature | Description | Type |
---|---|---|
Forward packet length mean | Mean packet size in the forward direction | Float |
Backward inter-arrival time total | Total time between two packets sent in the backward direction | Float |
Backward inter-arrival time standard deviation | Standard deviation of the time between two packets sent in the backward direction | Float |
Forward push flags | Number of times the push flag was set in packets travelling in the forward direction | Float |
Minimum forward segment size | Minimum segment size observed in the forward direction | Float |
Forward packet length standard deviation | Standard deviation of the packet size in the forward direction | Float |
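
The flow features above can be sketched on a toy packet list. The record layout (`time`, `direction`, `length`, `psh`), the function names and the use of the population standard deviation are assumptions for illustration, not the feature extractor used in this work.

```python
from statistics import mean, pstdev

def forward_features(packets):
    """Packet-length mean and standard deviation, plus PSH count, forward direction only."""
    fwd = [p for p in packets if p["direction"] == "fwd"]
    lengths = [p["length"] for p in fwd]
    return {
        "fwd_len_mean": mean(lengths),
        "fwd_len_std": pstdev(lengths),
        "fwd_psh_flags": sum(1 for p in fwd if p["psh"]),
    }

def backward_iat_features(packets):
    """Total and standard deviation of inter-arrival times, backward direction only."""
    times = sorted(p["time"] for p in packets if p["direction"] == "bwd")
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {"bwd_iat_total": sum(gaps), "bwd_iat_std": pstdev(gaps)}

# Hypothetical flow: three forward and three backward packets.
flow = [
    {"time": 0.0, "direction": "fwd", "length": 100, "psh": False},
    {"time": 0.5, "direction": "bwd", "length": 60, "psh": False},
    {"time": 1.0, "direction": "fwd", "length": 200, "psh": True},
    {"time": 1.5, "direction": "bwd", "length": 60, "psh": False},
    {"time": 2.0, "direction": "fwd", "length": 300, "psh": True},
    {"time": 3.5, "direction": "bwd", "length": 60, "psh": False},
]
fwd = forward_features(flow)
bwd = backward_iat_features(flow)
```

Both helpers expect at least one forward packet and at least two backward packets; a production extractor would need to handle shorter flows explicitly.
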
Synthetic Sample | Detection Radius |
---|---|
Malware | 7801 out of 8000 (97.51%) samples detected by the VirusTotal sensor. |
DDoS | 32,120 out of 50,000 (64.24%) samples detected by CloudShark rules. |
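
As a worked check of the counts in the table above, the detection rate is the detected count divided by the total, expressed as a percentage. The helper name `detection_rate` is hypothetical.

```python
def detection_rate(detected, total):
    """Share of samples flagged by the detector, as a percentage."""
    return 100.0 * detected / total

malware_rate = detection_rate(7801, 8000)   # VirusTotal counts from the table
ddos_rate = detection_rate(32120, 50000)    # CloudShark counts from the table
```

Rounded to two decimals, these counts give 97.51% for the malware samples and 64.24% for the DDoS samples.
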
Scope | Algorithm | Description | Type of Mutation |
---|---|---|---|
Malware | GAN [67] | A GAN with a black-box detector is proposed. The samples are modified by changing the probability distribution of API32 calls so that the SL algorithm misclassifies the sample and the detector is bypassed, demonstrating that synthetic results with a high degree of obfuscation can be produced. | Modification to Windows API32 |
Malware | DCGAN [68] | Samples of various malware families are converted into 32x32-dimensional gray-scale images. The Deep-Convolutional-GAN network (DCGAN) uses a generator that modifies the original image, adding noise elements in the distribution and using a discriminator to determine whether the modified image is malware or not. It is shown that several malware synthetic samples can be generated by bypassing the discriminator. | Transformation of samples to images and modification of pixel distribution. |
Malware | Fuzzy-SMOTE [69] | Different samples are analyzed, mainly from the Android operating system, representing vectorized values of SA, DA and risk lists. Synthetic samples are generated by oversampling minority classes in a fuzzy region, to maximize the degree of membership in the class in question. | Oversampling from minority to majority class. |
Malware | MDGAN [70] | A Multifaceted-Deep-GAN (MDGAN) is used to apply a Gaussian random distribution to samples containing values from the header of a malware binary in PE format, which are then concatenated with sequences from the operating system APIs. The results demonstrate that it is possible to generate features that the discriminator will evaluate as effective malware. | The distribution of the result of merging characteristics is modified. |
Malware | Markov Decision Model (MDM) + Attention Aware Graph Neural Network (AWGCN) [71] | The sequences of API calls are modified using Markov chains and then randomly distributed without replacement. It is shown that it is possible to intervene in sequence calling and generate new samples with sequential distributions similar to those of an original malware binary. | The order of the malware binary API sequences in the operating system. |
DoS cyber-attacks | Statistical Learning [60] | Descriptive statistical data are obtained as a function of host, protocol, conversation and specific fields of the network flow. PCAP file information is mutated and copied, inferring which values will be closest to a real sample in relation to previously calculated values and maintaining a certain degree of entropy. | The network flow file in PCAP format is modified. |
DoS cyber-attacks | Statistical learning and simulation [72] | A simulated environment is generated using specific Internet of Things (IoT) software and statistical data are calculated in the time windows of the attack rerun: the start time and the duration of the attack, and the percentage of the nodes that go under stress. The values are incorporated into a tabular set that is validated by a Neural Network. | Statistical values of a set already constructed. |
DoS cyber-attacks | GP-WGANs [73] | The random uniform distribution of different sets in PCAP format is measured using a Gradient Penalty Wasserstein GAN network (GP-WGAN), so that the synthetic samples resemble the real ones. The generator is in charge of executing the probabilistic changes and a discriminator evaluates the quality of the new synthetic sample. This project mainly focuses on application layer attacks. | Data distribution in PCAP files. |
DoS cyber-attacks | MDM + Probabilistic Symbolic Model Checker (PRISM) [74] | It focuses on simulating the steps to synthetically reproduce a DDoS attack on an IoT sensor network, using the transactional abstractions of the MDM. PRISM makes it possible to calculate the probability of sensor battery drain, specifically in application layer attacks, and thus to generate data that evaluate the intensity of a volumetric attack. | Attack sequences and probability of battery drainage. |
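
The idea of resampling API call sequences with a Markov chain, mentioned for [71] above, can be sketched as follows. This is a generic first-order chain written for illustration, not the method of [71] or of this work; the function names are hypothetical and the Win32 call names are only example tokens.

```python
import random
from collections import defaultdict

def fit_transitions(sequence):
    """Record, for each call, the calls that were observed to follow it."""
    transitions = defaultdict(list)
    for current, following in zip(sequence, sequence[1:]):
        transitions[current].append(following)
    return dict(transitions)

def sample_sequence(transitions, start, length, seed=0):
    """Walk the chain from `start`, stopping early at a call with no successor."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length and out[-1] in transitions:
        out.append(rng.choice(transitions[out[-1]]))
    return out

# Example trace (assumed): a file being opened, read, written and closed.
observed = ["CreateFileW", "ReadFile", "WriteFile", "ReadFile", "CloseHandle"]
transitions = fit_transitions(observed)
synthetic = sample_sequence(transitions, start="CreateFileW", length=8)
```

Every adjacent pair in `synthetic` also occurs in `observed`, so the new sequence keeps the local ordering of the original while its overall length and composition may differ.
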
Algorithm | Configuration |
---|---|
RL + MLP | Hidden layer sizes: |
RL + MLP | Activation function: sigmoid |
RL + MLP | Weight optimization solver: Stochastic Gradient Descent |
RL + RF | Attribute selection method: GINI |
RL + RF | Number of features to consider for best split: 2 |
RL + RF | Minimum number of samples required to be at leaf node: 1 |
RL + RF | Minimum number of samples required to split internal nodes: 1 |
RL + RF | Maximum depth of the tree: 3 |
RL + RF | Minimum number of trees in forest: 3 |
RL + DT | Number of estimators (trees): 100 |
RL + DT | Maximum number of features in each estimator: 3 |
RL + DT | Maximum depth of the tree: 3 |
RL + LR | Inverse of regularization strength of term: |
RL + LR | Norm selected to regularize the cost function: |
RL + LR | Optimization algorithm: LBFGS |
RL + SVM | Penalty parameter C of error term: 10 |
RL + SVM | Type of division: One-vs-one |
RL + SVM | Kernel type: linear |
RL + GB | Loss function to be optimized: log-loss |
RL + GB | Number of estimators: 100 |
RL + GB | Criterion to measure the quality of a split: Friedman Minimum-Square-Error |
RL + GB | Minimum number of samples required to split internal nodes: 2 |
RL + GB | Minimum number of samples required to be at leaf node: 1 |
RL + ResNet-18 | 1D convolution layer L with 64 filters, a kernel size of 3 units and ReLU as the activation function |
RL + ResNet-18 | convolution layer |
RL + ResNet-18 | convolution layer |
RL + ResNet-18 | convolution layer |
RL + ResNet-18 | convolution layer |
RL + ResNet-18 | convolution layer |
RL + ResNet-18 | The output layer with a Sigmoid activation function |
RL + FFNN | 1D Dense input layer L with 9 units and ReLU as the activation function |
RL + FFNN | 1 Hidden layer |
RL + FFNN | 1 Hidden layer |
RL + FFNN | The output layer with a Sigmoid activation function |
Scope | Algorithm | Precision |
---|---|---|
Malware | GAN + MLP [67] | % |
Malware | GAN + DT [67] | % |
Malware | GAN + LR [67] | % |
Malware | GAN + SVM [67] | % |
Malware | GAN + RF [67] | % |
Malware | DCGAN + ResNet-18 [68] | % |
Malware | Fuzzy-SMOTE + SVM [69] | % |
Malware | MDGAN [70] | % |
Malware | MDM + One-Hot-Encoding + SVM [71] | % |
Malware | MDM + One-Hot-Encoding + LR [71] | % |
Malware | MDM + One-Hot-Encoding + RF [71] | % |
Malware | MDM + AWGCN + SVM [71] | % |
Malware | MDM + AWGCN + LR [71] | % |
Malware | MDM + AWGCN + RF [71] | % |
Malware | RL + MLP (this work) | % |
Malware | RL + DT (this work) | % |
Malware | RL + LR (this work) | % |
Malware | RL + SVM (this work) | % |
Malware | RL + RF (this work) | % |
Malware | RL + ResNet-18 (this work) | % |
DoS cyber-attacks | Statistical learning + FFNN [72] | % |
DoS cyber-attacks | GP-WGANs + GB [73] | AUC = |
DoS cyber-attacks | Statistical Learning [60] | - |
DoS cyber-attacks | MDM + PRISM + MLP [74] | % |
DoS cyber-attacks | MDM + PRISM + DT [74] | % |
DoS cyber-attacks | RL + MLP (this work) | % |
DoS cyber-attacks | RL + DT (this work) | % |
DoS cyber-attacks | RL + GB (this work) | AUC = |
DoS cyber-attacks | RL + FFNN (this work) | % |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hernandez-Suarez, A.; Sanchez-Perez, G.; Toscano-Medina, L.K.; Perez-Meana, H.; Olivares-Mercado, J.; Portillo-Portillo, J.; Benitez-Garcia, G.; Sandoval Orozco, A.L.; García Villalba, L.J. ReinforSec: An Automatic Generator of Synthetic Malware Samples and Denial-of-Service Attacks through Reinforcement Learning. Sensors 2023, 23, 1231. https://doi.org/10.3390/s23031231