Review

Applications of Machine Learning in Cyber Security: A Review

by
Ioannis J. Vourganas
and
Anna Lito Michala
*,†
Netrity Ltd., Glasgow G2 1BP, UK
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
J. Cybersecur. Priv. 2024, 4(4), 972-992; https://doi.org/10.3390/jcp4040045
Submission received: 29 September 2024 / Revised: 3 November 2024 / Accepted: 8 November 2024 / Published: 17 November 2024

Abstract:
In recent years, Machine Learning (ML) and Artificial Intelligence (AI) have been gaining ground in Cyber Security (CS) research in an attempt to counter increasingly sophisticated attacks. However, this paper poses the question of the availability of qualitative and quantitative data. This paper argues that scholarly research in this domain is severely impacted by the quality and quantity of available data. Datasets are disparate. There is no uniformity in (i) the dataset features, (ii) the methods of collection, or (iii) the preprocessing requirements to enable good-quality analyzed data that are suitable for automated decision-making. This review contributes to the existing literature by providing a single summary of the wider field in relation to AI, evaluating the most recent datasets, combining considerations of ethical AI, and posing a list of open research questions to guide future research endeavors. Thus, this paper contributes valuable insights to the cyber security field, fostering advancements for the application of AI/ML.

1. Introduction

In the realm of cyber security, achieving a robust defense against cyber threats necessitates a multifaceted approach that intertwines technological innovations, organizational policies, and individual user behaviors. Given the dynamic nature of cyber threats, it is imperative to incorporate the latest developments and insights within the domain. The foundational principles of cyber security are encapsulated by the CIA triad, namely the Confidentiality, Integrity, and Availability (CIA) of information [1,2,3]. Contemporary strategies in cyber security further underscore the significance of leveraging advanced technologies, adhering to regulatory mandates, fostering user awareness, and implementing proactive measures for threat detection and mitigation. In recent years, ML and AI have been gaining ground in CS research in an attempt to counter increasingly sophisticated attacks. The motivations for the use of AI/ML vary. Research demonstrates improved breach prevention and attack detection rates [4]. Researchers disagree on the rates, and various numbers from 30 to 75% have been cited in the wider literature. However, ML and AI, being data-driven sciences, rely heavily on the availability of qualitative and quantitative data. This poses the challenge of suitable data availability. In recent years, many datasets have been proposed to address this. This paper poses the question of the suitability of existing datasets. It remains to be evaluated if the available data are in an appropriate format and representation and if they are representative of the intended use case. This paper argues that scholarly research in this domain is severely impacted by the quality and quantity of available data. To attempt to address this argument, a narrative review is performed, and the methodology is presented in Section 2. In Section 3, we discuss the different questions that CS attempts to address with AI/ML. In Section 4, we demonstrate that the datasets are disparate.
There is no uniformity in (i) the dataset features, (ii) the methods of collection, or (iii) the preprocessing requirements. In Section 5, we focus on existing efforts in one specific subdomain of CS, namely intrusion detection systems. We expand on their limitations in Section 6 to showcase the arguments presented in earlier sections regarding datasets and their appropriateness. Finally, in Section 7, this paper contributes valuable insights to the cyber security field, fostering advancements for the application of AI/ML by proposing open research questions.

2. Review Method

This paper follows the methodology of a narrative literature review, providing a structured synthesis of key developments, methodologies, and applications within this rapidly evolving field. First, this review focuses on studies that evaluate the suitability of AI as a tool to address the challenges CS is facing and attempts to categorize those challenges. To address this, a targeted search of peer-reviewed journals and technical reports was aimed at sources published within the last twenty years to allow for a wide breadth. Exceptionally, when fundamental concepts are discussed, older studies are cited. Studies published in the last 5 years were considered more relevant when conflicting statements were found. The studies are categorized by themes, such as intrusion detection systems, phishing prevention, and anomaly detection in network traffic, which are prominent application areas for AI in cyber security. This categorization is inspired by and extends an earlier review presented in [5]. To address the limitations and biases in methodology, only studies with large datasets and robust validation metrics are included, ensuring a high standard of research quality.
In total, 101 studies were selected, of which 99 are cited here; 29 of these do not reference ML/AI in their methodology sections. Two studies were excluded because they simply re-iterated similar findings. The 29 studies discuss the broader research domain, present datasets, or discuss research directions in CS that cannot benefit from ML/AI. However, they were still important to the review in positioning the utility of ML/AI in CS and categorizing CS research efforts and attacks.

3. The Cyber Security Research Landscape

To begin to examine the utility of AI/ML in CS, it is important to first examine the research landscape. These are the main domains in which CS research is advancing, organized in order of relevance to AI/ML.
  • Advanced Threat Detection and Response: New methods of identification of patterns and anomalies, indicative of potential security threats, enhance the capabilities for detecting and responding to threats [5].
  • IoT security: The quick integration of Internet of Things (IoT) devices into the digital ecosystem necessitates secure frameworks to mitigate inherent vulnerabilities [6].
  • Regulatory compliance and standards: Compliance with cyber security norms and regulations (e.g., General Data Protection Regulation (GDPR)) is paramount in practices and software [7,8].
  • The role of encryption: The evolution of encryption technologies, including the development of quantum-resistant algorithms, is critical [9].
  • Blockchain for security: The blockchain’s decentralized framework ensures the integrity and transparency of transactions and data exchanges in cyber security infrastructures [10], as well as identity management and secure communication.
  • Cyber security awareness and training: This is vital for reducing the susceptibility to human errors, which cause a significant proportion of security breaches [11].
  • Incident response planning: This is characterized by well-defined protocols and responsibilities, demonstrating preparedness and resilience [12].
By nature, some domains are more closely related to AI/ML by virtue of the availability of data (items enumerated at the top), while others necessitate human interaction and are thus unaffected (items at the bottom). Regulatory compliance affects AI/ML applications more intrinsically, and this is further discussed in the remainder of this paper. The following section further splits the AI/ML-relevant research domains by type of cyber security.

3.1. Types of Cyber Security

One categorization of the research landscape may be achieved based on the types of cyber security, as initially presented in [5] and prioritized with respect to AI/ML, revised, and expanded in Table 1.

3.1.1. Infrastructure and Network Security

Discussed together, infrastructure and network security constitute the cornerstone of an effective cyber security framework. This section is an examination of components, methodologies, and interconnectedness, with a particular emphasis on the detection of anomalies within network traffic. Infrastructure security is dedicated to the safeguarding of an organization’s vital physical and digital assets, preventing operational disruptions, data breaches, and damage to assets [13]. These encompass data centers, servers, network devices, and ancillary power networks. Prominent strategies entail the following:
  • Physical security measures: biometric access, deployment of security personnel, and surveillance systems preventing unauthorized physical access [14].
  • Virtual protection mechanisms: sophisticated intrusion detection systems, routine security assessments to identify and rectify vulnerabilities, and the maintenance of current software and hardware [15].
  • Redundancy and resilience: backup systems and alternative data routes ensure service continuity [13].
Network security focuses on the protection of data in transit and the infrastructure of the network from unauthorized access, attacks, or data exfiltration, comprising the following:
  • Firewalls: a defense line between secure/internal and potentially unsafe/external networks [16].
  • Intrusion Detection and Prevention Systems (IDPS): monitor network traffic to identify and automatically mitigate potential threats [17,18].
  • Two-Factor Authentication (2FA): two forms of user identification prior to granting network access [19].
  • Remote access management: restricting network access to authorized personnel, e.g., through a Virtual Private Network (VPN).
The surveillance of network traffic is instrumental for the early detection of irregularities, which may be indicative of a security threat, emanating either internally (e.g., from a compromised laptop) or externally (for instance, from an external Domain Name System (DNS) server). This prevents proliferation deeper into the network or adverse impacts on essential infrastructure. The deployment of advanced tools and technologies, notably ML, plays a vital role in the efficacious detection of anomalies within network traffic. Such tools are indispensable for safeguarding against intricate threats, including advanced persistent threats, zero-day exploits, and other sophisticated attacks that may elude conventional security measures [18].
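A minimal sketch can illustrate this kind of anomaly detection in network traffic. It uses scikit-learn's IsolationForest on hypothetical flow features; the feature choices (bytes per second, packets per second), magnitudes, and contamination rate are illustrative assumptions, not drawn from any of the reviewed datasets:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Simulated flow features: [bytes_per_second, packets_per_second]
normal = rng.normal(loc=[500.0, 40.0], scale=[50.0, 5.0], size=(500, 2))
burst = rng.normal(loc=[50000.0, 4000.0], scale=[5000.0, 400.0], size=(5, 2))

# Fit only on normal traffic; anomalies are isolated with few random splits
model = IsolationForest(contamination=0.01, random_state=0)
model.fit(normal)

# predict() returns +1 for inliers (normal traffic) and -1 for anomalies
flagged = model.predict(burst)
print(flagged)
```

An unsupervised model of this kind requires no attack labels, which is one reason unlabeled captures remain useful for exploratory work.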

3.1.2. Application, Information Security, and Human Factors

The sanctity of sensitive data processed by applications, increasingly accessible over networks and the internet, and the surge in cyber attacks exploiting application vulnerabilities underscore the imperative of securing the application layer. Measures implemented during the development phase of applications aim to address these challenges [20]. With the rise of natural language processing technologies, programmers are already relying on machine-generated code. Thus, the security of machine-generated code might prove to be a future point of research interest for AI in CS.
Information security encompasses both digital and physical data, employing practices ranging from encryption and two-factor authentication to physical security measures, ensuring comprehensive data protection [21]. This area is predominantly targeted by regulations such as GDPR. However, it is not the main focus of AI in CS.
Cultivating a security-aware culture empowers employees to serve as a primary defense line against cyber threats [22]. Once again, this type is predominantly focused on the social aspects and regulation and thus is not directly related to the application of AI or ML. However, some tools focus on automated screening of information before reaching humans (e.g., email screening), which could involve advanced AI.

3.2. Cyber Security Attack Types

The earlier section established that the infrastructure and network types are the main focus of AI/ML solutions. To evaluate the suitability of AI and ML methods for CS in infrastructure and network security, as well as to be able to evaluate the quality and stability of datasets, a review of the known attack types is required. The types of attacks were categorized initially in [5] and are reviewed here in light of ML/AI applicability. A summative assessment of the reviewed attack types is presented in Table 2.

3.2.1. Malware

Malicious software (malware) emerges as one of the most formidable threats, significantly evolving in its complexity and sophistication [23]. Malware diversifies into viruses, worms, trojans, and ransomware (Figure 1). Polymorphic and metamorphic malware variants pose significant challenges to conventional detection methodologies [24]. ML and AI have shown promise in malware detection, for example, through the use of deep learning models [25].

3.2.2. Ransomware

Ransomware is characterized by its capacity to encrypt data, rendering files inaccessible, and to demand a ransom or blackmail victims [26,27] (Figure 2). The utilization of advanced and smart anti-malware solutions [28] may be an effective mitigation [29].

3.2.3. Phishing

Phishing attacks (Figure 3) exploit the trust of unsuspecting users to acquire sensitive information [30,31], resulting in identity theft, financial detriment, and unauthorized system access. ML/AI is being intensively investigated as an approach for the detection of phishing attacks, leveraging machine learning and neural network techniques to discern and mitigate these threats with enhanced efficacy [32,33,34]. These methodologies center on scrutinizing email content, website attributes, and user behavioral patterns. As phishing attacks proliferate in frequency and sophistication, the concerted integration of technology, education [35], and preemptive defensive measures [36] assumes heightened significance.
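As a hedged illustration of the email-content scrutiny described above, the following sketch trains a toy TF-IDF and logistic-regression classifier on a handful of invented messages; real phishing detectors use far larger corpora and much richer features (headers, URLs, behavioral signals):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled corpus (invented): 1 = phishing, 0 = legitimate
emails = [
    "urgent verify your account password immediately click here",
    "your invoice for last month is attached as discussed",
    "confirm your bank details now or your account will be suspended",
    "meeting moved to 3pm tomorrow see agenda attached",
    "you have won a prize click this link to claim your reward",
    "quarterly report draft ready for your review",
]
labels = [1, 0, 1, 0, 1, 0]

# TF-IDF turns each email into a weighted bag of words for the classifier
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(emails, labels)

print(clf.predict(["click here to verify your password urgently"]))
```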

3.2.4. DDoS

A Distributed Denial of Service (DDoS) attack disrupts the flow of traffic to a targeted server, service, or network with an overwhelming surge of internet traffic (Figure 4). The proliferation of internet-connected devices has expanded the attack surface [37] and introduced the utilization of larger botnets [38]. Mitigation strategies involve early detection through traffic analysis and intrusion detection systems (IDS), including ML/AI-based approaches [39,40].

3.2.5. SQL Injections

SQL injections target vulnerabilities in applications interacting with databases through malicious SQL statements embedded into queries (Figure 5). This attack is particularly relevant to web applications [23]. Advanced exploitation techniques include time-based blind SQL injection and automated injection tools, necessitating the deployment of more sophisticated detection and prevention methodologies [41]. Detection methods include ML/AI approaches for dynamic and static analysis of code [41,42].
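The standard defense against such injections, parameterized queries, can be demonstrated with Python's built-in sqlite3 module; the table, payload, and data below are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "alice' OR '1'='1"  # classic injection payload

# Vulnerable: string concatenation lets the payload alter the query logic
vulnerable = conn.execute(
    "SELECT role FROM users WHERE name = '" + user_input + "'"
).fetchall()

# Safe: a parameterized query treats the payload as a literal value
safe = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchall()

print(vulnerable)  # [('admin',)] -- the injection succeeded
print(safe)        # [] -- the payload matched nothing
```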

3.2.6. Zero-Day Exploit

A zero-day exploit denotes a cyber attack exploiting a security vulnerability discovered in software on the very day of its detection, yet unknown to software vendors or the public (Figure 6). The early detection of zero-day exploits poses challenges due to the absence of discernible signatures or patterns for identification. Research has explored the application of machine learning and anomaly detection methodologies to prognosticate and identify such vulnerabilities before exploitation [43] and deploy smarter intrusion detection systems [44].

3.2.7. DNS Tunneling

DNS tunneling encodes data from other programs or protocols within DNS queries and responses (Figure 7). This method facilitates the circumvention of network security measures [45]. These attacks cause DNS traffic pattern anomalies. Increasingly, ML/AI models are being leveraged for efficient detection in tailored network security systems [46,47].
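One simple illustration of such traffic pattern anomalies is the character entropy of DNS labels, since encoded payloads tend to look random; the threshold below is a hypothetical value, and a production system would derive it from traffic baselines (often combined with query length and frequency features):

```python
import math
from collections import Counter

def shannon_entropy(label: str) -> float:
    """Bits per character of a DNS label; encoded payloads score high."""
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Hypothetical cutoff; real systems learn it from observed traffic
THRESHOLD = 3.5

queries = [
    "mail.example.com",
    "a9f3e1b07c4d2e8f5a6b1c9d0e7f3a2b.tunnel.example.com",  # encoded payload
]

for q in queries:
    subdomain = q.split(".")[0]
    suspicious = shannon_entropy(subdomain) > THRESHOLD
    print(q, "->", "suspicious" if suspicious else "ok")
```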

3.2.8. XSS Attacks

Cross-Site Scripting (XSS) attacks represent a form of injection wherein malevolent scripts are inserted into otherwise benign and trusted websites, exploiting a web application to dispatch malicious code, typically under the guise of a browser-side script [48] (Figure 8). Mitigating XSS attacks necessitates adherence to secure coding practices, validation and sanitization of all inputs, escaping of outputs, and the implementation of content security policies that confine the sources of executable scripts.
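Escaping outputs, one of the mitigations listed above, can be illustrated with Python's standard html module; the payload is an invented example:

```python
import html

# Untrusted input containing a browser-side script payload
user_comment = '<script>alert("xss")</script>'

# Escaping the output neutralizes the markup before it reaches the page:
# <, >, &, and quotes become HTML entities, so browsers render them as text
escaped = html.escape(user_comment)
print(escaped)  # &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```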

3.2.9. Social Engineering

Social engineering constitutes a tactic employed by cyber criminals to manipulate individuals into disclosing confidential information or undertaking actions that may jeopardize security (Figure 9). This stratagem relies on exploiting human psychology rather than technical hacking methodologies [49]. Education and awareness stand as the foremost defense against social engineering attacks [50]. Technological solutions alone prove inadequate [51].

4. Dataset Availability and Assessment

The earlier sections establish that all of the previously mentioned attacks can in part be identified and/or prevented with the use of ML/AI, apart from social engineering. Table 3 shows the prevalence of various AI/ML modeling approaches in the corpus of reviewed studies cited here, excluding the 29 studies that did not have a relevant methodology section. It is important to note that the reported accuracies of 84% to 99% in the studies reference previously known attacks. When unknown attacks are in question, detection rates drop significantly to 3.2–14.7% according to [52]. Additionally, false positives are a well-documented issue. In the domains of ML and AI, the selection of an appropriate dataset is pivotal, necessitating a thorough evaluation across multiple dimensions such as attack diversity, traffic realism, dataset balance and quality, labeled data availability, and the dataset’s size and complexity. Thus, a review of existing datasets and their applicability is necessary for further development in this cross-disciplinary domain.
In the last 20 years, several datasets have been synthesized or recorded to assist in prognostic and diagnostic analytics. DARPA contains a set of intrusion detection data, including LLDOS-1.0 and LLDOS2.0.2, which consist of connections between source and destination IP addresses. Traffic and various attack data are categorized by the MIT Lincoln Laboratory and are used to evaluate attacks and detect intrusions. However, it is considered outdated. CAIDA contains distributed denial of service (DDoS) attack traffic and regular traffic traces, including unspecified traffic from a DDoS attack in 2007. CTU-13 is an extensive dataset that includes botnet traffic captured by a Czech university in 2011 and is useful for malware analysis. It covers a moderate diversity of attacks but offers realistic labeled data. The CTU-13 dataset carves a niche in botnet detection with its focused coverage on botnet traffic, offering a unique perspective and detailed stages of botnet activity. These are beneficial for dynamic threat detection models, despite its narrower breadth of attack types compared to others. MAWI includes data retrieved from Japanese network research institutions and comprises labels that identify traffic deviation. ISCX’12 is produced by the Canadian Institute for Cybersecurity and contains 19 features; 19.11% of the traffic belongs to DDoS attacks. Bot-IoT contains reliable traffic and simulates the Internet of Things in a realistic network environment in the Cyber Range Lab of UNSW Canberra. It contains DDoS, DoS, OS and service scan, keylogging, and data exfiltration attacks, with the DDoS and DoS attacks further organized based on the employed protocol. ISOT’10 contains a mix of malicious and non-malicious data traffic and was created during the Information Security and Object Technology (ISOT) research at the University of Victoria.
Similarly, the modern yet domain-specific SWaT and IoT-23 datasets present network data for SCADA and IoT systems, respectively, making them very niche for general utility.
UNSW-NB15 was created using the IXIA PerfectStorm tool in the Cyber Range Lab of UNSW Canberra and includes a mix of contemporary synthetic attack activities and behaviors. This dataset contains 49 features and 9 attack types. The TCPDUMP tool is executed to simulate and capture 100 GB of traffic, and the ARGUS and Bro-IDS tools are operated with 12 models to generate 49 features in classifying the data. The UNSW-NB15 and CICIDS2017/2018 datasets cater to contemporary cyber security challenges with a high diversity of modern attack types and a very realistic simulation of network traffic, respectively. UNSW-NB15, with its balance between complexity and manageability, presents a rich environment for deep learning models, albeit with potential computational demands. The UNSW-NB15 dataset [53] has nine attack classes: analysis, backdoor, DoS, exploits, fuzzers, generic, reconnaissance, shellcode, and worms, plus a normal class. It was created from 100 GB of normal and modern attack traffic by the Australian Centre for Cyber Security using the IXIA tool. CICIDS2017/2018, distinguished by its comprehensive attack types and very high realism, requires significant computational resources due to its vast size, but it offers extensive labeled data for a wide array of supervised learning applications. Another dataset that stood out was CICDDoS2019 [54], one of the most recently updated. However, it was rejected for being designed only for DDoS attacks, making it unsuitable for testing on a wide variety of attacks.
KDD’99 Cup is a widely utilized dataset that has 41 features for evaluating anomaly detection. Through this dataset, attacks are categorized into four categories, namely probing, remote-to-local (R2L), user-to-root (U2R), and denial of service (DoS). NSL-KDD removes redundant records from KDD’99 and addresses some of its inherent issues, eliminating bias toward frequent records. The NSL-KDD dataset, recognized for its moderate diversity of attacks and improvement over its predecessor KDD’99, offers a balanced and qualitative dataset with reduced redundancy, making it suitable for ML applications without extensive preprocessing. It stands out for its manageable size and labeled data, which support supervised learning and facilitate the training of ML models with relative ease. NSL-KDD is similar to UNSW-NB15 and shares the same strengths. However, it has older traffic and attack types and is known to be more unbalanced [53].
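The class imbalance noted above is easy to quantify. The following sketch uses invented counts that mimic the typical KDD-style skew (U2R and R2L records are rare) to compute per-class proportions and the majority-to-minority ratio:

```python
from collections import Counter

# Hypothetical label column from a KDD'99-style capture; real datasets are
# far larger, but the imbalance pattern (rare U2R/R2L classes) is typical
labels = ["normal"] * 800 + ["dos"] * 150 + ["probe"] * 40 + ["r2l"] * 9 + ["u2r"] * 1

counts = Counter(labels)
total = len(labels)
for cls, n in counts.most_common():
    print(f"{cls:>7}: {n:4d}  ({n / total:.1%})")

# Imbalance ratio between the majority and minority classes
ratio = counts.most_common(1)[0][1] / min(counts.values())
print("majority/minority ratio:", ratio)  # 800.0
```

Ratios of this magnitude are why accuracy alone is a misleading metric for such datasets: a classifier predicting "normal" for everything scores 80% here while detecting nothing.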
Figure 10 showcases two of the most important factors in the comparison of datasets: their age versus the labeled attack categories, along with any subgroups presented. Clearly, 78% of the datasets are 10 or more years old, while there is a trend for newer datasets to provide fewer attack labels with more varied scenarios or sub-groups of individual attack algorithms or methods (e.g., specific malware, specific exploit, etc.). MAWI and SWaT are not labeled and are therefore suited to exploratory anomaly detection using ML/AI algorithms. The remainder are labeled, allowing for classification methods.
As the body of available datasets is growing, it would be expected that the ML/AI approaches would also be evolving. However, one limitation is that all of the aforementioned datasets define their own features, with some directly measured and some derived. It is almost impossible to unite the available datasets into a larger corpus due to the diversity of the features. Also, the derivation methods are not always well defined; in some cases, the originally measured feature is not published as part of the dataset, only its derivative. As a result, any new ML or AI approach needs to identify the dataset most suitable for the problem at hand. Another limitation is, of course, a dataset’s age, as older datasets do not include the latest attacks. Finally, and perhaps most importantly, there is no consensus on which features are indeed relevant for certain attacks or attacks in general.
In this review, we argue, however, that certain characteristics can differentiate datasets and make them more appropriate for use in ML and AI. Suitable datasets should:
  • include audit logs and raw network data;
  • provide a variety of modern attacks;
  • represent realistic and diverse normal traffic;
  • be labeled;
  • comply with ethical AI principles and privacy protocols (e.g., GDPR);
  • be accepted by the scientific community.
An analysis of the most recent and well-used datasets based on these characteristics is presented in Table 4 and Table 5. This was based on the following reasoning. The date of the last update is relevant to the dataset’s ability to represent modern attack types. Whether the data were real or simulated represents how realistic the traffic will be. Its labeled status enables the dataset to be usable by a variety of approaches. Its compliance with AI and GDPR standards is necessary to meet our ethical requirements. Finally, if it is widely scientifically accepted, this implies a higher approval by the community and thus a lower risk of excluding significant features. Additionally, the range of attack types is considered, with UNSW-NB15 having a wide range while being new. Moreover, UNSW-NB15 was shown to be frequently used in recent research papers for these reasons, as also explained in [55,56,57,58]. Such applications can also serve as a point of reference to compare the performance of novel systems.

5. Intrusion Detection System Evaluation

Based on the earlier analysis, ML or AI are mostly used in the process of identifying attacks as they happen in an attempt to prevent them from completing their task or spreading further. Such systems are usually referred to as IDS. Recent systematic reviews on ML in IDS are analyzed in this section [59,60,61,62], followed by research papers published at a later date. The literature reviews largely agree that the most effective way of improving the performance of ML IDS systems is through better feature selection rather than new ML models. This is in line with findings in several other domains where AI has been applied. False positives, moreover, are the root cause of non-acceptance in industrial applications. The primary goal of any new IDS approach should thus be to minimize false positives, which are bottlenecks to the effective use of IDS in networks [60].
In pursuit of this goal, we evaluate the recent research landscape based on the following criteria, derived from recent advancements in ethical AI approaches, which promise to enable such models to become more usable and better regulated:
  • Explainability: addressing the need for IDS system decisions to be easily explainable to the user,
  • Bias: addressing imbalance and multicollinearity, treating outliers effectively, and efforts to mitigate dataset imbalance, ultimately affecting the ability of the model to eliminate false predictions,
  • Robustness: evaluating the model against attacks and normal traffic and analyzing the repeatability of the outcomes,
  • Efficiency: the model’s inference execution time and whether it is reported.
Explainability in ML and IDS is crucial for transparency and trust [55]. Research shows that understanding these models leads to better performance through improved data engineering [59] rather than merely adopting new ML models.
The bias criterion addresses the validity of the IDS system or research results. Factors such as multicollinearity [63], outlier impact [64], and dataset class imbalances [65] can impact the results. Many of these issues are not widely discussed in the IDS literature, raising concerns about the results’ statistical validity. Future research should primarily aim to address these gaps.
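Multicollinearity, one of the bias factors listed above, can be screened with variance inflation factors (VIFs). The following is a minimal NumPy sketch on synthetic data in which one column is a near-duplicate of another; the data and the rule-of-thumb cutoff (VIF well above 10 flags a problem) are illustrative:

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """Variance inflation factor per column: regress each feature on the
    others and compute 1 / (1 - R^2); large values flag multicollinearity."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
# Column 2 is column 0 plus small noise, so the two are nearly collinear
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=200)])

vifs = vif(X)
print(vifs.round(1))  # columns 0 and 2 show very large VIFs; column 1 stays near 1
```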
Robustness relates to diverse and evolving security threats. An IDS should be able to recognize many known threats reliably and be resilient to new threats [52].
Finally, efficiency was incorporated as a criterion due to the real-time demands of many IDS scenarios. Rapid threat detection and response can be pivotal, and the literature review in IDS with ML highlights its need and the current lack of measurement [66].
The analysis of recent literature against our specified criteria is summarized in Table 6. As demonstrated, no published approach addresses all of the criteria, with a significant gap in bias, robustness, and a noticeable lack of efficiency measurements, which hinders the application of these techniques in real-world scenarios.
It should be noted that these papers did not set out to achieve those goals. Primarily, they focus on improved performance, as do the majority of advanced AI or ML publications.
The authors in [52] studied how IDS with ML responds to unknown attacks. While IDS with ML is highly accurate for known attacks, its performance significantly decreases when faced with new attack types. The study looks into strategies to minimize this drop using various models and data splits in training. Their unique analysis indicates that some models are better at predicting unknown attacks, and the effectiveness varies based on the specific unknown attack type. However, according to the authors, understanding why some models may perform better on certain splits or unknown attacks is limited due to limitations in model explainability. The research does not include statistical tests for dataset validity, specifically in terms of feature importance, imbalanced classes, and multicollinearity. This omission can add noise to the study’s findings. Moreover, using a dataset not frequently cited in the literature amplifies this concern. Its approach to dealing with unknown attacks is novel and can contribute to robustness. Their evaluation with regard to false positives moves, to an extent, towards bias reduction. The study, however, does not report processing time results, so it did not meet the efficiency criterion.
The research in [68] focuses on effective feature extraction. The proposed approach does not consider explainability but uses easier-to-explain models such as JRip or KNN. It tries to address bias by using SMOTE to deal with class imbalance, but not in terms of multicollinearity, and it does not attempt to evaluate robustness. While it uses an image representation of network traffic, a laborious task presented as efficient, the paper does not report execution times; thus, efficiency conclusions cannot be drawn.
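SMOTE, as used in [68], generates synthetic minority-class samples by interpolating between a sample and one of its nearest neighbours. The following is a simplified NumPy sketch of that idea; the full algorithm (e.g., as implemented in the imbalanced-learn library) samples among k neighbours rather than always the single nearest one:

```python
import numpy as np

def smote_like(minority: np.ndarray, n_new: int, seed: int = 0) -> np.ndarray:
    """Generate synthetic minority samples by interpolating between a random
    minority sample and its nearest neighbour (a simplified SMOTE sketch)."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        x = minority[i]
        # nearest neighbour other than x itself
        d = np.linalg.norm(minority - x, axis=1)
        d[i] = np.inf
        neighbour = minority[np.argmin(d)]
        # place the synthetic point somewhere on the segment x -> neighbour
        gap = rng.random()
        out.append(x + gap * (neighbour - x))
    return np.array(out)

minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]])
synthetic = smote_like(minority, n_new=5)
print(synthetic.shape)  # (5, 2)
```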
In [55], the authors look at making an explainable AI on deep learning models using a range of local and global explanations, with LIME, SHAP, and rulefit. However, its explainability is limited by treating all attacks as the same, and deep learning may still be too much of a “black box” compared to other approaches.
The authors in [71] created a novel IDS framework which achieved high accuracy. The proposed approach focuses on effective feature selection. They used Boruta feature selection, which permutes many copies of the data to improve the robustness of the selected features. However, the approach would not meet our definition of robustness against unknown attacks. Additionally, the authors deal with bias only partially, taking into account the correlation of different features but not the inherent class imbalance. Finally, the work does not focus on explainability or efficiency.
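The core of Boruta-style selection, comparing real features against shuffled "shadow" copies of themselves, can be sketched as follows on synthetic data. This is a single-pass illustration of the idea, whereas the full algorithm iterates with statistical tests:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 400
informative = rng.normal(size=(n, 2))
noise = rng.normal(size=(n, 2))
y = (informative[:, 0] + informative[:, 1] > 0).astype(int)
X = np.hstack([informative, noise])  # columns 0-1 informative, 2-3 noise

# Boruta's core idea: append shuffled "shadow" copies of every feature and
# keep only real features whose importance exceeds the best shadow feature
shadows = rng.permuted(X, axis=0)  # shuffle each column independently
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(np.hstack([X, shadows]), y)

real_imp = forest.feature_importances_[: X.shape[1]]
shadow_max = forest.feature_importances_[X.shape[1]:].max()
selected = np.where(real_imp > shadow_max)[0]
print(selected)  # typically the two informative columns
```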
The use of Convolutional Neural Networks (CNNs) in IDS is a growing research area [70,76,77]. CNNs can process significant amounts of data in parallel, enabling efficient real-time analysis of network traffic [78]. This ability to process large amounts of data means costly feature engineering is not conducted. However, they are currently limited in practical use due to computational requirements [76].
In [74], the team worked towards explainability with SHAP local and global explanations. The authors used a range of more interpretable ML models, such as naive Bayes, linear SVM, RBF SVM, decision trees, quadratic discriminant analysis, and logistic regression, as well as less interpretable models such as deep neural networks (DNNs). They concluded that the use of SHAP on the less interpretable models would not give adequate explainability. They attempted to address bias by removing missing values and encoding categorical data as numerical. Further, they removed features that contained only one unique value, had a variance below 0.1, or had a Pearson correlation above 0.9 with another feature. Additionally, they performed recursive feature elimination and sequential feature selection. The data were scaled to the range 0–1, except for the DL methods, for which they were standardized by removing the mean and scaling to unit variance. Similarly, a range of AI/ML approaches was used in [60], including CNNs, DNNs, autoencoders, logistic regression, random forest, and XGBoost.
Both works conclude that in order to achieve better performance, feature selection is the best approach. This finding also aligns with the observations of wider literature reviews [59]. The literature is now starting to focus on frameworks that involve feature selection as part of their design. However, those lack statistical certainty, as they most often do not test for dependencies within the dataset.
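The screening steps reported in [74], dropping constant features, low-variance features, and one of each highly correlated pair, can be sketched as follows. The thresholds mirror those reported, while the column names and values are illustrative:

```python
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denom = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return cov / denom if denom else 0.0

def filter_features(columns, var_min=0.1, corr_max=0.9):
    """Drop constant and low-variance columns, then one of each highly
    correlated pair (keeping the first seen). columns: {name: [values]}."""
    kept = [n for n, v in columns.items()
            if len(set(v)) > 1 and variance(v) >= var_min]
    selected = []
    for name in kept:
        if all(abs(pearson(columns[name], columns[s])) < corr_max for s in selected):
            selected.append(name)
    return selected

cols = {
    "const":  [1.0, 1.0, 1.0, 1.0],  # single unique value -> dropped
    "bytes":  [1.0, 2.0, 3.0, 4.0],
    "dup":    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with "bytes" -> dropped
    "jitter": [0.0, 1.0, 0.0, 1.0],
}
print(filter_features(cols))
```

Keeping "the first seen" of a correlated pair is one arbitrary but common tie-breaking choice; a domain-informed rule could instead keep the cheaper-to-collect feature.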
Research papers often focus on data selection, showing that they are aware of the importance of feature selection based on feature importance, but they do not reveal how their systems score that importance [71]. Additionally, feature importance analysis often lumps all attack types into one, which is not necessarily the most appropriate approach, as attacks can have widely different signatures. This has been shown in the literature when trying to generalize dispersed data.
Some recent papers for IDS with ML on the UNSW-NB 15 dataset [56,58] focus on feature extraction due to its demonstrated effectiveness [59]. However, they do not reveal which attributes of the data their feature extraction algorithm picked. This lack of explainability makes it hard to make use of the results more generally in feature engineering and causes uncertainty in black box results.
When exploring the feature importance of the UNSW-NB 15 dataset, the research either does not discuss feature importance at all [71], as in [58,73], or treats all attacks as one when calculating it, so the unique feature importance of different attack types may be lost. This is exacerbated by the known imbalance of data per attack type in this dataset. Additionally, ref. [67] specifically visualizes the distinctions between the attack types, showing that treating them as one has limitations. In the IDS with ML literature, it has been shown that creating models for each attack type produces better results [59]. As the recent papers do not discuss the statistical use of the dataset, we assume no major changes have been applied. Most prominently, when selecting the important features, there is a lack of evaluation of multicollinearity [67,73].
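A standard way to quantify the multicollinearity concern raised here is the Variance Inflation Factor (VIF): each feature is regressed on the remaining features, and a high R² (conventionally VIF > 10) flags redundancy. A minimal sketch follows, with illustrative data in which the second column is nearly twice the first:

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def vif(cols, j):
    """Variance Inflation Factor: regress column j on the other columns
    (with intercept) and return 1 / (1 - R^2). VIF > 10 is a common red flag."""
    y = cols[j]
    n = len(y)
    X = [[1.0] + [cols[k][i] for k in range(len(cols)) if k != j] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(bc * xv for bc, xv in zip(beta, row)) for yi, row in zip(y, X)]
    ybar = sum(y) / n
    r2 = 1.0 - sum(r * r for r in resid) / sum((yi - ybar) ** 2 for yi in y)
    return 1.0 / (1.0 - r2) if r2 < 1.0 else float("inf")

features = [
    [1.0, 2.0, 3.0, 4.0, 5.0],    # e.g., packet count
    [2.1, 3.9, 6.2, 7.8, 10.1],   # nearly 2x the first column: redundant
    [5.0, 3.0, 4.0, 1.0, 2.0],    # a further illustrative column
]
print([round(vif(features, j), 1) for j in range(3)])
```

The two near-duplicate columns receive very large VIF values, exactly the kind of silent redundancy the reviewed studies rarely check before reporting feature importance.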
On the other hand, due to the rapidly evolving nature of security threats, IDS datasets quickly become outdated. To combat this, the generation of new datasets has become an area of research [69,79]. Generative Adversarial Networks (GANs) have seen increasing use in IDS. Their ability to generate new synthetic data can help to deal with the problem of imbalanced classes in IDS datasets [80], and this extra generated data can improve robustness. Further, the simulation of real-time traffic has been shown to be useful in debugging and testing IDS before deployment. However, GANs are currently limited in IDS by their inability to deal with multiple classes. In addition, they require strict preprocessing of data, as GANs were originally designed for image data [81]. These factors align with earlier observations on the importance of appropriate datasets for IDS as the main driving force for future improvements in the field. Most recently, ref. [69] presented an open-source tool for dataset generation, so that new datasets can be produced as the network changes over time. However, these new datasets are limited by their lack of adoption by the wider scientific community, making comparisons to other work difficult, as the validity of the datasets has not been rigorously tested.

6. Shortcomings in Existing IDSs

In light of the earlier review, the limitations of existing IDSs fall into two broad categories. Firstly, those relating to false identification of a particular class, which are fundamentally linked to the quality of the dataset and the feature selection. Secondly, those related to ethical AI aspects, namely bias, explainability, and robustness, as reviewed in the earlier section. This section elaborates further on these two categories of limitations.

6.1. False Identification

In every ML or AI approach, and particularly in classification tasks often used by IDS, the concepts of false positives and false negatives are crucial. These concepts are best understood in a binary classification example (Figure 11) where we attempt to identify “malicious” or “benign” activities in our monitored network. The real traffic implications of these errors in IDS were studied in [82], proposing mechanisms for their assessment. Data mining and machine learning strategies specifically aimed at reducing false positives in IDS have been explored in [83]. Further research discusses minimizing false positives through decision tree classifiers [84], while adaptive alert classification is used in [85] to improve the accuracy of distinguishing between true and false positives. Finally, the need for enhancing detection accuracy by reducing both false positives and negatives through multi-objective optimization techniques in IDS is presented in [86].
In IDS, a false positive (type I error) leads to unnecessary allocation of resources to investigate non-threatening activities, potentially leading to operational disruptions and alert fatigue, where users become desensitized to warnings [52,87]. Several methods have been explored for the minimization of false positives in IDS, including optimization techniques [88]. Furthermore, these include hybrid approaches combining different detection methods [89], attack-specific methods [90], and ML [52].
On the other hand, a false negative (type II error) in IDS means that a real threat or malicious activity is not detected. The implications are more severe, as false negatives lead to unmitigated attacks, data breaches, and other security incidents [52], with various strategies investigated over the years [91,92]. Recent approaches to reduce or mitigate false negatives include ML [93], attention mechanism-based algorithms [94], ensemble model approaches [95], and deep learning [96], among others. The distinction between false positives and false negatives is fundamentally tied to the trade-off between sensitivity (or recall) and specificity of a model.
Balancing these errors requires careful tuning of the model’s threshold values and continuous evaluation against emerging threats to maintain an acceptable level of security without overwhelming the system or its users with false alarms. The authors in [97] discuss the need for analysts to periodically review and adjust thresholds to optimize the balance between false positives and false negatives, emphasizing the importance of continuous evaluation in security operations. Additionally, the authors in [98] explore enhancing insider threat detection in imbalanced cyber security settings, focusing on fine-tuning threshold values to better manage the trade-offs between detecting true threats and minimizing false positives. Beyond these approaches, in this review, we further argue that the quality of datasets is crucial in every ML/AI approach, especially the aspects of high dimensionality or multicollinearity, which can silently misguide model patterns and introduce hidden biases.
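The sensitivity/specificity trade-off described above can be made concrete with a short sketch: sweeping the alert threshold over illustrative detector scores (the scores and labels below are invented for the example) shows how lowering the threshold eliminates false negatives at the cost of more false positives:

```python
def confusion_at(scores, labels, threshold):
    """Count TP/FP/TN/FN when an alert fires for score >= threshold
    (label 1 = malicious, label 0 = benign)."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp, fp, tn, fn

# Illustrative detector outputs and ground-truth labels.
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    0,    1,    0,    0,    0]

for thr in (0.25, 0.50, 0.75):
    tp, fp, tn, fn = confusion_at(scores, labels, thr)
    sensitivity = tp / (tp + fn)  # recall: share of attacks caught
    specificity = tn / (tn + fp)  # share of benign traffic passed
    print(f"thr={thr:.2f}  sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")
```

At the lowest threshold every attack is caught but half the benign traffic raises alarms; at the highest threshold no benign traffic is flagged but an attack slips through, which is precisely the balance the cited works tune for.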
Many of the reviewed studies have attempted to address the issue through feature selection, using both manual and automated methods. In [82], the authors propose statistical methods, and in [73], a more narrative approach is used, but the results have not gained traction in further research. In [60], a fusion of statistical importance methods is proposed, showing reduced computation requirements and retained accuracy. The authors in [57] demonstrate the use of genetic algorithm-based feature extraction, leading to reduced false positives, though it is combined with a different modeling approach, so the individual contribution is not clear. Ref. [58] proposed a hybrid method using Information Gain (IG) and Random Forest (RF) for initial feature selection, followed by Recursive Feature Elimination (RFE) with a Multilayer Perceptron (MLP) network for further refinement, showing a 50% reduction in feature space and a 2% improvement in accuracy. Automated cleaning methods include principal component analysis (PCA), ridge regression, lasso regression, ensemble learning [63], and RFE [66]. PCA and RFE techniques provide higher accuracy and lower false-positive rates in IDS applications [66]. Others are not yet evaluated, while it is unclear if outlier detection [64] is indeed utilized or suitable for CS applications. Furthermore, the Synthetic Minority Over-Sampling Technique (SMOTE) [96] is often used to automatically balance data, though it is unclear whether an absolute balance would be truly representative of the problem. More advanced feature extraction models, such as VGG-16 and DenseNet, are proposed in [68] at a high computation cost. Other studies simply use increasingly complex models (more than 56% of the studies in Table 3 use CNNs, RNNs, or DNNs) or hybrid approaches to counter the effects of unclean data [61].
It is clear that the research community has not reached a consensus on the most appropriate approach to tackle the issue of false positives or the features needed for IDS models.

6.2. Ethical AI and Compliance Aspects

Ethical AI principles can play a significant role in how machine learning (ML) models are developed and applied in a cyber security context by providing additional considerations around transparency, fairness, privacy, and accountability. These principles add valuable requirements that contribute to the ethical development of ML models and also result in more robust models, increased user trust, and better regulatory compliance. Integration of ethical AI principles can be well understood through the process and benefits described in a few pioneering case studies.
  • Transparency: In the cyber security world, transparency about the model’s decision-making process is essential, especially in capturing why the model raises an alert or makes specific predictions. Explainable AI (XAI) techniques like SHAP (Shapley Additive Explanations) and LIME (Local Interpretable Model-Agnostic Explanations) can be used to elucidate the inner workings of a model and provide insight into its threat detection abilities for human understanding and trust. An informational case example within the IBM Watson for CS (https://medium.com/trusted-ai/explainability-using-shap-in-ibm-watson-openscale-55548adedf38, accessed on 3 November 2024) project focused on making Watson’s detection approach transparent, using natural language processing to reveal real-time information for a batch of incidents. SHAP integration proved essential in explaining the features that contributed to the detection of threats, easing the interpretation of the model for cyber security teams.
  • Fairness and bias mitigation: In the scope of cyber security, fairness can be ensured if ML models prepared for threat detection do not discriminate against any group of malicious activities or any potentially relevant data source from which threats might arise. A case study of bias in a cyber security dataset could use SMOTE to address the data bias problem, where the team aimed for a more balanced cyber security dataset, allowing the system to recognize a large and diverse set of malicious threats, thus lowering the possibility of biased detection. Microsoft has published extensively in this domain (https://www.microsoft.com/de-ch/ai/responsible-ai, accessed on 3 November 2024).
  • Privacy and security: Privacy-preserving techniques such as differential privacy or federated learning ensure that individual data are protected. Google’s Federated Learning for Mobile Threat Detection epitomizes privacy-centric cyber security (https://research.google/pubs/federated-learning-for-mobile-keyboard-prediction-2/, accessed on 3 November 2024). This approach detects malware by training models directly on users’ devices rather than on centralized servers. Through this distributed approach, detection happens without sacrificing sensitive data, and a balance between security and privacy is maintained.
  • Accountability with auditability: This can involve the provision for audit trails, along with auditing of any changes in model performance. The EU’s AI4Cyber (https://ai4cyber.eu/, accessed on 3 November 2024) project team included an audit framework for cyber security models. This auditing framework allowed for a series of regular reviews and impact evaluations to gauge success in model performance, to determine conformity with the EU GDPR, and to create a responsible mechanism for consideration of complaints and problems relating to the model.
  • Robustness and security against adversarial attacks: Models used in cyber security must also be formally resistant to adversarial attacks that aim to manipulate underlying vulnerabilities in the training data or the model structure. Such robustness can be achieved through robust training paradigms, such as adversarial training. The Guaranteeing AI Robustness against Deception project by DARPA (https://www.darpa.mil/program/guaranteeing-ai-robustness-against-deception, accessed on 3 November 2024) is an example of a project that secures ML models against adversarial threats. In this project, models were trained on artificially generated cyber security data to prepare them for real-world malicious attacks.
  • User-centric design and human oversight: In ethical AI within cyber security, user-centric design must ensure human oversight is prioritized. A human-in-the-loop (HITL) approach allows for human intervention if automation fails to suffice, thus providing a layer of ethical decision-making. The Umbrella Security platform (https://umbrella.cisco.com/, accessed on 3 November 2024) used by Cisco effectively implements HITL strategies, flagging uncertain cases for human review.
  • Compliance: Integrating frameworks such as the GDPR, the EU AI Act, and NIST’s AI risk management framework will ensure the legal use and application of ethical AI. Periodic evaluations ensure the models remain compliant as regulations are amended. The NIST’s AI Compliance for Federal Cyber Security project requires the creation of federal cyber security models complying with the NIST (https://www.nist.gov/itl/ai-risk-management-framework, accessed on 3 November 2024) principles for AI ethics, fairness, and robustness, setting a template for cyber security teams throughout the U.S. to maintain policy compliance and a standard for ethical AI across public cyber security.
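To make the XAI techniques listed above concrete, the Shapley values that SHAP approximates can be computed exactly, by brute force, for a tiny model. The linear "alert score", its weights, and the inputs below are hypothetical and purely illustrative:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values by enumerating all feature coalitions:
    'absent' features are replaced by baseline values. Exponential in
    the number of features, so only viable for tiny models; the SHAP
    library approximates this computation efficiently for real ones."""
    n = len(x)
    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (value(set(S) | {i}) - value(set(S)))
        phi.append(total)
    return phi

# Hypothetical linear "alert score" over three flow features;
# the weights and inputs are illustrative, not taken from any cited model.
weights = [2.0, -1.0, 0.5]
score = lambda z: sum(w * v for w, v in zip(weights, z))
x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(score, x, baseline)
```

For a linear model each feature’s Shapley value reduces to its weight times its deviation from the baseline, and the values always sum to the difference between the explained prediction and the baseline prediction, which is the property that makes SHAP attributions auditable.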
A diversified approach must therefore be put in place to integrate ethical AI principles into cyber security. By employing techniques such as explainable AI, federated learning, adversarial training, and human-in-the-loop approaches, cyber security projects can build a solid ethical foundation, strengthen model robustness, and meet compliance requirements within the rapidly shifting cyber security landscape.
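The federated learning principle referred to above can be illustrated with a minimal federated averaging (FedAvg-style) sketch: each client trains on data that never leaves the device, and only model weights are aggregated centrally. The one-parameter model, learning rates, and client data are illustrative assumptions:

```python
def local_sgd(w, data, lr=0.1, epochs=5):
    """One client's local training: SGD on squared error for a
    one-feature linear model y = w * x (kept deliberately tiny)."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w

def federated_average(client_datasets, rounds=20):
    """FedAvg sketch: per round, clients train locally on private data;
    only the resulting weights are averaged by the coordinator."""
    w_global = 0.0
    for _ in range(rounds):
        local_weights = [local_sgd(w_global, d) for d in client_datasets]
        w_global = sum(local_weights) / len(local_weights)
    return w_global

# Each client holds private samples of the same underlying rule y = 3x;
# raw samples are never shared, yet the global model recovers the rule.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5)],
    [(2.5, 7.5)],
]
w = federated_average(clients)
```

Real systems weight the average by client dataset size and add secure aggregation, but the privacy property is visible even here: the coordinator only ever sees weights, never traffic samples.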

7. Open Questions

From the aforementioned analysis, we conclude that future work must focus on the improvement of input data for IDS approaches. Additionally, parameters such as explainability, bias, robustness, and efficiency must play a larger role in the advancement of the field. Based on these observations, we identify several open questions:
  • RQ1: Which features used in ML/AI training can be considered sensitive data, and how can these be protected without losing utility?
  • RQ2: Which datasets and ML approaches used for intrusion detection have been affected by unnecessary high dimensionality or multicollinearity?
  • RQ3: What methods can be used to detect and mitigate bias in IDS models?
  • RQ4: Which of the modern approaches for transparency and explainability are useful for cyber security-relevant datasets and IDS models, and how should they be adapted?
  • RQ5: How can we protect ML models for IDS from adversarial attacks, including evasion and poisoning attacks?
  • RQ6: What are the legal and accountability implications of ML/AI-based decisions in IDS?
  • RQ7: Which modern approaches in incremental learning can be adapted to enable IDS models to learn continuously and adapt to evolving threats without introducing new ethical or security issues?
  • RQ8: What optimizations can be suitably applied in IDS techniques to make them computationally efficient and propose acceptable trade-offs between accuracy and resource consumption?
  • RQ9: What frameworks and certification processes can be developed to standardize ethical practices in ML/AI for IDS?
Addressing these open questions requires multidisciplinary research and collaboration, combining expertise from ML, AI, cyber security, ethics, law, and user experience design. As a result, broader and more complete teams will be necessary in the future to truly address the challenges posed and to provide industry-adoptable solutions with practical impact beyond academia. Potential pathways for advancing the field should consider developing and integrating new datasets, as well as providing viable approaches to combine existing datasets. Significant effort should be made to adopt ethical AI practices, such as enriching white-box models with transparency and explainability tools. Finally, research should focus on establishing a common understanding of the features that provide a realistic representation of the question being investigated, e.g., network traffic. This should consider the availability of features, their format and representation, the mathematical underpinning, and the categorization/classification of attacks, thus addressing the identified limitations in current approaches.

Author Contributions

Conceptualization, I.J.V.; methodology, I.J.V.; validation, A.L.M.; investigation, I.J.V. and A.L.M.; resources, I.J.V. and A.L.M.; writing—original draft preparation, I.J.V.; writing—review and editing, A.L.M.; visualization, I.J.V.; project administration, A.L.M.; funding acquisition, I.J.V. and A.L.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partly funded through the UK Transformative Technologies competition supported by UKRI, specifically Innovate UK (app. no: 10074348).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All datasets are generally publicly available and widely used in the domain. No new data were created.

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results. Michala has declared conflicts of interest in her role as a Lecturer at the University of Glasgow and has the approval to conduct research related to this paper without acknowledging the University. Netrity Ltd. is a commercial entity in which both authors have commercial interests.

References

  1. bin Zainuddin, A.A.; Sairin, H.; Mazlan, I.A.; Muslim, N.N.A.; Sabarudin, W.A.S.W. Enhancing IoT Security: A Synergy of Machine Learning, Artificial Intelligence, and Blockchain. Data Sci. Insights 2024, 2, 11. [Google Scholar]
  2. Mammeri, Z.Z. Introduction to Computer Security; Wiley Data and Cybersecurity: Hoboken, NJ, USA, 2024. [Google Scholar]
  3. Manikandan, V.; Raj, V.; Janakiraman, S.; Sivaraman, R.; Amirtharajan, R. Let wavelet authenticate and tent-map encrypt: A sacred connect against a secret nexus. Soft Comput. 2024, 28, 6839–6853. [Google Scholar] [CrossRef]
  4. Hayagreevan, H.; Khamaru, S. Security of and by Generative AI platforms. arXiv 2024, arXiv:2410.13899. [Google Scholar]
  5. Mijwil, M.; Salem, I.E.; Ismaeel, M.M. The Significance of Machine Learning and Deep Learning Techniques in Cybersecurity: A Comprehensive Review. Iraqi J. Comput. Sci. Math. 2023, 4, 87–101. [Google Scholar]
  6. Alrawais, A.; Alhothaily, A.; Hu, C.; Cheng, X. Fog computing for the internet of things: Security and privacy issues. IEEE Internet Comput. 2017, 21, 34–42. [Google Scholar] [CrossRef]
  7. Azam, N.; Michala, A.L.; Ansari, S.; Truong, N.B. Modelling Technique for GDPR-Compliance: Toward a Comprehensive Solution. In Proceedings of the GLOBECOM 2023—2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia, 4–8 December 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 3300–3305. [Google Scholar]
  8. Kulesza, J.; Balleste, R. Cybersecurity and Human Rights in the Age of Cyberveillance; Rowman & Littlefield: Lanham, MD, USA, 2015. [Google Scholar]
  9. Chen, L.; Chen, L.; Jordan, S.; Liu, Y.K.; Moody, D.; Peralta, R.; Perlner, R.A.; Smith-Tone, D. Report on Post-Quantum Cryptography; US Department of Commerce, National Institute of Standards and Technology: Gaithersburg, MD, USA, 2016; Volume 12.
  10. Kshetri, N. Can blockchain strengthen the internet of things? IT Prof. 2017, 19, 68–72. [Google Scholar] [CrossRef]
  11. Hadlington, L. Human factors in cybersecurity; Examining the link between Internet addiction, impulsivity, attitudes towards cybersecurity, and risky cybersecurity behaviours. Heliyon 2017, 3, e00346. [Google Scholar] [CrossRef]
  12. Cichonski, P.; Millar, T.; Grance, T.; Scarfone, K. Computer security incident handling guide. NIST Spec. Publ. 2012, 800, 1–147. [Google Scholar]
  13. Sharma, S.; Mishra, N. Original Research Article Anomaly detection in Smart Traffic Light system using blockchain: Securing through proof of stake and machine learning. J. Auton. Intell. 2024, 7, 1087. [Google Scholar] [CrossRef]
  14. Wisdom, D.D.; Vincent, O.R.; Igulu, K.; Hyacinth, E.A.; Christian, A.U.; Oduntan, O.E.; Hauni, A.G. Industrial IoT Security Infrastructures and Threats. In Communication Technologies and Security Challenges in IoT: Present and Future; Springer: Singapore, 2024; pp. 369–402. [Google Scholar]
  15. Tarab, H.I. Cyber-attack detection and identification using deep learning. Int. J. Comput. Artif. Intell. 2024, 5, 42–49. [Google Scholar] [CrossRef]
  16. Swathi, G.C.; Kumar, G.K.; Kumar, A.S. Ensemble classification to predict botnet and its impact on IoT networks. Meas. Sensors 2024, 33, 101130. [Google Scholar] [CrossRef]
  17. Buedi, E.D.; Ghorbani, A.A.; Dadkhah, S.; Ferreira, R.L. Enhancing EV Charging Station Security Using A Multi-dimensional Dataset: CICEVSE2024. Res. Sq. 2024. [Google Scholar] [CrossRef]
  18. Lightbody, D.; Ngo, D.M.; Temko, A.; Murphy, C.C.; Popovici, E. Dragon_Pi: IoT Side-Channel Power Data Intrusion Detection Dataset and Unsupervised Convolutional Autoencoder for Intrusion Detection. Future Internet 2024, 16, 88. [Google Scholar] [CrossRef]
  19. Murthy, A.; Asghar, M.R.; Tu, W. A lightweight Intrusion Detection for Internet of Things-based smart buildings. Secur. Priv. 2024, 7, e386. [Google Scholar] [CrossRef]
  20. Nijim, M.; Kanumuri, V.; Al Aqqad, W.; Albataineh, H. Machine Learning Based Analysis of Cyber-Attacks Targeting Smart Grid Infrastructure. In Proceedings of the International Conference on Advances in Computing Research, Madrid, Spain, 3–5 June 2024; Springer: Cham, Switzerland, 2024; pp. 334–349. [Google Scholar]
  21. Pulimamidi, R. To enhance customer (or patient) experience based on IoT analytical study through technology (IT) transformation for E-healthcare. Meas. Sensors 2024, 33, 101087. [Google Scholar] [CrossRef]
  22. Bolat-Akça, B.; Bozkaya, E. Digital twin-assisted intelligent anomaly detection system for Internet of Things. Ad Hoc Netw. 2024, 158, 103484. [Google Scholar] [CrossRef]
  23. Sikorski, M.; Honig, A. Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software; No Starch Press: San Francisco, CA, USA, 2012. [Google Scholar]
  24. Ucci, D.; Aniello, L.; Baldoni, R. Survey of machine learning techniques for malware analysis. Comput. Secur. 2019, 81, 123–147. [Google Scholar] [CrossRef]
  25. Javaid, A.; Niyaz, Q.; Sun, W.; Alam, M. A deep learning approach for network intrusion detection system. In Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS), New York, NY, USA, 3–5 December 2015; pp. 21–26. [Google Scholar]
  26. Savage, K.; Coogan, P.; Lau, H. The Evolution of Ransomware, Symantec Security Response; Symantec Corporation: Mountain View, CA, USA, 2015. [Google Scholar]
  27. Kharraz, A.; Robertson, W.; Balzarotti, D.; Bilge, L.; Kirda, E. Cutting the gordian knot: A look under the hood of ransomware attacks. In Proceedings of the Detection of Intrusions and Malware, and Vulnerability Assessment: 12th International Conference, DIMVA 2015, Milan, Italy, 9–10 July 2015; Springer: Cham, Switzerland, 2015; pp. 3–24. [Google Scholar]
  28. Richardson, R.; North, M.M. Ransomware: Evolution, mitigation and prevention. Int. Manag. Rev. 2017, 13, 10. [Google Scholar]
  29. Liska, A.; Gallo, T. Ransomware: Defending Against Digital Extortion; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2016. [Google Scholar]
  30. Hadnagy, C. Social Engineering: The Art of Human Hacking; John Wiley & Sons: Hoboken, NJ, USA, 2010. [Google Scholar]
  31. Collier, H.; Morton, C. Teenagers: A Social Media Threat Vector. In Proceedings of the International Conference on Cyber Warfare and Security, Johannesburg, South Africa, 26–27 March 2024; Volume 19, pp. 55–61. [Google Scholar]
  32. Hix, J.; Teng, J.; Juker, M.; Ryan, G. AI-Based Phishing Countermeasures; Embry-Riddle Aeronautical University, Prescott Campus: Prescott, AZ, USA, 2024. [Google Scholar]
  33. Adekunle, T.S.; Alabi, O.O.; Lawrence, M.O.; Ebong, G.N.; Ajiboye, G.O.; Bamisaye, T.A. The Use of AI to Analyze Social Media Attacks for Predictive Analytics. J. Comput. Theor. Appl. 2024, 2, 169–178. [Google Scholar]
  34. Ussatova, O.; Zhumabekova, A.; Karyukin, V.; Matson, E.T.; Ussatov, N. The development of a model for the threat detection system with the use of machine learning and neural network methods. Int. J. Innov. Res. Sci. Stud. 2024, 7, 863–877. [Google Scholar] [CrossRef]
  35. Abu-Amara, F.; Hosani, R.A.; Tamimi, H.A.; Hamdi, B.A. Spreading cybersecurity awareness via gamification: Zero-day game. Int. J. Inf. Technol. 2024, 16, 2945–2953. [Google Scholar] [CrossRef]
  36. Heartfield, R.; Loukas, G. A taxonomy of attacks and a survey of defence mechanisms for semantic social engineering attacks. ACM Comput. Surv. (CSUR) 2015, 48, 1–39. [Google Scholar] [CrossRef]
  37. Mirkovic, J.; Reiher, P. A taxonomy of DDoS attack and DDoS defense mechanisms. ACM SIGCOMM Comput. Commun. Rev. 2004, 34, 39–53. [Google Scholar] [CrossRef]
  38. Kambourakis, G.; Kolias, C.; Stavrou, A. The Mirai botnet and the IoT zombie armies. In Proceedings of the MILCOM 2017—2017 IEEE Military Communications Conference (MILCOM), Baltimore, MD, USA, 23–25 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 267–272. [Google Scholar]
  39. Zekri, M.; El Kafhali, S.; Aboutabit, N.; Saadi, Y. DDoS attack detection using machine learning techniques in cloud computing environments. In Proceedings of the 2017 3rd International Conference of Cloud Computing Technologies and Applications (CloudTech), Rabat, Morocco, 24–26 October 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–7. [Google Scholar]
  40. Zargar, S.T.; Joshi, J.; Tipper, D. A survey of defense mechanisms against distributed denial of service (DDoS) flooding attacks. IEEE Commun. Surv. Tutor. 2013, 15, 2046–2069. [Google Scholar] [CrossRef]
  41. Jemal, I.; Cheikhrouhou, O.; Hamam, H.; Mahfoudhi, A. Sql injection attack detection and prevention techniques using machine learning. Int. J. Appl. Eng. Res. 2020, 15, 569–580. [Google Scholar]
  42. Falor, A.; Hirani, M.; Vedant, H.; Mehta, P.; Krishnan, D. A deep learning approach for detection of SQL injection attacks using convolutional neural networks. In Proceedings of the Data Analytics and Management: ICDAM 2021, Polkowice, Poland, 26 June 2021; Springer: Singapore, 2022; Volume 2, pp. 293–304. [Google Scholar]
  43. Sabottke, C.; Suciu, O.; Dumitraș, T. Vulnerability disclosure in the age of social media: Exploiting twitter for predicting {Real-World} exploits. In Proceedings of the 24th USENIX Security Symposium (USENIX Security 15), Washington, DC, USA, 12–14 August 2015; pp. 1041–1056. [Google Scholar]
  44. Radhakrishnan, K.; Menon, R.R.; Nath, H.V. A survey of zero-day malware attacks and its detection methodology. In Proceedings of the TENCON 2019—2019 IEEE Region 10 Conference (TENCON), Kochi, India, 17–20 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 533–539. [Google Scholar]
  45. Farnham, G.; Atlasis, A. Detecting DNS tunneling. SANS Inst. Infosec Read. Room 2013, 9, 1–32. [Google Scholar]
  46. Zhang, R.; Zhang, Y.; Ren, K. Distributed privacy-preserving access control in sensor networks. IEEE Trans. Parallel Distrib. Syst. 2011, 23, 1427–1438. [Google Scholar] [CrossRef]
  47. Abualghanam, O.; Alazzam, H.; Elshqeirat, B.; Qatawneh, M.; Almaiah, M.A. Real-time detection system for data exfiltration over DNS tunneling using machine learning. Electronics 2023, 12, 1467. [Google Scholar] [CrossRef]
  48. Matti, E. Evaluation of Open Source Web Vulnerability Scanners and Their Techniques Used to Find SQL Injection and Cross-Site Scripting Vulnerabilities. Dissertation. 2021. Available online: https://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177606 (accessed on 3 November 2024).
  49. Venkatesha, S.; Reddy, K.R.; Chandavarkar, B. Social engineering attacks during the COVID-19 pandemic. SN Comput. Sci. 2021, 2, 78. [Google Scholar] [CrossRef]
  50. Granger, S. Social Engineering Fundamentals, Part I: Hacker Tactics. 2003. Available online: https://api.semanticscholar.org/CorpusID:110906298 (accessed on 3 November 2024).
  51. Wilson, M.; Hash, J. Building an information technology security awareness and training program. NIST Spec. Publ. 2003, 800, 1–39. [Google Scholar]
  52. Kus, D.; Wagner, E.; Pennekamp, J.; Wolsing, K.; Fink, I.B.; Dahlmanns, M.; Wehrle, K.; Henze, M. A False Sense of Security? Revisiting the State of Machine Learning-Based Industrial Intrusion Detection. In Proceedings of the 8th ACM on Cyber-Physical System Security Workshop, Nagasaki, Japan, 30 May 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 73–84. [Google Scholar] [CrossRef]
  53. Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia, 10–12 November 2015; pp. 1–6. [Google Scholar] [CrossRef]
  54. Sharafaldin, I.; Lashkari, A.H.; Hakak, S.; Ghorbani, A.A. Developing Realistic Distributed Denial of Service (DDoS) Attack Dataset and Taxonomy. In Proceedings of the 2019 International Carnahan Conference on Security Technology (ICCST), Chennai, India, 1–3 October 2019; pp. 1–8. [Google Scholar] [CrossRef]
  55. Houda, Z.A.E.; Brik, B.; Khoukhi, L. “Why Should I Trust Your IDS?”: An Explainable Deep Learning Framework for Intrusion Detection Systems in Internet of Things Networks. IEEE Open J. Commun. Soc. 2022, 3, 1164–1176. [Google Scholar] [CrossRef]
  56. Thakkar, A.; Lohiya, R. Fusion of statistical importance for feature selection in Deep Neural Network-based Intrusion Detection System. Inf. Fusion 2023, 90, 353–363. [Google Scholar] [CrossRef]
  57. Satyanarayana, G.; Chatrapathi, K.S. Improving Intrusion Detection Performance with Genetic Algorithm-Based Feature Extraction and Ensemble Machine Learning Methods. Int. J. Intell. Syst. Appl. Eng. 2023, 11, 100–112. [Google Scholar]
  58. Yin, Y.; Jang-Jaccard, J.; Xu, W.; Singh, A.; Zhu, J.; Sabrina, F.; Kwak, J. IGRF-RFE: A hybrid feature selection method for MLP-based network intrusion detection on UNSW-NB15 dataset. J. Big Data 2023, 10, 15. [Google Scholar] [CrossRef]
  59. Pinto, A.; Herrera, L.C.; Donoso, Y.; Gutierrez, J.A. Survey on Intrusion Detection Systems Based on Machine Learning Techniques for the Protection of Critical Infrastructure. Sensors 2023, 23, 2415. [Google Scholar] [CrossRef]
  60. Thakkar, A.; Lohiya, R. A Review on Challenges and Future Research Directions for Machine Learning-Based Intrusion Detection System. Arch. Comput. Methods Eng. 2023, 30, 4245–4269. [Google Scholar] [CrossRef]
  61. Thakkar, A.; Lohiya, R. A survey on intrusion detection system: Feature selection, model, performance measures, application perspective, challenges, and future research directions. Artif. Intell. Rev. 2022, 55, 453–563. [Google Scholar] [CrossRef]
  62. Sarker, I. Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective. SN Comput. Sci. 2021, 2, 154. [Google Scholar] [CrossRef]
  63. Chan, J.Y.L.; Leow, S.M.H.; Bea, K.T.; Cheng, W.K.; Phoong, S.W.; Hong, Z.W.; Chen, Y.L. Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review. Mathematics 2022, 10, 1283. [Google Scholar] [CrossRef]
  64. Boukerche, A.; Zheng, L.; Alfandi, O. Outlier Detection: Methods, Models, and Classification. ACM Comput. Surv. 2020, 53, 55. [Google Scholar] [CrossRef]
65. Kumar, P.; Bhatnagar, R.; Gaur, K.; Bhatnagar, A. Classification of imbalanced data: Review of methods and applications. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1099, 012077. [Google Scholar] [CrossRef]
  66. Nabi, F.; Zhou, X. Enhancing Intrusion Detection Systems Through Dimensionality Reduction: A Comparative Study of Machine Learning Techniques for Cyber Security. Cyber Secur. Appl. 2024, 2, 100033. [Google Scholar] [CrossRef]
  67. Zoghi, Z.; Serpen, G. UNSW-NB15 Computer Security Dataset: Analysis through Visualization. arXiv 2021, arXiv:2101.05067. [Google Scholar] [CrossRef]
  68. Musleh, D.; Alotaibi, M.; Alhaidari, F.; Rahman, A.; Mohammad, R.M. Intrusion Detection System Using Feature Extraction with Machine Learning Algorithms in IoT. J. Sens. Actuator Netw. 2023, 12, 29. [Google Scholar] [CrossRef]
69. Dehlaghi-Ghadim, A.; Moghadam, M.H.; Balador, A.; Hansson, H. Anomaly Detection Dataset for Industrial Control Systems. arXiv 2023, arXiv:2305.09678. [Google Scholar] [CrossRef]
  70. Kumar, A.; Sharma, I. CNN-based Approach for IoT Intrusion Attack Detection. In Proceedings of the 2023 International Conference on Sustainable Computing and Data Communication Systems (ICSCDS), Erode, India, 23–25 March 2023; pp. 492–496. [Google Scholar] [CrossRef]
  71. Subbiah, S.; Anbananthen, K.S.M.; Thangaraj, S.; Kannan, S.; Chelliah, D. Intrusion detection technique in wireless sensor network using grid search random forest with Boruta feature selection algorithm. J. Commun. Netw. 2022, 24, 264–273. [Google Scholar] [CrossRef]
  72. Imanbayev, A.; Tynymbayev, S.; Odarchenko, R.; Gnatyuk, S.; Berdibayev, R.; Baikenov, A.; Kaniyeva, N. Research of Machine Learning Algorithms for the Development of Intrusion Detection Systems in 5G Mobile Networks and Beyond. Sensors 2022, 22, 9957. [Google Scholar] [CrossRef]
  73. Moustafa, N.; Slay, J. The Significant Features of the UNSW-NB15 and the KDD99 Data Sets for Network Intrusion Detection Systems. In Proceedings of the 2015 4th International Workshop on Building Analysis Datasets and Gathering Experience Returns for Security (BADGERS), Kyoto, Japan, 5 November 2015; pp. 25–31. [Google Scholar] [CrossRef]
74. Siganos, M.; Radoglou-Grammatikis, P.; Kotsiuba, I.; Markakis, E.; Moscholios, I.; Goudos, S.; Sarigiannidis, P. Explainable AI-Based Intrusion Detection in the Internet of Things. In Proceedings of the 18th International Conference on Availability, Reliability and Security, Benevento, Italy, 29 August–1 September 2023; Association for Computing Machinery: New York, NY, USA, 2023. [Google Scholar] [CrossRef]
  75. Bacevicius, M.; Paulauskaite-Taraseviciene, A. Machine Learning Algorithms for Raw and Unbalanced Intrusion Detection Data in a Multi-Class Classification Problem. Appl. Sci. 2023, 13, 7328. [Google Scholar] [CrossRef]
  76. Hnamte, V.; Hussain, J. Dependable intrusion detection system using deep convolutional neural network: A Novel framework and performance evaluation approach. Telemat. Inform. Rep. 2023, 11, 100077. [Google Scholar] [CrossRef]
  77. Hnamte, V.; Hussain, J. DCNNBiLSTM: An Efficient Hybrid Deep Learning-Based Intrusion Detection System. Telemat. Inform. Rep. 2023, 10, 100053. [Google Scholar] [CrossRef]
  78. Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019. [Google Scholar] [CrossRef] [PubMed]
  79. Strandberg, P.E.; Söderman, D.; Dehlaghi-Ghadim, A.; Leon, M.; Markovic, T.; Punnekkat, S.; Moghadam, M.H.; Buffoni, D. The Westermo network traffic data set. Data Brief 2023, 50, 109512. [Google Scholar] [CrossRef] [PubMed]
  80. Yang, J.; Li, T.; Liang, G.; He, W.; Zhao, Y. A simple recurrent unit model based intrusion detection system with DCGAN. IEEE Access 2019, 7, 83286–83296. [Google Scholar] [CrossRef]
  81. Dunmore, A.; Jang-Jaccard, J.; Sabrina, F.; Kwak, J. A Comprehensive Survey of Generative Adversarial Networks (GANs) in Cybersecurity Intrusion Detection. IEEE Access 2023, 11, 76071–76094. [Google Scholar] [CrossRef]
  82. Ho, C.Y.; Lai, Y.C.; Chen, I.W.; Wang, F.Y.; Tai, W.H. Statistical analysis of false positives and false negatives from real traffic with intrusion detection/prevention systems. IEEE Commun. Mag. 2012, 50, 146–154. [Google Scholar] [CrossRef]
  83. Pietraszek, T.; Tanner, A. Data mining and machine learning—Towards reducing false positives in intrusion detection. Inf. Secur. Tech. Rep. 2005, 10, 169–183. [Google Scholar] [CrossRef]
  84. Ohta, S.; Kurebayashi, R.; Kobayashi, K. Minimizing false positives of a decision tree classifier for intrusion detection on the internet. J. Netw. Syst. Manag. 2008, 16, 399–419. [Google Scholar] [CrossRef]
  85. Pietraszek, T. Using adaptive alert classification to reduce false positives in intrusion detection. In Proceedings of the Recent Advances in Intrusion Detection: 7th International Symposium, RAID 2004, Sophia Antipolis, France, 15–17 September 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 102–124. [Google Scholar]
  86. Hachmi, F.; Boujenfa, K.; Limam, M. Enhancing the accuracy of intrusion detection systems by reducing the rates of false positives and false negatives through multi-objective optimization. J. Netw. Syst. Manag. 2019, 27, 93–120. [Google Scholar] [CrossRef]
  87. Jose, J.; Jose, D.V. AS-CL IDS: Anomaly and signature-based CNN-LSTM intrusion detection system for internet of things. Int. J. Adv. Technol. Eng. Explor. 2023, 10, 1622–1639. [Google Scholar]
  88. Al Jallad, K.; Aljnidi, M.; Desouki, M.S. Anomaly detection optimization using big data and deep learning to reduce false-positive. J. Big Data 2020, 7, 68. [Google Scholar] [CrossRef]
  89. Latah, M.; Toker, L. Minimizing false positive rate for DoS attack detection: A hybrid SDN-based approach. ICT Express 2020, 6, 125–127. [Google Scholar] [CrossRef]
  90. Pitre, P.; Gandhi, A.; Konde, V.; Adhao, R.; Pachghare, V. An intrusion detection system for zero-day attacks to reduce false positive rates. In Proceedings of the 2022 International Conference for Advancement in Technology (ICONAT), Goa, India, 21–22 January 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–6. [Google Scholar]
  91. Vij, C.; Saini, H. Intrusion detection systems: Conceptual study and review. In Proceedings of the 2021 6th International Conference on Signal Processing, Computing and Control (ISPCC), Solan, India, 7–9 October 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 694–700. [Google Scholar]
  92. Azeez, N.A.; Bada, T.M.; Misra, S.; Adewumi, A.; Van der Vyver, C.; Ahuja, R. Intrusion detection and prevention systems: An updated review. In Data Management, Analytics and Innovation: Proceedings of ICDMAI 2019, Volume 1; Springer: Singapore, 2020; pp. 685–696. [Google Scholar]
  93. Shin, Y.; Kim, K. Comparison of anomaly detection accuracy of host-based intrusion detection systems based on different machine learning algorithms. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 33. [Google Scholar] [CrossRef]
  94. Laghrissi, F.; Douzi, S.; Douzi, K.; Hssina, B. IDS-attention: An efficient algorithm for intrusion detection systems using attention mechanism. J. Big Data 2021, 8, 149. [Google Scholar] [CrossRef]
  95. Jiang, Y.; Atif, Y. A selective ensemble model for cognitive cybersecurity analysis. J. Netw. Comput. Appl. 2021, 193, 103210. [Google Scholar] [CrossRef]
  96. Alkhudaydi, O.A.; Krichen, M.; Alghamdi, A.D. A deep learning methodology for predicting cybersecurity attacks on the internet of things. Information 2023, 14, 550. [Google Scholar] [CrossRef]
97. Alahmadi, B.A.; Axon, L.; Martinovic, I. 99% false positives: A qualitative study of SOC analysts’ perspectives on security alarms. In Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA, 10–12 August 2022; pp. 2783–2800. [Google Scholar]
  98. Al-Shehari, T.; Rosaci, D.; Al-Razgan, M.; Alfakih, T.; Kadrie, M.; Afzal, H.; Nawaz, R. Enhancing Insider Threat Detection in Imbalanced Cybersecurity Settings Using the Density-Based Local Outlier Factor Algorithm. IEEE Access 2024, 12, 34820–34834. [Google Scholar] [CrossRef]
Figure 1. Malware infection process and consequences; side effects denoted with dashed arrows.
Figure 2. Ransomware infection process and consequences; side effects denoted with dashed arrows.
Figure 3. Phishing infection process and consequences.
Figure 4. DDoS infection process and consequences.
Figure 5. SQL injection infection process and consequences.
Figure 6. Zero-day exploit infection process and consequences.
Figure 7. DNS tunneling infection process and consequences.
Figure 8. XSS attack infection process and consequences.
Figure 9. Social engineering-based infection process and consequences.
Figure 10. Bubble chart of datasets by year created, number of attack labels, and dataset name as annotation. The bubble size demonstrates the number of sub-groups or scenarios.
Figure 11. User perception of FN/FP in IDS and recommended resolutions.
Table 1. Types of cyber security.

| Type | Role & Methods |
|---|---|
| Infrastructure security | Protects infrastructure, such as power networks and data centers, and confirms the absence of any gaps. Physical security–virtual security–redundancy/resilience |
| Network security | Protects networks from intrusions by utilizing certain tools, such as intrusion detection and prevention systems (IDPS), remote access management (AC), two-factor authentication (2FA), and firewalls. Firewalls–IDPS–2FA–AC |
| Application security | Executes convoluted code to preserve and encrypt data and code in a way that is difficult to crack. Security by design (operating system, embedded, application) |
| Information security | Protects data from unauthorized access and modifications. Database and communication encryption–AC |
| User education | Safeguards all of the above systems by reducing human error factors, especially those related to providing access |
Table 2. Assessment of attack types where ML/AI is applicable.

| Threat Type | Level of Threat | Sophistication | Potential Impact | Mitigation Complexity |
|---|---|---|---|---|
| Malware | Moderate to high | Moderate | High | Moderate to high |
| Ransomware | Moderate to high | Moderate | High | Moderate to high |
| Phishing | High | Moderate | High | Moderate |
| Distributed Denial of Service (DDoS) | High | High | Very high | Very high |
| SQL Injection | High | Moderate | Very high | Moderate to high |
| Zero-day exploits | Very high | Very high | Very high | Very high |
| Domain Name System (DNS) tunnel. | Moderate to high | High | Moderate to high | High |
| Cross-Site Scripting (XSS) | Moderate to high | Moderate | Moderate to high | Moderate |
| Social engineering | High | Variable | High | Moderate to high |
Table 3. Summary of ML/AI modeling methods and prevalence in the cited literature.

| Modeling Approach | Prevalence (%) |
|---|---|
| SVM | 20.83 |
| RNN | 8.33 |
| Regression methods | 4.17 |
| Isolation, random forest, XGBoost | 19.44 |
| Autoencoders | 2.78 |
| Unspecified classification | 9.72 |
| Digital twins | 1.39 |
| Multiobjective optimization | 1.39 |
| Hybrid models | 18.06 |
| Decision trees | 13.89 |
| KNN | 4.17 |
| Generative AI | 4.17 |
| Ensemble methods | 15.28 |
| Bayesian networks | 5.56 |
| NN | 9.72 |
| CNN | 18.06 |
| DNN | 25.00 |
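Prevalence figures of this kind can be computed mechanically once each surveyed paper is tagged with the modeling approaches it uses. A minimal sketch (the paper identifiers and tags below are illustrative placeholders, not the actual survey corpus; since a paper may use several approaches, percentages need not sum to 100):

```python
from collections import Counter

# Hypothetical mapping of surveyed papers to the modeling approaches they use.
paper_tags = {
    "paper_01": ["SVM", "DNN"],
    "paper_02": ["CNN"],
    "paper_03": ["SVM", "Ensemble methods"],
    "paper_04": ["DNN", "CNN"],
}

def prevalence(tags: dict) -> dict:
    """Percentage of papers that use each modeling approach."""
    n_papers = len(tags)
    counts = Counter(approach for used in tags.values() for approach in used)
    return {a: round(100 * c / n_papers, 2) for a, c in counts.items()}

print(prevalence(paper_tags))  # SVM appears in 2 of 4 papers, i.e. 50.0
```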
Table 4. Summary of assessment of most prominent datasets covering a wide range of attacks. Attack Diversity, Realism and Balance are rated with a star system where each full star represents a step from Very Low to Very High.

| Dataset | Attack Diversity | Realism | Balance | Quality | Size and Complexity |
|---|---|---|---|---|---|
| UNSW-NB15 | — | — | — | Good, some imbalanced classes | Large and complex |
| NSL-KDD | — | — | — | Improved balance and reduced redundancy | Manageable |
| CICIDS2017/2018 | — | — | — | Generally good, some imbalance | Very large |

Star ratings (—) are rendered graphically in the published version of the table.
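The class imbalance noted in the Balance and Quality columns can be measured before any model training. A minimal sketch, assuming a labelled CSV export with an `attack_cat` column in which an empty value denotes normal traffic (this follows UNSW-NB15's published schema, but verify against the actual download):

```python
import csv
from collections import Counter

def label_distribution(csv_path: str, label_col: str = "attack_cat") -> dict:
    """Return the fraction of records per class label in a labelled IDS CSV."""
    counts = Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Empty attack category means benign/normal traffic.
            counts[row[label_col] or "Normal"] += 1
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Classes far below ~1% of records are candidates for resampling
# or class weighting before training.
```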
Table 5. Most prominent datasets with a wide range of attacks against assessment criteria.

| Dataset | Audit Logs & Raw Data | Modern Attacks | Real or Simulated | Labelled | AI & GDPR Compliant | Scientifically Accepted |
|---|---|---|---|---|---|---|
| UNSW-NB15 | Yes (raw packet data, audit logs unclear) | Partially (2015) | Simulated (generated in a controlled environment) | Yes | Presumed yes (anonymized) | Yes (widely used) |
| NSL-KDD | Yes (raw packet data, audit logs unclear) | No (1999) | Simulated (injecting attacks into normal flow) | Yes | Presumed yes (KDD’99, privacy concerns addressed) | Yes (it is still a reference dataset in the community) |
| CIC datasets | Yes | Yes (2017/19) | Simulated (attacks to emulate real-world situations) | Yes | Yes (anonymized) | Yes (recently gaining acceptance, but limited applicability) |
Table 6. Recent research publications evaluated against our criteria where a checkmark denotes meeting the criterion.

References evaluated: [52], [67], [68], [69], [55], [70], [71], [56], [72], [73], [74], [75], [58]. The per-criterion checkmarks (Explainability, Bias, Robustness, Efficiency) are rendered graphically in the published version of the table.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Vourganas, I.J.; Michala, A.L. Applications of Machine Learning in Cyber Security: A Review. J. Cybersecur. Priv. 2024, 4, 972-992. https://doi.org/10.3390/jcp4040045
