Unmasking Cybercrime with Artificial-Intelligence-Driven Cybersecurity Analytics
Figure 1. The average cost of cybercrime.
Figure 2. Procedure of digital forensics techniques.
Figure 3. Structure of CNN model.
Figure 4. Structure of LSTM model.
Figure 5. Architecture of the proposed system.
Figure 6. The structure of the layers used.
Figure 7. Internal structure of the proposed CNN model.
Figure 8. Accuracy underfitting result of the CNN-LSTM model on the IoT-23 dataset.
Figure 9. Loss underfitting result of the CNN-LSTM model on the IoT-23 dataset.
Figure 10. Accuracy training and validation.
Figure 11. Loss training and validation.
Abstract
1. Introduction
- A deep learning model based on LSTM-CNN is proposed for investigating botnet traffic detection, with a focus on indicators of compromise, to enhance digital forensics and cyber threat intelligence, thereby helping to provide effective responses to cybercrime.
- By leveraging deep learning techniques, the proposed model has the potential to adapt to changing attack patterns and to learn intricate features automatically, thereby demonstrating adaptability to evolving advanced botnet techniques that evade detection.
- This study aims to discover hidden patterns and correlations in botnet activities that may not be apparent using traditional approaches. This is crucial in enhancing cyber threat intelligence and in facilitating proactive forensic measures.
2. Related Work
3. Proposed Work
- Convolutional Layer: It consists of multiple learnable filters that slide across the input. Each filter performs a dot product operation between its weights and a small region of the input, producing a feature map. The feature map highlights important patterns or features present in the input.
- Activation Function: After the convolutional operation, an activation function is applied element-wise to the feature map. The activation function introduces non-linearity into the network, allowing it to learn complex relationships between the input and the extracted features.
- Pooling Layer: Following the activation function, a pooling layer is often applied. Pooling reduces the spatial dimensions of the feature maps while retaining important information. Pooling helps to reduce the number of parameters, to decrease computational complexity, and to provide translational invariance.
- Convolution and Pooling Layers: The convolutional and pooling layers are typically repeated multiple times in a CNN architecture to capture increasingly complex and abstract features. This allows the network to learn hierarchical representations of the input data, starting from simple low-level features and progressing to high-level features.
- Flattening: After the convolutional and pooling layers have been applied, the resulting feature maps are flattened into a one-dimensional vector. This flattening operation reshapes the multi-dimensional feature maps into a single continuous vector, which serves as the input to the subsequent fully connected layers.
- Fully Connected Layers: After flattening, fully connected layers are added to the network. These layers are similar to those found in traditional neural networks, where each neuron is connected to every neuron in the previous layer. Fully connected layers perform non-linear transformations on the input data and are responsible for making predictions based on the extracted features.
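The layer sequence described above can be traced on a toy input. The following is a minimal sketch in plain Python, assuming a one-dimensional input, ReLU as the activation function, and max pooling with a window of 2; the signal, the kernels, and the dense weights are illustrative values, not parameters of the proposed model. As in most deep learning frameworks, the "convolution" here is computed without flipping the kernel (cross-correlation).

```python
def conv1d(signal, kernel, bias=0.0):
    """Slide the kernel over the signal and take a dot product at each position."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(signal) - k + 1)]

def relu(values):
    """Element-wise non-linearity (assumed activation for this sketch)."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def flatten(feature_maps):
    """Concatenate all feature maps into one vector."""
    return [v for fmap in feature_maps for v in fmap]

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Illustrative input and two hand-picked filters.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 1.0]
kernels = [[1.0, -1.0], [0.5, 0.5]]

# Convolution -> activation -> pooling, once per filter.
feature_maps = [max_pool1d(relu(conv1d(signal, k))) for k in kernels]
features = flatten(feature_maps)

# A single output neuron with unit weights, for demonstration only.
logits = dense(features, weights=[[1.0] * len(features)], biases=[0.0])
print(feature_maps, logits)
```

Each filter yields one feature map, so the flattened vector grows with the number of filters while pooling halves the length of each map.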
- Input and Output: At each time step in the sequence, the LSTM receives an input vector. The input can be a single value or a vector of multiple values. The LSTM processes the input and produces an output vector at the same time step.
- Memory Cell: The memory cell is the core component of the LSTM. It maintains and updates its internal state based on the current input, the previous state, and the output of the previous time step. The memory cell has the ability to store and carry information over long durations, allowing the model to capture dependencies over time.
- Forget Gate: The forget gate determines which information from the previous state should be forgotten or discarded. It takes the previous output and current input as inputs, and using a sigmoid activation function, it produces a forget gate vector. This vector selectively removes or keeps information from the previous state.
- Input Gate: The input gate determines which new information should be stored in the memory cell. It takes the previous output and current input as inputs and produces an input gate vector. Additionally, it generates a candidate vector, which represents potential new information.
- Output Gate: The output gate decides which information from the memory cell should be output. It takes the previous output and current input as inputs and produces an output gate vector using a sigmoid activation function. The memory cell state is passed through a tanh activation function to squash the values into the range (−1, 1), and the output gate vector is then applied element-wise to filter them.
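The gate descriptions above correspond to the standard LSTM update for a single time step. The sketch below, in plain Python, assumes the common formulation in which each gate applies its own weight matrix and bias to the concatenation of the previous output and the current input; the parameter names (`W_f`, `b_f`, and so on) are hypothetical labels chosen for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, x, b):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new output h_t and cell state c_t."""
    z = h_prev + x_t  # concatenation of previous output and current input
    f = [sigmoid(v) for v in affine(params["W_f"], z, params["b_f"])]    # forget gate
    i = [sigmoid(v) for v in affine(params["W_i"], z, params["b_i"])]    # input gate
    g = [math.tanh(v) for v in affine(params["W_c"], z, params["b_c"])]  # candidate vector
    o = [sigmoid(v) for v in affine(params["W_o"], z, params["b_o"])]    # output gate
    # Keep part of the old state and add the gated candidate.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Squash the cell state and filter it through the output gate.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t

# Hidden size 1 and input size 1, with all-zero parameters (illustrative only):
# every gate then evaluates to 0.5 and the candidate to 0.
params = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})

h_t, c_t = lstm_step(x_t=[1.0], h_prev=[0.0], c_prev=[2.0], params=params)
print(h_t, c_t)
```

With zero parameters the forget gate halves the previous cell state (2.0 becomes 1.0), which makes the role of each gate easy to check by hand before trained weights are substituted.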
3.1. CNN-LSTM Hybrid Model
3.1.1. Data Source
CTU-13 Dataset
- StartTime: the start time for capturing data traffic;
- Dur: the duration of capture of data traffic or duration of the attack on the devices;
- Proto: the protocol used in the traffic;
- SrcAddr: the source IP address;
- Sport: the source port address;
- Dir: the direction of data flow and attack;
- DstAddr: the destination IP address;
- Dport: the destination port address;
- State: the state during the capture;
- dTos: the destination type of service;
- TotPkts: the total number of packets transferred or received during the capture;
- TotBytes: the total size of packets transferred or received during the capture in bytes;
- SrcBytes: size of packets from the source;
- Label: the traffic label (indicating whether the flow is botnet, background, or normal traffic).
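As an illustration of how the fields listed above might be loaded for preprocessing, the following sketch parses comma-separated flow records into typed values and derives a binary botnet flag from the Label field. The two sample rows, the `Flow` class, and the rule that a label containing "botnet" marks a malicious flow are assumptions made for this example, not details taken from the dataset files.

```python
import csv
import io
from dataclasses import dataclass

# Column names as listed in the text.
COLUMNS = ["StartTime", "Dur", "Proto", "SrcAddr", "Sport", "Dir", "DstAddr",
           "Dport", "State", "dTos", "TotPkts", "TotBytes", "SrcBytes", "Label"]

@dataclass
class Flow:
    dur: float
    proto: str
    state: str
    tot_pkts: int
    tot_bytes: int
    src_bytes: int
    is_botnet: bool

def parse_flows(text):
    """Read CSV text with a header row and convert numeric fields."""
    flows = []
    for row in csv.DictReader(io.StringIO(text)):
        flows.append(Flow(
            dur=float(row["Dur"]),
            proto=row["Proto"],
            state=row["State"],
            tot_pkts=int(row["TotPkts"]),
            tot_bytes=int(row["TotBytes"]),
            src_bytes=int(row["SrcBytes"]),
            # Assumed labeling rule for this sketch only.
            is_botnet="botnet" in row["Label"].lower(),
        ))
    return flows

# Made-up rows for demonstration; addresses use documentation-only ranges.
SAMPLE = "\n".join([
    ",".join(COLUMNS),
    "2011/08/10 09:46:53,1.5,tcp,192.0.2.10,1025,->,198.51.100.7,80,S_RA,0,4,276,156,flow=From-Botnet-Example",
    "2011/08/10 09:46:54,0.2,udp,192.0.2.11,5353,<->,198.51.100.8,53,CON,0,2,214,81,flow=Background-Example",
])

flows = parse_flows(SAMPLE)
print(flows[0])
```

Categorical fields such as Proto and State are kept as strings here; they would still need to be encoded numerically before being passed to a neural network.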
IoT-23 Dataset
- Attack: The infected device attempts to exploit a vulnerability in another host.
- Benign: The connections do not show any suspicious or malicious activity.
- C&C: The infected device is connected to a Command & Control server.
- DDoS: The infected device executes a distributed denial of service (DDoS) attack.
- FileDownload: The infected device downloads a file.
- HeartBeat: The packets sent over this connection are used by the Command & Control server to keep track of the infected host.
- Mirai: The connections exhibit characteristics of a Mirai botnet.
- Okiru: The connections exhibit the characteristics of an Okiru botnet.
- PartOfAHorizontalPortScan: The connections are used to perform a horizontal port scan to gather information for potential future attacks.
- Torii: The connections have the characteristics of a Torii botnet.
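Before training, categorical labels such as those above are typically mapped to integers or one-hot vectors. The sketch below shows one possible encoding that uses only the ten label names listed here; the index order and the binary benign/malicious rule are choices made for this example.

```python
# Label names exactly as listed in the text; the ordering is arbitrary.
IOT23_LABELS = [
    "Attack", "Benign", "C&C", "DDoS", "FileDownload", "HeartBeat",
    "Mirai", "Okiru", "PartOfAHorizontalPortScan", "Torii",
]
LABEL_TO_INDEX = {name: idx for idx, name in enumerate(IOT23_LABELS)}

def encode_label(label):
    """Map a label string to its integer class index."""
    return LABEL_TO_INDEX[label.strip()]

def one_hot(label):
    """Return a one-hot vector with a 1 at the label's class index."""
    vector = [0] * len(IOT23_LABELS)
    vector[encode_label(label)] = 1
    return vector

def is_malicious(label):
    """Binary view: everything except Benign counts as malicious."""
    return label.strip() != "Benign"

print(encode_label("Mirai"), one_hot("Torii"), is_malicious("Benign"))
```

The multi-class encoding suits a softmax output layer, while the binary view matches a detector that only separates benign from malicious traffic.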
3.1.2. Data Preparation
3.1.3. Model Architecture
Algorithm 1. Pseudo-code of CNN-LSTM
3.1.4. Model Structures
Algorithm 2. Global steps of preprocessing, training, testing, and deployment
4. Experimental Results
4.1. Accuracy (Success Rate)
4.2. Precision
4.3. False-Positive Rate (FPR)
4.4. Recall (Detection Rate)
4.5. F-Score (Harmonic Mean)
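The five measures named in Sections 4.1 to 4.5 have standard definitions in terms of the confusion-matrix counts TP, TN, FP, and FN. The sketch below computes them in plain Python; the function names and the convention of returning 0.0 when a denominator is zero are choices made for this example.

```python
def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return safe_div(2 * precision * recall, precision + recall)

def classification_metrics(tp, tn, fp, fn):
    """Compute the five measures from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # detection rate
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "fpr": safe_div(fp, fp + tn),  # false-positive rate
        "f_score": f_score(precision, recall),
    }

# Illustrative counts, not taken from the experiments.
metrics = classification_metrics(tp=30, tn=50, fp=10, fn=10)
print(metrics)

# Consistency check against reported values: the CTU13 "Without Sampling"
# precision and recall in the first results table.
reported_f = f_score(0.886515, 0.736959)
print(round(reported_f, 6))
```

The last two lines show that the CTU13 "Without Sampling" precision (0.886515) and recall (0.736959) reproduce the F-score of 0.804848 listed in the same table.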
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
CA | Cybersecurity Analytics |
CTI | Cyber Threat Intelligence |
DF | Digital Forensics |
GDP | Gross Domestic Product |
OSINT | Open-Source INTelligence |
AI | Artificial Intelligence |
DL | Deep Learning |
LSTM | Long Short-Term Memory |
CNN | Convolutional Neural Network |
ML | Machine Learning |
FNN | Feedforward Neural Network |
WE | Word Embedding |
DRNN | Deep Recurrent Neural Network |
SVM | Support Vector Machine |
BLSTM | Bidirectional Long Short-Term Memory |
BiGRU | Bidirectional Gated Recurrent Unit |
TP | True Positive |
TN | True Negative |
FP | False Positive |
FN | False Negative |
IRC | Internet Relay Chat |
P2P | Peer-to-Peer |
HTTP | HyperText Transfer Protocol |
PCAP | Packet Capture |
CSV | Comma-Separated Values |
IoT | Internet of Things |
C&C | Command and Control |
DDoS | Distributed Denial of Service |
List of mathematical symbols | |
X | Input data |
Y | Output |
W | Weight matrix |
b | Bias vector |
* | Convolution operation |
‖ | Concatenation operation |
σ | Sigmoid activation function |
tanh | Hyperbolic tangent activation function |
⊗ | Cross-correlation operation |
⊙ | Element-wise multiplication |
⊕ | Element-wise addition operation |
∇ | Gradient symbol |
∂ | Partial derivative symbol |
Model parameters | |
Mixing coefficient for combining original and synthetic samples in SMOTE |
References
- Wannacry, Petya, Notpetya. Available online: https://www.theguardian.com/technology/2017/dec/30/wannacry-petya-notpetya-ransomware (accessed on 7 December 2022).
- Cyberwarfare Special Report. Available online: https://cybersecurityventures.com/hackerpocalypse-cybercrime-report-2016/ (accessed on 8 December 2022).
- Hacking the Hackers: Understanding Their Mindset and Motivations. Available online: https://www.bluefin.com/bluefin-news/hacking-hackers-mindset-motivations/ (accessed on 11 February 2023).
- FBI: Cybercrime Victims Suffered Losses of Over $6.9B. Available online: https://www.darkreading.com/attacks-breaches/fbi-cybercrime-victims-suffered-losses-of-over-6-9b-in-2021 (accessed on 3 March 2023).
- The Hidden Costs of Cybercrime on Government. Available online: https://www.mcafee.com/blogs/other-blogs/executive-perspectives/the-hidden-costs-of-cybercrime-on-government/ (accessed on 3 March 2023).
- Estimated Cost of Cybercrime Worldwide. Available online: https://www.statista.com/statistics/1280009/cost-cybercrime-worldwide/ (accessed on 3 March 2023).
- Understanding Digital Forensics Process Techniques and Tools. Available online: https://www.bluevoyant.com/knowledge-center/understanding-digital-forensics-process-techniques-and-tools (accessed on 13 December 2022).
- Javed, A.R.; Ahmed, W.; Alazab, M.; Jalil, Z.; Kifayat, K.; Gadekallu, T.R. A comprehensive survey on computer forensics: State-of-the-art, tools, techniques, challenges, and future directions. IEEE Access 2022, 10, 11065–11089.
- What Is Database Forensics. Available online: https://www.salvationdata.com/knowledge/what-is-database-forensics/ (accessed on 13 December 2022).
- Computer Forensics. Available online: https://www.techtarget.com/searchsecurity/definition/computer-forensics (accessed on 13 December 2022).
- Djenna, A.; Bouridane, A.; Rubab, S.; Marou, I.M. Artificial Intelligence-Based Malware Detection, Analysis, and Mitigation. Symmetry 2019, 15, 667.
- Hou, J.; Li, Y.; Yu, J.; Shi, W. A survey on digital forensics in Internet of Things. IEEE Internet Things J. 2019, 7, 1–15.
- Abu Al-Haija, Q.; Zein-Sabatto, S. An efficient deep-learning-based detection and classification system for cyber-attacks in IoT communication networks. Electronics 2020, 9, 2152.
- Ge, M.; Fu, X.; Syed, N.; Baig, Z.; Teo, G.; Robles-Kelly, A. Deep learning-based intrusion detection for IoT networks. In Proceedings of the IEEE 24th Pacific Rim International Symposium on Dependable Computing (PRDC), Kyoto, Japan, 1–3 December 2019.
- McDermott, C.D.; Majdani, F.; Petrovski, A.V. Botnet detection in the internet of things using deep learning approaches. In Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018.
- Van Roosmalen, J.; Vranken, H.; Van Eekelen, M. Applying deep learning on packet flows for botnet detection. In Proceedings of the 33rd Annual ACM Symposium on Applied Computing, Pau, France, 9–13 April 2018.
- Popoola, S.I.; Adebisi, B.; Ande, R.; Hammoudeh, M.; Anoh, K.; Atayero, A.A. Smote-drnn: A deep learning algorithm for botnet detection in the internet-of-things networks. Sensors 2021, 21, 2985.
- Hegde, M.; Kepnang, G.; Al Mazroei, M.; Chavis, J.S.; Watkins, L. Identification of botnet activity in IoT network traffic using machine learning. In Proceedings of the IEEE International Conference on Intelligent Data Science Technologies and Applications (IDSTA), Valencia, Spain, 19–22 October 2020.
- Abdalgawad, N.; Sajun, A.; Kaddoura, Y.; Zualkernan, I.A.; Aloul, F. Generative deep learning to detect cyberattacks for the IoT-23 dataset. IEEE Access 2021, 10, 6430–6441.
- Garcia, S.; Grill, M.; Stiborek, J.; Zunino, A. An empirical comparison of botnet detection methods. Comput. Secur. 2014, 45, 100–123.
- Le, D.C.; Zincir-Heywood, A.N.; Heywood, M.I. Data analytics on network traffic flows for botnet behaviour detection. In Proceedings of the IEEE Symposium Series on Computational Intelligence (SSCI), Athens, Greece, 6–9 December 2016.
- Geetha, K.; Brahmananda, S.H. Network traffic analysis through deep learning for detection of an army of bots in health IoT network. Int. J. Pervasive Comput. Commun. 2022.
- Alauthman, M.; Aslam, N.; Al-kasassbeh, M.; Khan, S.; Al-Qerem, A.; Raymond Choo, K.K. An efficient reinforcement learning-based Botnet detection approach. J. Netw. Comput. Appl. 2020, 150, 102479.
- Kim, J.; Shim, M.; Hong, S.; Shin, Y.; Choi, E. Intelligent detection of iot botnets using machine learning and deep learning. Appl. Sci. 2020, 10, 7009.
- Bijalwan, A. Botnet forensic analysis using machine learning. Secur. Commun. Netw. 2020, 2020, 9302318.
- Popoola, S.I.; Ande, R.; Adebisi, B.; Gui, G.; Hammoudeh, M.; Jogunola, O. Federated deep learning for zero-day botnet attack detection in IoT-edge devices. IEEE Internet Things J. 2021, 9, 3930–3944.
- Shareena, J.; Ramdas, A.; AP, H. Intrusion detection system for iot botnet attacks using deep learning. SN Comput. Sci. 2021, 2, 205.
- Asadi, M. Detecting IoT botnets based on the combination of cooperative game theory with deep and machine learning approaches. J. Ambient. Intell. Humaniz. Comput. 2022, 13, 5547–5561.
- Hasan, N.; Chen, Z.; Zhao, C.; Zhu, Y.; Liu, C. IoT Botnet Detection framework from Network Behavior based on Extreme Learning Machine. In Proceedings of the IEEE INFOCOM 2022 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), New York, NY, USA, 2–5 May 2022.
- Bojarajulu, B.; Tanwar, S.; Singh, T.P. Intelligent IoT-BOTNET attack detection model with optimized hybrid classification model. Comput. Secur. 2023, 126, 103064.
- Moorthy, R.S.S.; Nathiya, N. Botnet Detection Using Artificial Intelligence. Procedia Comput. Sci. 2023, 218, 1405–1413.
- Guerra-Manzanares, A.; Bahsi, H. On the application of active learning for efficient and effective IoT botnet detection. Future Gener. Comput. Syst. 2023, 141, 40–53.
- Djenna, A.; Saidouni, D.E.; Abada, W. A pragmatic cybersecurity strategies for combating iot-cyberattacks. In Proceedings of the IEEE International Symposium on Networks, Computers and Communications (ISNCC), Montreal, QC, Canada, 20–22 October 2020.
- 2021 Interpol Report. Available online: https://www.interpol.int/content/download/17965/file/INTERPOL/Annual/Report/2021_EN (accessed on 23 February 2023).
- Li, T.; Hua, M.; Wu, X. A hybrid CNN-LSTM model for forecasting particulate matter (PM2.5). IEEE Access 2020, 8, 26933–26940.
- Cell Classification in Machine Learning. Available online: https://www.madrasresearch.org/post/cell-classification-in-machine-learning (accessed on 22 May 2023).
- Roshan, S.; Srivathsan, G.; Deepak, K.; Chandrakala, S. Violence detection in automated video surveillance: Recent trends and comparative studies. Cogn. Approach Cloud Comput. Internet Things Technol. Surveill. Track. Syst. 2020, 157–171.
- Li, Y.H.; Harfiya, L.N.; Purwandari, K.; Lin, Y.D. Real-time cuffless continuous blood pressure estimation using deep learning model. Sensors 2020, 20, 5606.
- CTU-13 Dataset. Available online: https://www.stratosphereips.org/datasets-ctu13 (accessed on 17 June 2022).
- IoT-23 Dataset. Available online: https://www.stratosphereips.org/datasets-iot23 (accessed on 30 June 2022).
- Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437.
- Nguyen, H.T.; Ngo, Q.D.; Le, V.H. IoT Botnet Detection Approach Based on PSI graph and DGCNN classifier. In Proceedings of the IEEE International Conference on Information Communication and Signal Processing (ICICSP), Singapore, 28–30 September 2018.
- Letteri, I.; Della Penna, G.; Caianiello, P. Feature selection strategies for http botnet traffic detection. In Proceedings of the 4th IEEE European Symposium on Security and Privacy Workshops EUROS and PW, Stockholm, Sweden, 17–19 June 2019.
- Jimenez, F.; Martinez, C.; Marzano, E.; Palma, J.T.; Sanchez, G.; Sciavicco, G. Multiobjective evolutionary feature selection for fuzzy classification. IEEE Trans. Fuzzy Syst. 2019, 27, 1085–1099.
- Tama, B.A.; Comuzzi, M.; Rhee, K.-H. TSE-IDS: A two-stage classifier ensemble for intelligent anomaly-based intrusion detection system. IEEE Access 2019, 7, 94497–94507.
- Zhao, F.; Xin, Y.; Zhang, K.; Niu, X. Representativeness-based instance selection for intrusion detection. Secur. Commun. Netw. 2021, 2021, 6638134.
- Kannari, P.R.; Shariff, N.C.; Biradar, R.L. Network intrusion detection using sparse autoencoder with swish-PReLU activation model. J. Ambient. Intell. Humaniz. Comput. 2021, 1–13.
- Lo, W.W.; Kulatilleke, G.; Sarhan, M.; Layeghy, S.; Portmann, M. XG-BoT: An explainable deep graph neural network for botnet detection and forensics. Internet Things 2023, 22, 100747.
Work | Year | Journal | Method | Pros | Cons |
---|---|---|---|---|---|
[23] | 2020 | High Speed Networks | ML | Effective in detecting patterns and anomalies | May require a large labeled dataset |
[24] | 2020 | Applied Sciences | ML, DL | Capable of learning complex patterns | High computational complexity |
[25] | 2020 | Security and Communication Networks | ML | Can identify hidden patterns and correlations | Limitations in handling new attacks |
[26] | 2021 | IEEE Internet of Things | Federated DL | Detection of zero-day botnet attacks | Synchronization and communication challenges |
[27] | 2021 | SN Computer Science | DL | Can learn intricate features | Requires large amounts of labeled data |
[28] | 2022 | Ambient Intelligence and Humanized Computing | Game theory, DL | Models the strategic behavior of attackers | Requires extensive computational resources |
[29] | 2022 | IEEE INFOCOM | Extreme learning | Fast and efficient learning | Requires fine-tuning for optimal performance |
[30] | 2023 | Computers & Security | BiGRU-RNN | Improved accuracy in detecting IoT botnet attacks | Increased complexity and resource requirements |
[31] | 2023 | Computer Science | SVM | Adaptable to dynamic botnets | Requires extensive computational resources |
[32] | 2023 | Future Generation Computer Systems | Active learning | Minimizes the labeling cost for the IoT botnet detection | Did not explore the implications and relation of specific features |
Id | IRC | SPAM | CF | PS | DDoS | FF | P2P | US | HTTP |
---|---|---|---|---|---|---|---|---|---|
1 | X | X | X | ||||||
2 | X | X | X | ||||||
3 | X | X | X | ||||||
4 | X | X | X | ||||||
5 | X | X | X | ||||||
6 | X | ||||||||
7 | X | ||||||||
8 | X | ||||||||
9 | X | X | X | X | |||||
10 | X | X | X | ||||||
11 | X | X | X | ||||||
12 | X | ||||||||
13 | X | X | X |
Scenarios | Type | Capture Name | Malware/Device | Duration | Number of Packets | Total Flows |
---|---|---|---|---|---|---|
Scenario 1 | Malicious | CTU-IoT-Malware-Cap-34-1 | Mirai | 24,000 | 233,000 | 23,146,000 |
Scenario 2 | Malicious | CTU-IoT-Malware-Cap-43-1 | Mirai | 1000 | 82,000,000 | 67,321,810,000 |
Scenario 3 | Malicious | CTU-IoT-Malware-Cap-44-1 | Mirai | 2000 | 1,309,000 | 238,000 |
Scenario 4 | Malicious | CTU-IoT-Malware-Cap-49-1 | Mirai | 8000 | 18,000,000 | 5,410,562,000 |
Scenario 5 | Malicious | CTU-IoT-Malware-Cap-52-1 | Mirai | 24,000 | 64,000,000 | 19,781,379,000 |
Scenario 6 | Malicious | CTU-IoT-Malware-Cap-20-1 | Torii | 24,000 | 50,000 | 3,210,000 |
Scenario 7 | Malicious | CTU-IoT-Malware-Cap-21-1 | Torii | 24,000 | 50,000 | 3,287,000 |
Scenario 8 | Malicious | CTU-IoT-Malware-Cap-42-1 | Trojan | 8000 | 24,000 | 4,427,000 |
Scenario 9 | Malicious | CTU-IoT-Malware-Cap-60-1 | Gagfyt | 24,000 | 271,000,000 | 3,581,029,000 |
Scenario 10 | Malicious | CTU-IoT-Malware-Cap-17-1 | Kenjiro | 24,000 | 109,000,000 | 54,659,864,000 |
Scenario 11 | Malicious | CTU-IoT-Malware-Cap-36-1 | Okiru | 24,000 | 13,000,000 | 13,645,107,000 |
Scenario 12 | Malicious | CTU-IoT-Malware-Cap-33-1 | Kenjiro | 24,000 | 54,000,000 | 54,454,592,000 |
Scenario 13 | Malicious | CTU-IoT-Malware-Cap-8-1 | Hakai | 24,000 | 23,000 | 10,404,000 |
Scenario 14 | Malicious | CTU-IoT-Malware-Cap-35-1 | Mirai | 24,000 | 46,000,000 | 10,447,796,000 |
Scenario 15 | Malicious | CTU-IoT-Malware-Cap-48-1 | Mirai | 24,000 | 13,000,000 | 3,394,347,000 |
Scenario 16 | Malicious | CTU-IoT-Malware-Cap-39-1 | IRCBot | 7000 | 73,000,000 | 73,568,982,000 |
Scenario 17 | Malicious | CTU-IoT-Malware-Cap-7-1 | Linux, Mirai | 24,000 | 11,000,000 | 11,454,723,000 |
Scenario 18 | Malicious | CTU-IoT-Malware-Cap-9-1 | Linux, Hajime | 24,000 | 6,437,000 | 6,378,294,000 |
Scenario 19 | Malicious | CTU-IoT-Malware-Cap-3-1 | Muhstik | 36,000 | 496,000 | 156,104,000 |
Scenario 20 | Malicious | CTU-IoT-Malware-Cap-1-1 | Hide and Seek | 112,000 | 1,686,000 | 1,008,749,000 |
Scenario 21 | Benign | CTU-Honeypot-Cap-7-1 | Somfy Door Lock | 1400 | 8276 | 139,000 |
Scenario 22 | Benign | CTU-Honeypot-Cap-4-1 | Philips HUE | 24,000 | 21,000,000 | 461,000 |
Scenario 23 | Benign | CTU-Honeypot-Cap-5-1 | Amazon Echo | 5400 | 398,000,000 | 1,383,000 |
Metric | Dataset | Without Sampling | CallBacks | Random Under Sampler | SMOTE | SMOTE Tomek | Borderline SMOTE | ADASYN |
---|---|---|---|---|---|---|---|---|
Accuracy | CTU13 | 0.997520 | 0.998044 | 0.971556 | 0.997805 | 0.995140 | 0.993538 | 0.993538 |
Accuracy | IoT23 | 0.952365 | 0.945287 | 0.892822 | 0.896551 | 0.945287 | 0.892836 | 0.896751 |
Precision | CTU13 | 0.886515 | 0.868871 | 0.195595 | 0.761460 | 0.588351 | 0.701167 | 0.517816 |
Precision | IoT23 | 0.736959 | 0.845727 | 0.995560 | 0.995560 | 0.997780 | 0.999970 | 0.999989 |
Recall | CTU13 | 0.736959 | 0.845727 | 0.995560 | 0.995560 | 0.997780 | 1 | 1 |
Recall | IoT23 | 0.991621 | 1 | 0.886644 | 0.890587 | 1 | 0.886660 | 0.890785 |
F-Score | CTU13 | 0.804848 | 0.857143 | 0.326955 | 0.862915 | 0.740222 | 0.824337 | 0.682317 |
F-Score | IoT23 | 0.975221 | 0.971874 | 0.939904 | 0.942116 | 0.971874 | 0.939912 | 0.942233 |
FPR | CTU13 | 0.000659 | 0.000892 | 0.028612 | 0.002179 | 0.004879 | 0.002978 | 0.006507 |
FPR | IoT23 | 0.725878 | 1 | 0.000437 | 0.000402 | 1 | 0.000455 | 0.000175 |
Metric | Dataset | Without Sampling | CallBacks | Random Under Sampler | SMOTE | SMOTE Tomek | Borderline SMOTE | ADASYN |
---|---|---|---|---|---|---|---|---|
Accuracy | CTU13 | 0.997135 | 0.991959 | 0.935047 | 0.987422 | 0.977201 | 0.972064 | 0.972064 |
Accuracy | IoT23 | 0.944316 | 0.889367 | 0.945266 | 0.892805 | 0.962097 | 0.878699 | 0.892822 |
Precision | CTU13 | 0.872011 | 0.455946 | 0.091452 | 0.330241 | 0.224216 | 0.190368 | 0.195288 |
Precision | IoT23 | 0.958250 | 0.995851 | 0.945288 | 0.999970 | 0.154402 | 0.995798 | 0.958250 |
Recall | CTU13 | 0.688124 | 0.821310 | 0.935627 | 0.790233 | 0.928968 | 0.970936 | 0.930078 |
Recall | IoT23 | 0.983963 | 0.886658 | 0.999976 | 0.886627 | 0.996670 | 0.875372 | 0.983963 |
F-Score | CTU13 | 0.769231 | 0.586371 | 0.166617 | 0.465816 | 0.361243 | 0.970936 | 0.316048 |
F-Score | IoT23 | 0.970936 | 0.938087 | 0.971863 | 0.939894 | 0.267381 | 0.931710 | 0.970936 |
FPR | CTU13 | 0.000706 | 0.006849 | 0.064957 | 0.011200 | 0.022462 | 0.000892 | 0.027643 |
FPR | IoT23 | 0.740679 | 0.063824 | 0.999965 | 0.000455 | 0.038144 | 0.063824 | 0.970936 |
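Several columns in the results tables above refer to SMOTE-based oversampling, and the symbol list mentions a mixing coefficient for combining original and synthetic samples. As a rough illustration of that idea, the sketch below generates synthetic minority samples by interpolating between a minority sample and its nearest minority neighbour. It is a simplified variant that uses a single nearest neighbour rather than a random choice among k neighbours, and the function names, the sample points, and the seed are assumptions made for this example.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_neighbor(x, others):
    """Return the point in others closest to x (Euclidean distance)."""
    return min(others, key=lambda p: squared_distance(x, p))

def interpolate(x, neighbor, lam):
    """Synthetic point x + lam * (neighbor - x), with lam in [0, 1]."""
    return [a + lam * (b - a) for a, b in zip(x, neighbor)]

def oversample(minority, n_new, seed=0):
    """Create n_new synthetic samples from a list of minority-class points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [p for p in minority if p is not x]
        neighbor = nearest_neighbor(x, others)
        lam = rng.random()  # mixing coefficient drawn uniformly from [0, 1)
        synthetic.append(interpolate(x, neighbor, lam))
    return synthetic

# Illustrative two-dimensional minority class.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = oversample(minority, n_new=5)
print(synthetic)
```

Because every synthetic point lies on a segment between two existing minority samples, the new samples stay inside the region the minority class already occupies.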
Work | Year | Method | Dataset | Accuracy |
---|---|---|---|---|
[42] | 2018 | PSI Graph CNN Classifier | IoTPOT-IotBotnet | 92% |
[43] | 2019 | Decision Tree | CTU-13 | 97.54% |
[44] | 2019 | MEFC | Real life dataset | 87.04% |
[45] | 2019 | Hybrid feature selection | NSL-KDD UNSW-NB15 | 91.27% |
[23] | 2020 | Reinforcement learning | ISOT, P2P, ISCX | 98.3% |
[46] | 2021 | Representativeness-based instance selection | KDD Cup 99 | 94.25% |
[47] | 2021 | Sparse autoencoder | NSL-KDD CIC-IDS2017 AWID | 98.10% |
[29] | 2022 | Extreme learning | MedBIoT | 97.7% |
[30] | 2023 | SVM DT MLP | CTU-13 | 92% |
[32] | 2023 | Active learning | MedBIoT | 97% |
[48] | 2023 | BiGRU-RNN | IoT-bot | 97% |
Proposed | 2023 | Hybrid CNN-LSTM | CTU-13 IoT-23 | 98.74% 98.29% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Djenna, A.; Barka, E.; Benchikh, A.; Khadir, K. Unmasking Cybercrime with Artificial-Intelligence-Driven Cybersecurity Analytics. Sensors 2023, 23, 6302. https://doi.org/10.3390/s23146302