
Performance evaluation of intrusion detection based on machine learning using Apache Spark

Published: 01 May 2018

Abstract

Network intrusion is one of the major concerns in network communications, and network intrusion detection systems aim to identify attacks or malicious activity in a network environment. Various methods have already been proposed to detect and prevent intrusions effectively and efficiently, and so ensure network security and privacy. Machine learning is an effective analysis framework for detecting anomalous events in network traffic flow. Based on this framework, this paper evaluates the performance of four well-known classification algorithms (SVM, Naïve Bayes, Decision Tree and Random Forest) for intrusion detection in network traffic, using Apache Spark, a big data processing tool. The comparison covers detection accuracy, building time and prediction time. Experimental results on UNSW-NB15, a recent public dataset for network intrusion detection, show that, on the complete dataset with all 42 features, the Random Forest classifier has a clear advantage over the other classifiers in detection accuracy and prediction time.
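
The three metrics named in the abstract (detection accuracy, building time and prediction time) can be illustrated with a small timing harness. The sketch below is not the authors' code and does not use Spark: `MajorityClassifier`, `EvalResult`, `evaluate` and the toy records are hypothetical names and data introduced only for illustration. It assumes any model exposing `fit` and `predict`, and that building time and prediction time mean wall-clock time spent in those two calls.

```python
import time
from collections import Counter
from dataclasses import dataclass


@dataclass
class EvalResult:
    accuracy: float         # fraction of test records labelled correctly
    build_seconds: float    # wall-clock time spent in fit()
    predict_seconds: float  # wall-clock time spent in predict()


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, features, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, features):
        return [self.label_ for _ in features]


def evaluate(model, train_x, train_y, test_x, test_y):
    """Time training and prediction separately, then score accuracy."""
    start = time.perf_counter()
    model.fit(train_x, train_y)
    build_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predicted = model.predict(test_x)
    predict_seconds = time.perf_counter() - start

    correct = sum(p == t for p, t in zip(predicted, test_y))
    return EvalResult(correct / len(test_y), build_seconds, predict_seconds)


# Toy records made up for illustration: label 0 = normal, label 1 = attack.
train_x = [[0.1], [0.2], [0.9], [0.3]]
train_y = [0, 0, 1, 0]
test_x = [[0.15], [0.95], [0.25], [0.05]]
test_y = [0, 1, 0, 0]

result = evaluate(MajorityClassifier(), train_x, train_y, test_x, test_y)
print(f"accuracy={result.accuracy:.2f}")
```

Timing the two phases separately matters because a classifier that is slow to build can still be fast to apply, and the abstract reports the Random Forest advantage for prediction time and accuracy rather than for building time. In a Spark setting the same wrapper could surround whatever training and prediction calls the pipeline uses; that mapping is an assumption and is not shown here.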


Published In

Procedia Computer Science, Volume 127, Issue C
May 2018
552 pages
ISSN: 1877-0509
EISSN: 1877-0509

Publisher

Elsevier Science Publishers B. V.

Netherlands


Author Tags

  1. Apache Spark
  2. Intrusion Detection
  3. Machine Learning

Qualifiers

  • Research-article


Cited By

  • (2024) Anomaly-Based Intrusion Detection System in Wireless Sensor Networks Using Machine Learning Algorithms. Applied Computational Intelligence and Soft Computing, 2024. DOI: 10.1155/2024/2625922. Online publication date: 1-Jan-2024
  • (2024) Design of Intrusion Detection System Using GA and CNN for MQTT-Based IoT Networks. Wireless Personal Communications: An International Journal, 134:4 (2059-2082). DOI: 10.1007/s11277-024-10984-w. Online publication date: 1-Feb-2024
  • (2024) Intrusion detection based on ensemble learning for big data classification. Cluster Computing, 27:3 (3771-3798). DOI: 10.1007/s10586-023-04168-7. Online publication date: 1-Jun-2024
  • (2023) Intrusion detection in big data environment using hybrid deep learning algorithm (VAE-CNN). Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 45:5 (8637-8649). DOI: 10.3233/JIFS-234311. Online publication date: 4-Nov-2023
  • (2023) NIDS-VSB: Network Intrusion Detection System for VANET Using Spark-Based Big Data Optimization and Transfer Learning. IEEE Transactions on Consumer Electronics, 70:1 (1798-1809). DOI: 10.1109/TCE.2023.3328320. Online publication date: 30-Oct-2023
  • (2022) Cloud-based multiclass anomaly detection and categorization using ensemble learning. Journal of Cloud Computing: Advances, Systems and Applications, 11:1. DOI: 10.1186/s13677-022-00329-y. Online publication date: 3-Nov-2022
  • (2022) Design of Network Intrusion Detection Model Based on TCA. Security and Communication Networks, 2022. DOI: 10.1155/2022/9248853. Online publication date: 1-Jan-2022
  • (2022) iNIDS. Computer Communications, 195:C (227-247). DOI: 10.1016/j.comcom.2022.08.022. Online publication date: 1-Nov-2022
  • (2022) Hyperparameter search based convolution neural network with Bi-LSTM model for intrusion detection system in multimedia big data environment. Multimedia Tools and Applications, 81:24 (34951-34968). DOI: 10.1007/s11042-021-11271-7. Online publication date: 1-Oct-2022
  • (2021) Assessing Security of Software Components for Internet of Things. Security and Communication Networks, 2021. DOI: 10.1155/2021/6677867. Online publication date: 1-Jan-2021
