research-article

Intrusion Detection Using Big Data and Deep Learning Techniques

Authors:

Erdogan Dogdu

ACMSE '19: Proceedings of the 2019 ACM Southeast Conference

Pages 86 - 93

https://doi.org/10.1145/3299815.3314439

Published: 18 April 2019

Abstract

In this paper, big data and deep learning techniques are integrated to improve the performance of intrusion detection systems. Three classifiers are used to classify network traffic datasets: a deep feed-forward neural network (DNN) and two ensemble techniques, Random Forest and Gradient Boosting Tree (GBT). To select the most relevant attributes from the datasets, we use a homogeneity metric to evaluate features. Two recently published datasets, UNSW NB15 and CICIDS2017, are used to evaluate the proposed method, and the machine learning models are evaluated with 5-fold cross-validation. We implemented the method on the distributed computing environment Apache Spark: the deep learning technique uses the Keras deep learning library integrated with Spark, while the ensemble techniques use the Apache Spark machine learning library. On the UNSW NB15 dataset, DNN achieves high accuracy for both tasks, with 99.16% for binary classification and 97.01% for multiclass classification. On the CICIDS2017 dataset, the GBT classifier achieves the best binary classification accuracy at 99.99%, while DNN has the highest multiclass classification accuracy at 99.56%.
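The feature evaluation and validation steps mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation, which runs on Apache Spark and Keras. It is plain Python, it assumes the standard homogeneity definition from the V-measure (h = 1 - H(C|K) / H(C)), and it uses equal-width binning of each feature as a hypothetical stand-in for whatever grouping the paper applies before scoring. All function names and the toy data are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def homogeneity(classes, clusters):
    """Homogeneity h = 1 - H(C|K) / H(C); 1.0 means each cluster holds one class."""
    h_c = entropy(classes)
    if h_c == 0.0:
        return 1.0  # a single class is trivially homogeneous
    n = len(classes)
    joint = Counter(zip(classes, clusters))
    cluster_sizes = Counter(clusters)
    # H(C|K) = -sum over (class, cluster) of (n_ck / n) * log(n_ck / n_k)
    h_c_given_k = -sum(
        (n_ck / n) * math.log(n_ck / cluster_sizes[k])
        for (_, k), n_ck in joint.items()
    )
    return 1.0 - h_c_given_k / h_c


def equal_width_bins(values, n_bins=2):
    """Assumed stand-in for clustering: map each value to an equal-width bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def rank_features(rows, labels, n_bins=2):
    """Return (score, column_index) pairs, highest homogeneity first."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((homogeneity(labels, equal_width_bins(column, n_bins)), j))
    return sorted(scores, reverse=True)


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Toy traffic records (made-up values): column 0 separates the classes, column 1 does not.
toy_rows = [[0.1, 5.0], [0.2, 1.0], [0.9, 5.0], [0.8, 1.0]]
toy_labels = ["normal", "normal", "attack", "attack"]
print([j for _, j in rank_features(toy_rows, toy_labels)])  # column 0 ranks first
```

In a real pipeline the top-ranked columns would be passed to the classifiers, and each of the five (train, test) splits from `k_fold_indices` would be used once for evaluation, with the accuracies averaged. Shuffling or stratifying the records before splitting is usually advisable and is omitted here for brevity.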

Index Terms

Computing methodologies → Machine learning → Learning paradigms → Supervised learning → Supervised learning by classification
Security and privacy → Intrusion/anomaly detection and malware mitigation

Information & Contributors

Information

Published In

ACMSE '19: Proceedings of the 2019 ACM Southeast Conference

April 2019

295 pages

ISBN:9781450362511

DOI:10.1145/3299815

Conference Chair: Dan Lo, Kennesaw State University
Program Chair: Donghyun Kim, Kennesaw State University
Publications Chair: Eric Gamess, Jacksonville State University

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

ACM: Association for Computing Machinery

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ACM SE '19: 2019 ACM Southeast Conference

Sponsor: ACM

April 18 - 20, 2019

Kennesaw, GA, USA

Acceptance Rates

Overall Acceptance Rate 502 of 1,023 submissions, 49%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 142
Total Downloads: 1,609
Downloads (Last 12 months): 120
Downloads (Last 6 weeks): 18

Reflects downloads up to 14 Dec 2024

Citations

Cited By

Ma Z, Chen Z, Zheng X, Wang T, You Y, Zou S, Wang Y (2024). A Biological Immunity-Based Neuro Prototype for Few-Shot Anomaly Detection with Character Embedding. Cyborg and Bionic Systems 5. Online publication date: 16-Jan-2024.
https://doi.org/10.34133/cbsystems.0086
Rai H, Yoo J, Agarwal S (2024). The Improved Network Intrusion Detection Techniques Using the Feature Engineering Approach with Boosting Classifiers. Mathematics 12:24 (3909). Online publication date: 11-Dec-2024.
https://doi.org/10.3390/math12243909
Genuario F, Santoro G, Giliberti M, Bello S, Zazzera E, Impedovo D (2024). Machine Learning-Based Methodologies for Cyber-Attacks and Network Traffic Monitoring: A Review and Insights. Information 15:11 (741). Online publication date: 20-Nov-2024.
https://doi.org/10.3390/info15110741
Feng Y, Yang Z, Sun Q, Liu Y (2024). SEDAT: A Stacked Ensemble Learning-Based Detection Model for Multiscale Network Attacks. Electronics 13:15 (2953). Online publication date: 26-Jul-2024.
https://doi.org/10.3390/electronics13152953
Muneer S, Farooq U, Athar A, Ahsan Raza M, Ghazal T, Sakib S (2024). A Critical Review of Artificial Intelligence Based Approaches in Intrusion Detection: A Comprehensive Analysis. Journal of Engineering 2024 (1-16). Online publication date: 15-Apr-2024.
https://doi.org/10.1155/2024/3909173
Mohammed Sayem I, Islam Sayed M, Saha S, Haque A (2024). ENIDS: A Deep Learning-Based Ensemble Framework for Network Intrusion Detection Systems. IEEE Transactions on Network and Service Management 21:5 (5809-5825). Online publication date: Oct-2024.
https://doi.org/10.1109/TNSM.2024.3414305
Obamiyi S, Adebusuyi A, Oguntimilehin A, Badeji-Ajisafe B, Okebule T, Abiola O, Akinduyite C, Babalola G, Tope-Oke A (2024). A Network Intrusion Detection Model for IoT Networks. 2024 International Conference on Science, Engineering and Business for Driving Sustainable Development Goals (SEB4SDG) (1-8). Online publication date: 2-Apr-2024.
https://doi.org/10.1109/SEB4SDG60871.2024.10629842
Mamdouh H, Tarek M, Radwan A, Saeed A, Abdeen A, Ashraf M, Fouad K, Abdelbaky I (2024). Apache Spark Powered: Enhancing Network Intrusion Detection System Using Random Forest. 2024 6th Novel Intelligent and Leading Emerging Sciences Conference (NILES) (289-294). Online publication date: 19-Oct-2024.
https://doi.org/10.1109/NILES63360.2024.10753188
Lunawat S, Rao J, Patil P (2024). A Comprehensive Survey on Anomaly Detection in Social Media Networks: Challenges, Methods, and Future Directions. 2024 4th International Conference on Sustainable Expert Systems (ICSES) (363-370). Online publication date: 15-Oct-2024.
https://doi.org/10.1109/ICSES63445.2024.10763303
Ghamya K, Prema K, Kumar P, Reddy P, Reddy P, Naidu M (2024). Deep Attention Learning for Extreme Minority Class Intrusion Detection in Network Traffic. 2024 International Conference on Knowledge Engineering and Communication Systems (ICKECS) (1-9). Online publication date: 18-Apr-2024.
https://doi.org/10.1109/ICKECS61492.2024.10617078
