Abstract
Recent work in network intrusion detection uses machine learning (ML) algorithms to detect anomalies in network traffic. Network traffic flows are described by high-dimensional feature sets, and this dimensionality significantly affects data-driven approaches. The performance of ML-based approaches therefore depends mainly on choosing an appropriate set of features from the network data. Feature selection and feature extraction methods are widely used to obtain an informative and compact feature set. Existing methods often fall short of the expected performance because they neither remove redundant features effectively nor incorporate features that carry complementary information. In this paper, we present a cluster-based feature extraction method using Mahalanobis distance (cFEM) that clusters correlated features and extracts new feature representations based on a distance metric. The features extracted in the transformed dimensions are used to train different machine learning classifiers. We conducted extensive experiments on three well-known datasets. The results show that cFEM outperforms state-of-the-art intrusion detection methods on several performance metrics, including detection rate (99.61%) and false alarm rate (0.26%). Further experiments show that the extracted features are discriminative, free of redundancy, and able to capture complementary information.
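To make the idea concrete, the following is a minimal illustrative sketch of cluster-then-distance feature extraction. It is not the cFEM implementation, since the abstract does not give the algorithmic details. The sketch assumes that features are grouped by hierarchical clustering on 1 - |Pearson correlation| and that each group is reduced to one new feature, the Mahalanobis distance of a sample to the group mean. The function names, the 0.5 threshold, and the average linkage are hypothetical choices. The metric helper uses the standard definitions, detection rate = TP / (TP + FN) and false alarm rate = FP / (FP + TN), which are assumed to match the paper's.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_features(X, threshold=0.5):
    """Group columns of X by correlation (assumed scheme, not the paper's).

    Constant columns must be removed beforehand, because their
    correlation is undefined.
    """
    corr = np.corrcoef(X, rowvar=False)
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)  # correlated => close
    dist = (dist + dist.T) / 2.0                   # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    labels = fcluster(tree, t=threshold, criterion="distance")
    return [np.flatnonzero(labels == k) for k in np.unique(labels)]


def mahalanobis_features(X, groups, ref=None):
    """Return one feature per group: Mahalanobis distance to the group mean.

    Mean and covariance are estimated on `ref` (for example the training
    split) and default to X itself.
    """
    ref = X if ref is None else ref
    out = np.empty((X.shape[0], len(groups)))
    for j, idx in enumerate(groups):
        mu = ref[:, idx].mean(axis=0)
        cov = np.atleast_2d(np.cov(ref[:, idx], rowvar=False))
        inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates collinearity
        diff = X[:, idx] - mu
        d2 = np.einsum("ij,jk,ik->i", diff, inv, diff)
        out[:, j] = np.sqrt(np.maximum(d2, 0.0))
    return out


def detection_and_false_alarm_rate(y_true, y_pred):
    """Standard DR and FAR for binary labels (1 = attack, 0 = normal)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    return tp / (tp + fn), fp / (fp + tn)


# Synthetic demo: four features forming two strongly correlated pairs.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([
    base[:, 0], base[:, 0] + 0.05 * rng.normal(size=200),
    base[:, 1], base[:, 1] + 0.05 * rng.normal(size=200),
])
groups = cluster_features(X)
Z = mahalanobis_features(X, groups)  # shape: (n_samples, n_groups)
```

In this sketch the matrix Z would replace the original features as input to a classifier. Estimating the mean and covariance on the training split only, through the `ref` argument, avoids leaking test statistics into the extracted features.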
Data availability
The network traffic data that support the findings of this study are publicly available in the UNSW and UNB repositories. The persistent links to the datasets are: https://research.unsw.edu.au/projects/unsw-nb15-dataset, https://www.unb.ca/cic/datasets/nsl.html and https://www.unb.ca/cic/datasets/ids-2017.html.
Acknowledgements
This research was supported by a research grant (56.00.0000.028.33.005.20-120) from the Innovation Fund (2020-21) of the ICT Division, Ministry of Posts, Telecommunications and Information Technology, Bangladesh.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with animals performed by any of the authors.
About this article
Cite this article
Mazumder, M.M.H.U., Kadir, M.E., Sharmin, S. et al. cFEM: a cluster based feature extraction method for network intrusion detection. Int. J. Inf. Secur. 22, 1355–1369 (2023). https://doi.org/10.1007/s10207-023-00694-y
DOI: https://doi.org/10.1007/s10207-023-00694-y