research-article

A multi-step outlier-based anomaly detection approach to network-wide traffic

Published: 20 June 2016 Publication History

Abstract

  • We propose a multi-step outlier-based anomaly detection approach to network-wide traffic.
  • We propose a feature selection algorithm to select a relevant, non-redundant subset of features.
  • We propose a tree-based clustering algorithm to generate non-redundant overlapped clusters.
  • We introduce an efficient score-based outlier estimation technique to detect anomalies in network-wide traffic.
  • We establish a fast distributed feature extraction framework to extract significant features from raw network-wide traffic.
  • We conduct extensive experiments using the proposed algorithms with synthetic and real-life network-wide traffic datasets.

Outlier detection is of considerable interest in fields such as the physical sciences, medical diagnosis, surveillance detection, fraud detection and network anomaly detection. The data mining and network management research communities are interested in improving existing score-based network traffic anomaly detection techniques because there is ample scope to improve their performance. In this paper, we present a multi-step outlier-based approach for detecting anomalies in network-wide traffic. We identify a subset of relevant traffic features and use it during clustering and anomaly detection. To support outlier-based network anomaly identification, we use the following modules: a mutual information and generalized entropy based feature selection technique to select a relevant, non-redundant subset of features; a tree-based clustering technique to generate a set of reference points; and an outlier score function that ranks incoming network traffic to identify anomalies. We also design a fast distributed feature extraction and data preparation framework to extract features from raw network-wide traffic. We evaluate our approach in terms of detection rate, false positive rate, precision, recall and F-measure using several high-dimensional synthetic and real-world datasets, and find that it outperforms competing algorithms.
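The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `Confusion`, `confusion` and `metrics` are hypothetical, labels are assumed to be binary (1 = anomaly, 0 = normal), and detection rate is assumed to mean the true positive rate, which is a common convention that the abstract itself does not define.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # anomalies correctly flagged as anomalies
    fp: int  # normal records wrongly flagged as anomalies
    tn: int  # normal records correctly left unflagged
    fn: int  # anomalies that were missed


def confusion(y_true, y_pred):
    """Count outcomes for binary labels (1 = anomaly, 0 = normal)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return Confusion(tp, fp, tn, fn)


def _ratio(num, den):
    # Return 0.0 instead of raising when a class is absent.
    return num / den if den else 0.0


def metrics(c):
    """Compute the five measures; detection rate is assumed equal to recall."""
    recall = _ratio(c.tp, c.tp + c.fn)
    precision = _ratio(c.tp, c.tp + c.fp)
    fpr = _ratio(c.fp, c.fp + c.tn)
    f_measure = _ratio(2 * precision * recall, precision + recall)
    return {
        "detection_rate": recall,
        "false_positive_rate": fpr,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy labels, purely illustrative (not data from the paper).
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
    print(metrics(confusion(y_true, y_pred)))
```

For a score-based detector, `y_pred` would be obtained by thresholding the outlier scores; the threshold itself is a choice the sketch leaves open.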



      Information & Contributors

      Information

      Published In

      Information Sciences: an International Journal  Volume 348, Issue C
      June 2016
      394 pages

      Publisher

      Elsevier Science Inc.

      United States


      Qualifiers

      • Research-article


      Cited By

      • (2024) Deep Learning-Based Anomaly Detection in Network Traffic for Cyber Threat Identification. Proceedings of the Cognitive Models and Artificial Intelligence Conference, pp. 303-309. DOI: 10.1145/3660853.3660932. Online publication date: 25-May-2024
      • (2023) Anomaly detection for fault detection in wireless community networks using machine learning. Computer Communications 202:C, pp. 191-203. DOI: 10.1016/j.comcom.2023.02.019. Online publication date: 15-Mar-2023
      • (2022) Enhancing Detection of R2L Attacks by Multistage Clustering Based Outlier Detection. Wireless Personal Communications: An International Journal 124:3, pp. 2637-2659. DOI: 10.1007/s11277-022-09482-8. Online publication date: 1-Jun-2022
      • (2022) Identification of IoT Device From Network Traffic Using Artificial Intelligence Based Capsule Networks. Wireless Personal Communications: An International Journal 123:3, pp. 2227-2243. DOI: 10.1007/s11277-021-09236-y. Online publication date: 1-Apr-2022
      • (2022) ADD: a new average divergence difference-based outlier detection method with skewed distribution of data objects. Applied Intelligence 52:5, pp. 5100-5124. DOI: 10.1007/s10489-021-02399-y. Online publication date: 1-Mar-2022
      • (2021) S-DPS. Security and Communication Networks 2021. DOI: 10.1155/2021/6629098. Online publication date: 1-Jan-2021
      • (2021) A Novel Model for Anomaly Detection in Network Traffic Based on Support Vector Machine and Clustering. Security and Communication Networks 2021. DOI: 10.1155/2021/2170788. Online publication date: 1-Jan-2021
      • (2021) Anomaly detection in a large-scale cloud platform. Proceedings of the 43rd International Conference on Software Engineering: Software Engineering in Practice, pp. 150-159. DOI: 10.1109/ICSE-SEIP52600.2021.00024. Online publication date: 25-May-2021
      • (2020) Comparative Study of Features Selection Methods. Proceedings of the 4th International Conference on Algorithms, Computing and Systems, pp. 40-44. DOI: 10.1145/3423390.3423393. Online publication date: 6-Jan-2020
      • (2020) Outlier detection based on sparse coding and neighbor entropy in high-dimensional space. Proceedings of the 17th ACM International Conference on Computing Frontiers, pp. 202-207. DOI: 10.1145/3387902.3392612. Online publication date: 11-May-2020
