research-article

Ensemble network traffic classification

Published: 09 November 2017

Abstract

Network Traffic Classification (NTC) is a key component of network monitoring, Quality-of-Service management, and network security. Machine Learning algorithms have drawn the attention of many researchers in recent years as a promising solution for network traffic classification. In Machine Learning, ensemble algorithms are classifiers formed by a set of base estimators that cooperate to build more complex models according to given training and classification strategies. The resulting models normally exhibit significant accuracy improvements over single estimators, but also incur extra time cost, which may obstruct the application of these methods to online NTC. This paper studies and compares the performance of seven popular ensemble algorithms based on Decision Trees, focusing on model accuracy, byte accuracy, and latency, to determine whether ensemble learning can be properly applied to this modeling task. We show that some of the studied algorithms outperform a single Decision Tree in terms of model accuracy and byte accuracy. However, the notable latency increase hinders the application of these methods in real-time contexts. Additionally, we introduce a novel ensemble classifier that exploits the imbalanced populations present in network traffic datasets to achieve faster classification. The experimental results show that our scheme retains the accuracy improvements of ensemble methods with only a low latency penalty, enhancing the prospects of ensemble methods for online network traffic classification.
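
The abstract names model accuracy and byte accuracy without defining them on this page. The following is a minimal sketch, assuming the definitions commonly used in traffic classification work: model (flow) accuracy is the fraction of flows labeled correctly, and byte accuracy is the fraction of total bytes carried by correctly labeled flows. The function names and the sample flows are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def flow_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of flows whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one flow is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def byte_accuracy(
    y_true: Sequence[str], y_pred: Sequence[str], flow_bytes: Sequence[int]
) -> float:
    """Fraction of total bytes that belong to correctly classified flows."""
    if not (len(y_true) == len(y_pred) == len(flow_bytes)):
        raise ValueError("all inputs must have the same length")
    total = sum(flow_bytes)
    if total <= 0:
        raise ValueError("total byte count must be positive")
    # Each flow contributes its byte volume only when it is labeled correctly.
    correct_bytes = sum(
        b for t, p, b in zip(y_true, y_pred, flow_bytes) if t == p
    )
    return correct_bytes / total


# Hypothetical sample: three small flows and one large "elephant" flow.
y_true = ["web", "web", "p2p", "dns"]
y_pred = ["web", "web", "web", "dns"]  # the large p2p flow is misclassified
flow_bytes = [1000, 2000, 96000, 1000]

print(flow_accuracy(y_true, y_pred))
print(byte_accuracy(y_true, y_pred, flow_bytes))
```

In this made-up sample, three of four flows are correct, yet the single misclassified flow carries most of the bytes, so the two metrics diverge sharply. This is why imbalanced traffic populations make it useful to report both figures.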

Published In

Computer Networks: The International Journal of Computer and Telecommunications Networking, Volume 127, Issue C
November 2017
350 pages

Publisher

Elsevier North-Holland, Inc.

United States

Qualifiers

  • Research-article

Cited By

  • (2024) HClass. Journal of High Speed Networks 30:4 (517-533). doi:10.3233/JHS-230145. Online publication date: 15-Oct-2024
  • (2023) TCGNN. Engineering Applications of Artificial Intelligence 123:PC. doi:10.1016/j.engappai.2023.106531. Online publication date: 1-Aug-2023
  • (2023) A new platform for machine-learning-based network traffic classification. Computer Communications 208:C (1-14). doi:10.1016/j.comcom.2023.05.010. Online publication date: 1-Aug-2023
  • (2022) A Survey of Network Traffic Classification Methods Using Machine Learning. Programming and Computing Software 48:7 (413-423). doi:10.1134/S0361768822070052. Online publication date: 29-Nov-2022
  • (2021) Dynamic traffic classification algorithm and simulation of energy Internet of things based on machine learning. Neural Computing and Applications 33:9 (3967-3976). doi:10.1007/s00521-020-05457-7. Online publication date: 1-May-2021
  • (2019) Exploratory study on Class Imbalance and solutions for Network Traffic Classification. Neurocomputing 343:C (100-119). doi:10.1016/j.neucom.2018.07.091. Online publication date: 28-May-2019
  • (2019) BigFlow. Future Generation Computer Systems 93:C (473-485). doi:10.1016/j.future.2018.09.051. Online publication date: 1-Apr-2019
  • (2019) Identifying IoT devices and events based on packet length from encrypted traffic. Computer Communications 144:C (8-17). doi:10.1016/j.comcom.2019.05.012. Online publication date: 15-Aug-2019
  • (2018) Efficient Distribution-Derived Features for High-Speed Encrypted Flow Classification. Proceedings of the 2018 Workshop on Network Meets AI & ML (21-27). doi:10.1145/3229543.3229548. Online publication date: 7-Aug-2018
