research-article

QLLog: A log anomaly detection method based on Q-learning algorithm

Published: 01 May 2021

Abstract

Most existing log anomaly detection methods suffer from poor scalability and numerous false positives. Moreover, they cannot rank the severity level of abnormal events. This paper proposes QLLog, a log anomaly detection method based on Q-learning that can detect multiple types of system anomalies and rank the severity level of abnormal events. First, we build a mathematical model of log anomaly detection and prove that log anomaly detection is a sequential decision problem. Second, we use the Q-learning algorithm to build the core of the anomaly detection model. This allows QLLog to automatically learn directed acyclic graph log patterns from normal execution and to adjust the training model according to the reward value. QLLog then combines the Q-learning algorithm with specially designed rules to detect anomalies when log patterns deviate from the model trained on log data from normal execution. In addition, we provide a feedback mechanism and build an abnormal level table, so that QLLog can adapt to new log states and log patterns. Experiments on real datasets show that the method can detect system anomalies quickly and effectively. Compared with the state of the art, QLLog detects numerous real problems with a high accuracy of 95%, and it scales better than other existing log-based anomaly detection methods.
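
To make the idea in the abstract concrete, the following is a minimal sketch of tabular Q-learning applied to log-key sequences. It is not the authors' implementation: the state/action encoding (previous log key as state, next log key as action), the constant reward for transitions seen in normal runs, the function names `train_q_table` and `find_anomalies`, and all parameter values are assumptions chosen for illustration. It also omits the designed rules, the feedback mechanism, and the abnormal level table that the abstract mentions.

```python
from collections import defaultdict


def train_q_table(normal_sequences, alpha=0.5, gamma=0.9, epochs=20, reward=1.0):
    """Learn Q-values for log-key transitions observed in normal execution.

    Assumption: state = current log key, action = next log key, and every
    transition seen in a normal run earns the same positive reward.
    """
    q = defaultdict(lambda: defaultdict(float))
    for _ in range(epochs):
        for seq in normal_sequences:
            for i in range(len(seq) - 1):
                state, action = seq[i], seq[i + 1]
                # After emitting `action`, the system is in state `action`;
                # use the best known value from there as the future estimate.
                future = max(q[action].values(), default=0.0)
                # Standard Q-learning update rule.
                q[state][action] += alpha * (
                    reward + gamma * future - q[state][action]
                )
    return q


def find_anomalies(q, sequence, threshold=0.1):
    """Flag transitions whose learned Q-value falls below `threshold`.

    Returns a list of (position, state, action) tuples, where position is
    the index of the unexpected log key in `sequence`.
    """
    anomalies = []
    for i in range(len(sequence) - 1):
        state, action = sequence[i], sequence[i + 1]
        value = q.get(state, {}).get(action, 0.0)
        if value < threshold:
            anomalies.append((i + 1, state, action))
    return anomalies


# Hypothetical log keys, for illustration only.
normal_runs = [["open", "read", "close"], ["open", "write", "close"]]
q_table = train_q_table(normal_runs)
normal_result = find_anomalies(q_table, ["open", "read", "close"])
abnormal_result = find_anomalies(q_table, ["open", "close", "read"])
```

In this sketch, a sequence that follows a learned pattern yields no flagged transitions, while transitions never seen during training keep a Q-value of zero and are reported. A real system would need a parser to map raw log lines to log keys, and a principled way to choose the threshold.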

Highlights

  • To the best of our knowledge, this paper is the first successful application of the Q-learning algorithm to log anomaly detection, and it achieves good detection results.
  • QLLog can detect multiple types of log anomalies, which reduces the false negative rate.
  • QLLog provides a feedback mechanism to update the detection model and the abnormal level of abnormal logs.
  • We summarize existing log anomaly detection methods and compare their advantages and disadvantages. The experimental results demonstrate the superiority of QLLog.

Information & Contributors

Information

Published In

Information Processing and Management: an International Journal, Volume 58, Issue 3
May 2021
1030 pages

Publisher

Pergamon Press, Inc.

United States

Author Tags

  1. Log anomaly detection
  2. Q-learning
  3. Reinforcement learning
  4. Data analysis

Qualifiers

  • Research-article

Cited By

  • (2024) DLLog. International Journal of Intelligent Systems, 2024. DOI: 10.1155/2024/5961993. Online publication date: 16-Apr-2024
  • (2023) PVE. Information Processing and Management: an International Journal, 60:5. DOI: 10.1016/j.ipm.2023.103476. Online publication date: 1-Sep-2023
  • (2023) Double locality sensitive hashing Bloom filter for high-dimensional streaming anomaly detection. Information Processing and Management: an International Journal, 60:3. DOI: 10.1016/j.ipm.2023.103306. Online publication date: 1-May-2023
  • (2023) SSDLog: a semi-supervised dual branch model for log anomaly detection. World Wide Web, 26:5 (3137-3153). DOI: 10.1007/s11280-023-01174-y. Online publication date: 13-Jun-2023
  • (2023) Log Drift Impact on Online Anomaly Detection Workflows. Product-Focused Software Process Improvement (267-283). DOI: 10.1007/978-3-031-49266-2_19. Online publication date: 11-Dec-2023
  • (2022) Ensemble transfer learning-based multimodal sentiment analysis using weighted convolutional neural networks. Information Processing and Management: an International Journal, 59:3. DOI: 10.1016/j.ipm.2022.102929. Online publication date: 3-Jun-2022
  • (2022) Robust log anomaly detection based on contrastive learning and multi-scale MASS. The Journal of Supercomputing, 78:16 (17491-17512). DOI: 10.1007/s11227-022-04508-1. Online publication date: 1-Nov-2022
