research-article

AllInfoLog: Robust Diverse Anomalies Detection Based on All Log Features

Authors:

Shuyuan Jin

IEEE Transactions on Network and Service Management, Volume 20, Issue 3

Pages 2529 - 2543

https://doi.org/10.1109/TNSM.2022.3224974

Published: 01 September 2023

Abstract

Large-scale services generate massive logs that trace runtime states and critical events. Log-based anomaly detection is critical for service maintenance and reliability assurance. Existing log-based anomaly detection methods use only limited information from log data, which leaves them unable to detect diverse anomalies related to the log features they ignore. In this paper, we propose AllInfoLog, a robust log-based anomaly detection method that takes advantage of all log information to detect diverse types of anomalies. To capture all log features, AllInfoLog uses four encoders to extract semantic, parameter, time, and other feature embeddings, respectively. The embeddings of all log features are then combined to train an attention-based Bi-LSTM model that detects diverse anomalies. Experimental evaluations on real-world log datasets, synthetic datasets, and unstable log datasets demonstrate that AllInfoLog outperforms state-of-the-art log-based anomaly detection methods in both performance and robustness, and that it is effective at detecting diverse types of anomalies.
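
The data flow described in the abstract (four per-event feature embeddings, combined, then attended over a log sequence) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the toy vectors, and the choice of concatenation as the way to "combine" embeddings are all assumptions made for illustration. The Bi-LSTM layer is omitted entirely, so the attention here runs directly over the combined vectors, and the query vector stands in for parameters that would normally be learned.

```python
import math
from typing import List, Sequence

Vector = List[float]


def combine_features(semantic_emb: Vector, parameter_emb: Vector,
                     time_emb: Vector, other_emb: Vector) -> Vector:
    """Concatenate the four per-event embeddings (assumed combination rule)."""
    return [*semantic_emb, *parameter_emb, *time_emb, *other_emb]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(states: Sequence[Vector], query: Vector) -> Vector:
    """Dot-product attention: weight each step by softmax(state . query), then sum."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    return [sum(w * state[i] for w, state in zip(weights, states))
            for i in range(len(states[0]))]


# Toy log sequence of two events; every number below is made up.
events = [
    combine_features([0.1, 0.3], [1.0], [0.5], [0.0]),
    combine_features([0.2, 0.1], [4.0], [0.2], [1.0]),
]

# Hypothetical query vector; an all-zero query yields uniform attention weights.
query = [0.0] * len(events[0])
sequence_vector = attention_pool(events, query)
```

In a full model, the combined vectors would first pass through a bidirectional LSTM, and `attention_pool` would operate on its hidden states before a classifier labels the sequence as normal or anomalous.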

Cited By

R. Xiao, W. Li, J. Lu, and S. Jin, “ContexLog: Non-Parsing Log Anomaly Detection With All Information Preservation and Enhanced Contextual Representation,” IEEE Transactions on Network and Service Management, vol. 21, no. 4, pp. 4750–4762, 2024. Online publication date: 13 May 2024.
https://dl.acm.org/doi/10.1109/TNSM.2024.3400283

ISSN: 1932-4537

1932-4537 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Publisher

IEEE Press
