[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3650212.3652123acmconferencesArticle/Chapter ViewAbstractPublication PagesisstaConference Proceedingsconference-collections
research-article
Open access

A Large-Scale Evaluation for Log Parsing Techniques: How Far Are We?

Published: 11 September 2024 Publication History

Abstract

Log data have facilitated various tasks of software development and maintenance, such as testing, debugging and diagnosing. Due to the unstructured nature of logs, log parsing is typically required to transform log messages into structured data for automated log analysis. Given the abundance of log parsers that employ various techniques, evaluating these tools to comprehend their characteristics and performance becomes imperative. Loghub serves as a commonly used dataset for benchmarking log parsers, but it suffers from limited scale and representativeness, posing significant challenges for studies to comprehensively evaluate existing log parsers or develop new methods. This limitation is particularly pronounced when assessing these log parsers for production use. To address these limitations, we provide a new collection of annotated log datasets, denoted Loghub-2.0, which can better reflect the characteristics of log data in real-world software systems. Loghub-2.0 comprises 14 datasets with an average of 3.6 million log lines in each dataset. Based on Loghub-2.0, we conduct a thorough re-evaluation of 15 state-of-the-art log parsers in a more rigorous and practical setting. Particularly, we introduce a new evaluation metric to mitigate the sensitivity of existing metrics to imbalanced data distributions. We are also the first to investigate the granular performance of log parsers on logs that represent rare system events, offering in-depth details for software diagnosis. Accurately parsing such logs is essential, yet it remains a challenge. We believe this work could shed light on the evaluation and design of log parsers in practical settings, thereby facilitating their deployment in production systems.

References

[1]
2023. The replication repository of our evaluation artifacts. https://github.com/logpai/Loghub-2.0 [Online; accessed 1 Dec 2023]
[2]
2023. Scipy. https://scipy.org/ [Online; accessed 1 July 2023]
[3]
Anunay Amar and Peter C Rigby. 2019. Mining historical test logs to predict bugs and localize faults in the test logs. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). 140–151.
[4]
James H Andrews. 1998. Testing using log file analysis: tools, methods, and issues. In Proceedings 13th IEEE International Conference on Automated Software Engineering (Cat. No. 98EX239). 157–166.
[5]
Vincent Bushong, Russell Sanders, Jacob Curtis, Mark Du, Tomas Cerny, Karel Frajtak, Miroslav Bures, Pavel Tisnovsky, and Dongwan Shin. 2020. On matching log analysis to source code: A systematic mapping study. In Proceedings of the International Conference on Research in Adaptive and Convergent Systems. 181–187.
[6]
An Ran Chen, Tse-Hsun Chen, and Shaowei Wang. 2021. Pathidea: Improving information retrieval-based bug localization by re-constructing execution paths using logs. IEEE Transactions on Software Engineering (TSE), 48, 8 (2021), 2905–2919.
[7]
Boyuan Chen, Jian Song, Peng Xu, Xing Hu, and Zhen Ming Jiang. 2018. An automated approach to estimating code coverage measures via execution logs. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. 305–316.
[8]
Zhichao Chen, Junjie Chen, Weijing Wang, Jianyi Zhou, Meng Wang, Xiang Chen, Shan Zhou, and Jianmin Wang. 2023. Exploring better black-Box test case prioritization via log analysis. ACM Transactions on Software Engineering and Methodology, 32, 3 (2023), 1–32.
[9]
Hetong Dai, Heng Li, Che-Shao Chen, Weiyi Shang, and Tse-Hsun Chen. 2020. Logram: Efficient Log Parsing Using n n-Gram Dictionaries. IEEE Transactions on Software Engineering (TSE), 48, 3 (2020), 879–892.
[10]
Hetong Dai, Yiming Tang, Heng Li, and Weiyi Shang. 2023. PILAR: Studying and Mitigating the Influence of Configurations on Log Parsing. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). 818–829.
[11]
Min Du and Feifei Li. 2016. Spell: Streaming parsing of system event logs. In 2016 IEEE 16th International Conference on Data Mining (ICDM). 859–864.
[12]
Min Du, Feifei Li, Guineng Zheng, and Vivek Srikumar. 2017. Deeplog: Anomaly detection and diagnosis from system logs through deep learning. In Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. 1285–1298.
[13]
Qiang Fu, Jian-Guang Lou, Yi Wang, and Jiang Li. 2009. Execution anomaly detection in distributed systems through unstructured log analysis. In 2009 ninth IEEE international conference on data mining (ICDM). 149–158.
[14]
Qiang Fu, Jieming Zhu, Wenlu Hu, Jian-Guang Lou, Rui Ding, Qingwei Lin, Dongmei Zhang, and Tao Xie. 2014. Where do developers log? an empirical study on logging practices in industry. In Companion Proceedings of the 36th International Conference on Software Engineering. 24–33.
[15]
Ying Fu, Meng Yan, Jian Xu, Jianguo Li, Zhongxin Liu, Xiaohong Zhang, and Dan Yang. 2022. Investigating and improving log parsing in practice. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE). 1566–1577.
[16]
Hossein Hamooni, Biplob Debnath, Jianwu Xu, Hui Zhang, Guofei Jiang, and Abdullah Mueen. 2016. Logmine: Fast pattern recognition for log analytics. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (CIKM). 1573–1582.
[17]
Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R Lyu. 2017. Drain: An online log parsing approach with fixed depth tree. In 2017 IEEE international conference on web services (ICWS). 33–40.
[18]
Shilin He, Pinjia He, Zhuangbin Chen, Tianyi Yang, Yuxin Su, and Michael R Lyu. 2021. A survey on automated log analysis for reliability engineering. ACM computing surveys (CSUR), 54, 6 (2021), 1–37.
[19]
Shilin He, Xu Zhang, Pinjia He, Yong Xu, Liqun Li, Yu Kang, Minghua Ma, Yining Wei, Yingnong Dang, and Saravanakumar Rajmohan. 2022. An empirical study of log analysis at Microsoft. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE). 1465–1476.
[20]
Shilin He, Jieming Zhu, Pinjia He, and Michael R Lyu. 2020. Loghub: A large collection of system log datasets towards automated log analytics. arXiv preprint arXiv:2008.06448.
[21]
Yintong Huo, Yuxin Su, Cheryl Lee, and Michael R Lyu. 2021. Semparser: A semantic parser for log analysis. arXiv preprint arXiv:2112.12636.
[22]
Tong Jia, Lin Yang, Pengfei Chen, Ying Li, Fanjing Meng, and Jingmin Xu. 2017. Logsed: Anomaly diagnosis through mining time-weighted control flow graph in logs. In 2017 IEEE 10th International Conference on Cloud Computing (CLOUD). 447–455.
[23]
Zhihan Jiang, Jinyang Liu, Zhuangbin Chen, Yichen Li, Junjie Huang, Yintong Huo, Pinjia He, Jiazhen Gu, and Michael R Lyu. 2023. Llmparser: A llm-based log parsing framework. arXiv preprint arXiv:2310.01796.
[24]
Zhen Ming Jiang, Ahmed E Hassan, Parminder Flora, and Gilbert Hamann. 2008. Abstracting execution logs to execution events for enterprise applications (short paper). In 2008 The Eighth International Conference on Quality Software. 181–186.
[25]
Zanis Ali Khan, Donghwan Shin, Domenico Bianculli, and Lionel Briand. 2022. Guidelines for assessing the accuracy of log message template identification techniques. In Proceedings of the 44th International Conference on Software Engineering (ICSE). 1095–1106.
[26]
Zanis Ali Khan, Donghwan Shin, Domenico Bianculli, and Lionel Briand. 2023. Impact of Log Parsing on Log-based Anomaly Detection. arXiv preprint arXiv:2305.15897.
[27]
Van-Hoang Le and Hongyu Zhang. 2022. Log-based anomaly detection with deep learning: How far are we? In Proceedings of the 44th international conference on software engineering (ICSE). 1356–1367.
[28]
Van-Hoang Le and Hongyu Zhang. 2023. Log Parsing with Prompt-based Few-shot Learning. arXiv preprint arXiv:2302.07435.
[29]
Yichen Li, Yintong Huo, Zhihan Jiang, Renyi Zhong, Pinjia He, Yuxin Su, and Michael R Lyu. 2023. Exploring the Effectiveness of LLMs in Automated Logging Generation: An Empirical Study. arXiv preprint arXiv:2307.05950.
[30]
Yichen Li, Yintong Huo, Renyi Zhong, Zhihan Jiang, Jinyang Liu, Junjie Huang, Jiazhen Gu, Pinjia He, and Michael R Lyu. 2024. Go Static: Contextualized Logging Statement Generation. arXiv preprint arXiv:2402.12958.
[31]
Zhenhao Li, Chuan Luo, Tse-Hsun Chen, Weiyi Shang, Shilin He, Qingwei Lin, and Dongmei Zhang. 2023. Did We Miss Something Important? Studying and Exploring Variable-Aware Log Abstraction. In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE).
[32]
Jinyang Liu, Junjie Huang, Yintong Huo, Zhihan Jiang, Jiazhen Gu, Zhuangbin Chen, Cong Feng, Minzhi Yan, and Michael R Lyu. 2023. Scalable and Adaptive Log-based Anomaly Detection with Expert in the Loop. arXiv preprint arXiv:2306.05032.
[33]
Jiahao Liu, Jun Zeng, Xiang Wang, Kaihang Ji, and Zhenkai Liang. 2022. Tell: log level suggestions via modeling multi-level code block information. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA). 27–38.
[34]
Jinyang Liu, Jieming Zhu, Shilin He, Pinjia He, Zibin Zheng, and Michael R Lyu. 2019. Logzip: Extracting hidden structures via iterative clustering for log compression. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). 863–873.
[35]
Yudong Liu, Xu Zhang, Shilin He, Hongyu Zhang, Liqun Li, Yu Kang, Yong Xu, Minghua Ma, Qingwei Lin, and Yingnong Dang. 2022. Uniparser: A unified log parser for heterogeneous log data. In Proceedings of the ACM Web Conference 2022 (WWW). 1893–1901.
[36]
Steven Locke, Heng Li, Tse-Hsun Peter Chen, Weiyi Shang, and Wei Liu. 2021. LogAssist: Assisting log analysis through log summarization. IEEE Transactions on Software Engineering (TSE), 48, 9 (2021), 3227–3241.
[37]
Junchen Ma, Yang Liu, Hongjie Wan, and Guozi Sun. 2023. Automatic Parsing and Utilization of System Log Features in Log Analysis: A Survey. Applied Sciences, 13, 8 (2023), 4930.
[38]
Shiqing Ma, Juan Zhai, Yonghwi Kwon, Kyu Hyung Lee, Xiangyu Zhang, Gabriela Ciocarlie, Ashish Gehani, Vinod Yegneswaran, Dongyan Xu, and Somesh Jha. 2018. $Kernel-Supported$$Cost-Effective$ Audit Logging for Causality Tracking. In 2018 USENIX Annual Technical Conference (USENIX ATC). 241–254.
[39]
Adetokunbo AO Makanju, A Nur Zincir-Heywood, and Evangelos E Milios. 2009. Clustering event logs using iterative partitioning. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD). 1255–1264.
[40]
Salma Messaoudi, Annibale Panichella, Domenico Bianculli, Lionel Briand, and Raimondas Sasnauskas. 2018. A search-based approach for accurate identification of log message formats. In Proceedings of the 26th Conference on Program Comprehension. 167–177.
[41]
Salma Messaoudi, Donghwan Shin, Annibale Panichella, Domenico Bianculli, and Lionel C Briand. 2021. Log-based slicing for system-level test cases. In Proceedings of the 30th ACM SIGSOFT international symposium on software testing and analysis (ISSTA). 517–528.
[42]
Masayoshi Mizutani. 2013. Incremental mining of system log format. In 2013 IEEE International Conference on Services Computing. 595–602.
[43]
Meiyappan Nagappan and Mladen A Vouk. 2010. Abstracting log lines to log event types for mining software system logs. In 2010 7th IEEE Working Conference on Mining Software Repositories (MSR). 114–117.
[44]
Meiyappan Nagappan, Kesheng Wu, and Mladen A Vouk. 2009. Efficiently extracting operational profiles from execution logs using suffix arrays. In 2009 20th International Symposium on Software Reliability Engineering. 41–50.
[45]
Karthik Nagaraj, Charles Killian, and Jennifer Neville. 2012. Structured comparative analysis of systems logs to diagnose performance problems. In 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI 12). 353–366.
[46]
Paolo Notaro, Soroush Haeri, Jorge Cardoso, and Michael Gerndt. 2023. LogRule: Efficient Structured Log Mining for Root Cause Analysis. IEEE Transactions on Network and Service Management.
[47]
Antonio Pecchia, Marcello Cinque, Gabriella Carrozza, and Domenico Cotroneo. 2015. Industry practices and event logging: Assessment of a critical software development process. In 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering (ICSE). 2, 169–178.
[48]
Daan Schipper, Maurício Aniche, and Arie van Deursen. 2019. Tracing back log data to its log statement: from research to practice. In 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR). 545–549.
[49]
Issam Sedki, Abdelwahab Hamou-Lhadj, Otmane Ait-Mohamed, and Naser Ezzati-Jivan. 2023. Towards a Classification of Log Parsing Errors. In 2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC). 84–88.
[50]
Weiyi Shang. 2012. Bridging the divide between software developers and operators using logs. In 2012 34th international conference on software engineering (ICSE). 1583–1586.
[51]
Keiichi Shima. 2016. Length matters: Clustering system log messages using length of words. arXiv preprint arXiv:1611.03213.
[52]
Liang Tang, Tao Li, and Chang-Shing Perng. 2011. LogSig: Generating system events from raw textual logs. In Proceedings of the 20th ACM international conference on Information and knowledge management (CIKM). 785–794.
[53]
Risto Vaarandi. 2003. A data clustering algorithm for mining patterns from event logs. In Proceedings of the 3rd IEEE Workshop on IP Operations & Management (IPOM)(IEEE Cat. No. 03EX764). 119–126.
[54]
Risto Vaarandi and Mauno Pihelgas. 2015. Logcluster-a data clustering and pattern mining algorithm for event logs. In 2015 11th International conference on network and service management (CNSM). 1–7.
[55]
Xuheng Wang, Xu Zhang, Liqun Li, Shilin He, Hongyu Zhang, Yudong Liu, Lingling Zheng, Yu Kang, Qingwei Lin, and Yingnong Dang. 2022. SPINE: a scalable log parser with feedback guidance. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE). 1198–1208.
[56]
Kundi Yao, Mohammed Sayagh, Weiyi Shang, and Ahmed E Hassan. 2021. Improving state-of-the-art compression techniques for log management tools. IEEE Transactions on Software Engineering (TSE), 48, 8 (2021), 2748–2760.
[57]
Siyu Yu, Ningjiang Chen, Yifan Wu, and Wensheng Dou. 2023. Self-supervised log parsing using semantic contribution difference. Journal of Systems and Software, 200 (2023), 111646.
[58]
Siyu Yu, Pinjia He, Ningjiang Chen, and Yifan Wu. 2023. Brain: Log Parsing with Bidirectional Parallel Tree. IEEE Transactions on Services Computing (TSC).
[59]
Ding Yuan, Haohui Mai, Weiwei Xiong, Lin Tan, Yuanyuan Zhou, and Shankar Pasupathy. 2010. Sherlog: error diagnosis by connecting clues from run-time logs. In Proceedings of the fifteenth International Conference on Architectural support for programming languages and operating systems. 143–154.
[60]
Tianzhu Zhang, Han Qiu, Gabriele Castellano, Myriana Rifai, Chung Shue Chen, and Fabio Pianese. 2023. System Log Parsing: A Survey. IEEE Transactions on Knowledge and Data Engineering (TKDE).
[61]
Chen Zhi, Jianwei Yin, Shuiguang Deng, Maoxin Ye, Min Fu, and Tao Xie. 2019. An exploratory study of logging configuration practice in java. In 2019 IEEE international conference on software maintenance and evolution (ICSME). 459–469.
[62]
Jieming Zhu, Shilin He, Pinjia He, Jinyang Liu, and Michael R Lyu. 2023. Loghub: A large collection of system log datasets for ai-driven log analytics. In 2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE). 355–366.
[63]
Jieming Zhu, Shilin He, Jinyang Liu, Pinjia He, Qi Xie, Zibin Zheng, and Michael R Lyu. 2019. Tools and benchmarks for automated log parsing. In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). 121–130.

Cited By

View all
  • (2024)IPLog: An Efficient Log Parsing Method Based on Few-Shot LearningElectronics10.3390/electronics1316332413:16(3324)Online publication date: 21-Aug-2024
  • (2024)Large language model-based optical network log analysis using LLaMA2 with instruction tuningJournal of Optical Communications and Networking10.1364/JOCN.52787416:11(1116)Online publication date: 22-Oct-2024
  • (2024)Demonstration-Free: Towards More Practical Log Parsing with Large Language ModelsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3694994(153-165)Online publication date: 27-Oct-2024
  • Show More Cited By

Index Terms

  1. A Large-Scale Evaluation for Log Parsing Techniques: How Far Are We?

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ISSTA 2024: Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis
    September 2024
    1928 pages
    ISBN:9798400706127
    DOI:10.1145/3650212
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 September 2024

    Permissions

    Request permissions for this article.

    Check for updates

    Badges

    Author Tags

    1. benchmark
    2. empirical study
    3. log analysis
    4. log parsing

    Qualifiers

    • Research-article

    Conference

    ISSTA '24
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 58 of 213 submissions, 27%

    Upcoming Conference

    ISSTA '25

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)382
    • Downloads (Last 6 weeks)156
    Reflects downloads up to 19 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)IPLog: An Efficient Log Parsing Method Based on Few-Shot LearningElectronics10.3390/electronics1316332413:16(3324)Online publication date: 21-Aug-2024
    • (2024)Large language model-based optical network log analysis using LLaMA2 with instruction tuningJournal of Optical Communications and Networking10.1364/JOCN.52787416:11(1116)Online publication date: 22-Oct-2024
    • (2024)Demonstration-Free: Towards More Practical Log Parsing with Large Language ModelsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3694994(153-165)Online publication date: 27-Oct-2024
    • (2024)Demystifying and Extracting Fault-indicating Information from Logs for Failure Diagnosis2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE)10.1109/ISSRE62328.2024.00055(511-522)Online publication date: 28-Oct-2024

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media