Abstract
Software logging is the practice of recording events and activities that occur within a software system; the resulting logs are useful for activities such as failure prediction and anomaly detection. While previous research focused on improving different aspects of logging practices, the goal of this paper is to conduct a systematic literature review of that research and to identify the challenges that practitioners currently face in software logging practices. We focus on the logging practices that cover the steps from the instrumentation of a software system, through the storage of logs, up to the preprocessing steps that prepare log data for follow-up analysis. Our study covers a systematic literature review (SLR) of 204 research papers, and a quantitative and a qualitative analysis of 20,766 and 149 StackOverflow (SO) questions, respectively. We observe that 53% of the studies focus on improving the techniques that preprocess logs for analysis (e.g., log parsing, log clustering and log mining), 37% focus on how to create new log statements, and 10% focus on how to improve log file storage. Our analysis of SO topics reveals that five out of seven identified high-level topics are not covered by the literature. These topics relate to the dependency configuration of logging libraries, infrastructure-related configuration, scattered logging, context-dependent usage of logs, and the handling of log files.
Data Availability
Our replication package is available at: https://github.com/MABATOUN/SLRReplicationPackage.git.
Notes
Open-source logging library for Java applications: https://github.com/apache/logging-log4j2
A search engine developed in Java, allowing an easy search and analysis of large amounts of data
An open server-side data processing pipeline that ingests, transforms and sends data from various sources to a certain destination
A data visualization dashboard software for Elasticsearch
Closed card sort is a categorization approach where participants are provided with pre-defined categories and asked to sort items into such categories.
Open card sort is a categorization approach where the participants are free to add their own categories.
Open-source log capture tool that provides an analysis solution for operational intelligence
Command-line tool that dumps a log of system messages, including messages written from an Android application
The degree-of-interest (DOI) model was proposed by Kersten and Murphy (2005) to measure the degree of developers’ interest in program elements
PoC: a sample code that can be used to exploit a specific vulnerability, usually created by security researchers or ethical hackers to illustrate how a vulnerability can be exploited and to demonstrate the impact of such an exploit.
Log statements can be guarded by conditional statements, known as logging guards, to ensure they are only executed when necessary (Zhi et al. 2022)
Unsupervised data clustering algorithm that operates by iteratively refining the clustering results
Blockchain is tamper-proof and decentralized, which allows enterprises to execute business processes with privacy and security
GDPR: a data protection and privacy regulation implemented by the European Union (EU)
Data protection technique that involves replacing or encrypting personal data with pseudonyms or pseudonym identifiers
Elasticsearch, Logstash and Kibana
NSGA-II: Non-dominated Sorting Genetic Algorithm, a well-known and efficient technique to solve problems with multiple objectives
ILP is a subfield of AI which uses logic programming to induce logical rules from sets of examples.
In the context of natural language processing or text analysis, the LCS approach is often employed to measure the similarity or dissimilarity between two texts or strings
Algorithm, inspired by the behavior of honey bees, that is used to solve optimization problems
N-gram models are statistical language models used to analyze the patterns and relationships within textual data
Graph-based ranking algorithm used for automated text summarization and keyword extraction
Programming model and framework designed to process and analyze large volumes of data in a parallel and distributed manner
Collection of binary values where each bit represents whether a log chunk contains logs of a certain format
References
Abbasli N, Ganiz MC (2021) Log and execution trace analytics system. In: Proceedings of the 2021 international conference on innovations in intelligent systems and applications (INISTA), pp 1–7
Agrawal A, Dixit A, Shettar NA, Kapadia D, Agrawal V, Gupta R, Karlupia R (2019) Delog: A high-performance privacy preserving log filtering framework. In: Proceedings of the 2019 IEEE international conference on big data (Big Data), pp 1739–1748
Agrawal A, Karlupia R, Gupta R (2019) Logan: A distributed online log parser. In: Proceedings of the 2019 IEEE international conference on data engineering (ICDE), pp 1946–1951
Amar H, Bao L, Busany N, Lo D, Maoz S (2018) Using finite-state models for log differencing. In: Proceedings of the 2018 ACM joint meeting on european software engineering conference and symposium on the foundations of software engineering, pp 49–59
Anu H, Chen J, Shi W, Hou J, Liang B, Qin B (2019) An approach to recommendation of verbosity log levels based on logging intention. In: Proceedings of the 2019 IEEE international conference on software maintenance and evolution (ICSME), pp 125–134
Aslan U, Şen B (2021) Gdpr compliant audit log management system with blockchain. In: Proceedings of the 2021 Turkish national software engineering symposium (UYMS), pp 1–3
Aussel N, Petetin Y, Chabridon S (2018) Improving performances of log mining for anomaly prediction through nlp-based log parsing. In: Proceedings of the 2018 IEEE international symposium on modeling, analysis, and simulation of computer and telecommunication systems (MASCOTS), pp 237–243
Baccanico F, Carrozza G, Cinque M, Cotroneo D, Pecchia A, Savignano A (2014) Event logging in an industrial development process: Practices and reengineering challenges. In: Proceedings of the 2014 international symposium on software reliability engineering workshops, pp 10–13
Baccanico F, Carrozza G, Cinque M, Cotroneo D, Pecchia A, Savignano A (2014) Tell: Log level suggestions via modeling multi-level code block information. In: Proceedings of the 2014 international symposium on software reliability engineering workshops, pp 10–13
Bai Y, Chi Y, Zhao D (2023) Patcluster: A top-down log parsing method based on frequent words. IEEE Access 8275–8282
Bao L, Busany N, Lo D, Maoz S (2019) Statistical log differencing. In: Proceedings of the 2019 IEEE/ACM international conference on automated software engineering (ASE), pp 851–862
Barua A, Thomas SW, Hassan AE (2014) What are developers talking about? an analysis of topics and trends in stack overflow. Empir Softw Eng 619–654
Bhosale V, Thakar A, Pandit C, Deshpande A, Khanuja H (2018) Hadoop in action: Building a generic log analyzing system. In: Proceedings of the 2018 international conference on computing communication control and automation (ICCUBEA), pp 1–7
Bodik P, Goldszmidt M, Fox A, Woodard DB, Andersen H (2010) Fingerprinting the datacenter: Automated classification of performance crises. In: Proceedings of the 2010 european conference on computer systems, pp 111–124
Bosch N, Bosch J (2020) Software logs for machine learning in a devops environment. In: Proceedings of the 2020 euromicro conference on software engineering and advanced applications (SEAA), pp 29–33
Bunker J, Curtis K, Girolami M, Sriharsha R (2022) A mixture modeling approach for clustering log files with coreset and user feedback. Pattern Recognit Lett 74–80
Bushong V, Sanders R, Curtis J, Du M, Cerny T, Frajtak K, Bures M, Tisnovsky P, Shin D (2020) On matching log analysis to source code: A systematic mapping study. In: Proceedings of the 2020 international conference on research in adaptive and convergent systems, pp 181–187
Cândido J, Haesen J, Aniche M, van Deursen A (2021) An exploratory study of log placement recommendation in an enterprise system. In: Proceedings of the 2021 IEEE/ACM international conference on mining software repositories (MSR), pp 143–154
Chen AR, Chen TH, Wang S (2021) Demystifying the challenges and benefits of analyzing user-reported logs in bug reports. Empir Softw Eng 1–30
Chen TH, Thomas SW, Hassan AE (2016) A survey on the use of topic models when mining software repositories. Empir Softw Eng 1843–1919
Chen B, Jiang ZM (2017) Characterizing and detecting anti-patterns in the logging code. In: Proceedings of the 2017 IEEE/ACM international conference on software engineering (ICSE), pp 71–81
Chen B, Jiang ZM (2017) Characterizing logging practices in java-based open source software projects – a replication study in apache software foundation. Empir Softw Eng 330–374
Chen B, Jiang ZM (2019) Extracting and studying the logging-code-issue-introducing changes in java-based large-scale open source software systems. Empir Softw Eng 2285–2322
Chen B, Jiang ZM (2020) Studying the use of java logging utilities in the wild. In: Proceedings of the 2020 IEEE/ACM international conference on software engineering (ICSE), pp 397–408
Chen B, Jiang ZM (2021) A survey of software log instrumentation. ACM Comput Surv 1–34
Chen J, Wang P, Qiao F, Du SQ, Wang W (2022) Plq: An efficient approach to processing pattern-based log queries. J Comput Sci Technol 1239–1254
Chen M, Zheng AX, Lloyd J, Jordan MI, Brewer E (2004) Failure diagnosis using decision trees. In: Proceedings of the 2004 international conference on autonomic computing, pp 36–43
Chi S, Li S, Guo Y, Dong W, Jia Z, He H, Liao Q (2018) Notonlylog: Mining patch-log associations from software evolution history to enhance failure diagnosis capability. In: Proceedings of the 2018 asia-pacific software engineering conference (APSEC), pp 189–198
Chowdhury S, Di Nardo S, Hindle A, Jiang ZM (2018) An exploratory study on assessing the energy impact of logging on android applications. Empir Softw Eng 1422–1456
Chunyong Z, Meng X (2020) Log parser with one-to-one markup. In: Proceedings of the 2020 international conference on information and computer technologies (ICICT), pp 251–257
Chu G, Wang J, Qi Q, Sun H, Tao S, Liao J (2021) Prefix-graph: A versatile log parsing approach merging prefix tree with probabilistic graph. In: Proceedings of the 2021 IEEE international conference on data engineering (ICDE), pp 2411–2422
Copstein R, Schwartzentruber J, Zincir-Heywood N, Heywood M (2021) Log abstraction for information security: Heuristics and reproducibility. In: Proceedings of the 2021 international conference on availability, reliability and security, pp 1–10
Coustié O, Mothe J, Teste O, Baril X (2020) Meting: A robust log parser based on frequent n-gram mining. In: Proceedings of the 2020 IEEE international conference on web services (ICWS), pp 84–88
Dai H, Li H, Chen CS, Shang W, Chen TH (2020) Logram: Efficient log parsing using n-gram dictionaries. IEEE Trans Softw Eng
Dai S, Luan Z, Huang S, Fung C, Wang H, Yang H, Qian D (2022) Reval: Recommend which variables to log with pre-trained model and graph neural network. IEEE Trans Netw Serv Manag
Decker L, Leite D, Bonacorsi D (2022) Explainable log parsing and online interval granular classification from streams of words. In: Proceedings of the 2022 IEEE international conference on fuzzy systems (FUZZ-IEEE), pp 1–8
Di S, Gupta R, Snir M, Pershey E, Cappello F (2017) Logaider: A tool for mining potential correlations of hpc log events. In: Proceedings of the 2017 IEEE/ACM international symposium on cluster, cloud and grid computing (CCGRID), pp 442–451
Ding Z, Li H, Shang W (2022) Logentext: Automatically generating logging texts using neural machine translation. In: Proceedings of the 2022 IEEE international conference on software analysis, evolution and reengineering (SANER), pp 349–360
Duan X, Ying S, Cheng H, Yuan W, Yin X (2021) Oilog: An online incremental log keyword extraction approach based on mdp-lstm neural network. Inf Syst 101618
Du M, Li F (2016) Spell: Streaming parsing of system event logs. In: Proceedings of the 2016 IEEE international conference on data mining (ICDM), pp 859–864
Du M, Li F (2018) Spell: Online streaming parsing of large unstructured system logs. IEEE Trans Knowl Data Eng 2213–2227
Du M, Li F, Zheng G, Srikumar V (2017) Deeplog: Anomaly detection and diagnosis from system logs through deep learning. In: Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, pp 1285–1298
Dusane P, Sujatha G (2021) Logea: Log extraction and analysis tool to support forensic investigation of linux-based system. In: Proceedings of the 2021 international conference on trends in electronics and informatics (ICOEI), pp 909–916
Egersdoerfer C, Zhang D, Dai D (2022) Clusterlog: Clustering logs for effective log-based anomaly detection. In: Proceedings of the 2022 IEEE/ACM workshop on fault tolerance for HPC at eXtreme Scale (FTXS), pp 1–10
Ekelhart A, Ekaputra FJ, Kiesling E (2021) The slogert framework for automated log knowledge graph construction. In: Proceedings of the 2021 international conference on the semantic web, pp 631–646
El-Masri D, Petrillo F, Guéhéneuc YG, Hamou-Lhadj A, Bouziane A (2020) A systematic literature review on automated log abstraction techniques. Inf Softw Technol 106276
Fang L, Di X, Liu X, Qin Y, Ren W, Ding Q (2021) Quicklogs: A quick log parsing algorithm based on template similarity. In: Proceedings of the 2021 IEEE international conference on trust, security and privacy in computing and communications (TrustCom), pp 1085–1092
Fei P, Li Z, Wang Z, Yu X, Li D, Jee K (2021) Seal: Storage-efficient causality analysis on enterprise logs with query-friendly compression. In: Proceedings of the 2021 USENIX security symposium, pp 2987–3004
Feng B, Wu C, Li J (2016) Mlc: An efficient multi-level log compression method for cloud backup systems. In: Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, pp 1358–1365
Fu Y, Yan M, Xu J, Li J, Liu Z, Zhang X, Yang D (2022) Investigating and improving log parsing in practice. In: Proceedings of the 2022 ACM joint european software engineering conference and symposium on the foundations of software engineering, pp 1566–1577
Fu Y, Yan M, Xu Z, Xia X, Zhang X, Yang D (2023) An empirical study of the impact of log parsers on the performance of log-based anomaly detection. Empir Softw Eng 1–39
Fu Q, Zhu J, Hu W, Lou JG, Ding R, Lin Q, Zhang D, Xie T (2014) Where do developers log? an empirical study on logging practices in industry. In: Proceedings of the 2014 international conference on software engineering, pp 24–33
Gholamian S (2021) Leveraging code clones and natural language processing for log statement prediction. In: Proceedings of the 2021 IEEE/ACM international conference on automated software engineering (ASE), pp 1043–1047
Gholamian S, Ward PA (2020) Logging statements’ prediction based on source code clones. In: Proceedings of the 2020 annual ACM symposium on applied computing, pp 82–91
Gujral H, Lal S, Li H (2020) An exploratory semantic analysis of logging questions. J Softw Evol Process e2361
Gujral H, Sharma A, Lal S, Kaur A, Kumar A, Sureka A (2018) Empirical analysis of the logging questions on the stack overflow website. In: Proceedings of the 2018 conference on software engineering & data sciences (CoSEDS)
Gujral H, Sharma A, Lal S, Kumar L (2019) A three dimensional empirical study of logging questions from six popular q & a websites. E-Informatica Softw Eng J 105–139
Guo S, Liu Z, Chen W, Li T (2019) Event extraction from streaming system logs. In: Proceedings of the 2019 information science and applications (ICISA), pp 465–474
Hamooni H, Debnath B, Xu J, Zhang H, Jiang G, Mueen A (2016) Logmine: Fast pattern recognition for log analytics. In: Proceedings of the 2016 ACM international on conference on information and knowledge management, pp 1573–1582
Harty J, Zhang H, Wei L, Pascarella L, Aniche M, Shang W (2021) Logging practices with mobile analytics: An empirical study on firebase. In: Proceedings of the 2021 IEEE/ACM international conference on mobile software engineering and systems (MobileSoft), pp 56–60
Harutyunyan AN, Poghosyan AV, Grigoryan NM, Hovhannisyan NA, Kushmerick N (2019) On machine learning approaches for automated log management. J Univers Comput Sci 925–945
Hashemi S, Mäntylä M (2022) Sialog: Detecting anomalies in software execution logs using the siamese network. Autom Softw Eng 61
Hassani M, Shang W, Shihab E, Tsantalis N (2018) Studying and detecting log-related issues. Empir Softw Eng 3248–3280
He P (2017) An end-to-end log management framework for distributed systems. In: Proceedings of the 2017 IEEE symposium on reliable distributed systems (SRDS), pp 266–267
He P, Chen Z, He S, Lyu MR (2018) Characterizing the natural language descriptions in software logging statements. In: Proceedings of the 2018 IEEE/ACM international conference on automated software engineering (ASE), pp 178–189
He S, He P, Chen Z, Yang T, Su Y, Lyu MR (2021) A survey on automated log analysis for reliability engineering. ACM Comput Surv (CSUR) 1–37
He S, Lin Q, Lou JG, Zhang H, Lyu MR, Zhang D (2018) Identifying impactful service system problems via log analysis. In: Proceedings of the 2018 ACM joint meeting on european software engineering conference and symposium on the foundations of software engineering, pp 60–70
He S, Zhang X, He P, Xu Y, Li L, Kang Y, Ma M, Wei Y, Dang Y, Rajmohan, et al (2022) An empirical study of log analysis at microsoft. In: Proceedings of the 2022 ACM joint european software engineering conference and symposium on the foundations of software engineering, pp 1465–1476
He P, Zhu J, He S, Li J, Lyu MR (2016) An evaluation study on log parsing and its use in log mining. In: Proceedings of the 2016 annual IEEE/IFIP international conference on dependable systems and networks (DSN), pp 654–661
He P, Zhu J, He S, Li J, Lyu MR (2017) Towards automated log parsing for large-scale log data analysis. IEEE Trans Dependable Secure Comput 931–944
He P, Zhu J, Zheng Z, Lyu MR (2017) Drain: An online log parsing approach with fixed depth tree. In: Proceedings of the 2017 IEEE international conference on web services (ICWS), pp 33–40
Hickman M, Fulp D, Baseman E, Blanchard S, Greenberg H, Jones W, DeBardeleben N (2018) Enhancing hpc system log analysis by identifying message origin in source code. In: Proceedings of the 2018 IEEE international symposium on software reliability engineering workshops (ISSREW), pp 100–105
Huang S, Liu Y, Fung C, He R, Zhao Y, Yang H, Luan Z (2020) Paddy: An event log parsing approach using dynamic dictionary. In: Proceedings of the 2020 IEEE/IFIP network operations and management symposium, pp 1–8
Huo Y, Su Y, Lyu M (2022) Logvm: Variable semantics miner for log messages. In: Proceedings of the 2022 IEEE international symposium on software reliability engineering workshops (ISSREW), pp 124–125
Jayathilake D (2012) Towards structured log analysis. In: Proceedings of the 2012 international conference on computer science and software engineering, pp 259–264
Jayathilake PW, Weeraddana NR, Hettiarachchi HK (2017) Automatic detection of multi-line templates in software log files. In: Proceedings of the 2017 international conference on advances in ICT for emerging regions (ICTer), pp 1–8
Jia Z, Li S, Liu X, Liao X, Liu Y (2018) Smartlog: Place error log statement by deep understanding of log intention. In: Proceedings of the 2018 IEEE international conference on software analysis, evolution and reengineering (SANER), pp 61–71
Jia T, Li Y, Zhang C, Xia W, Jiang J, Liu Y (2018) Machine deserves better logging: a log enhancement approach for automatic fault diagnosis. In: Proceedings of the 2018 IEEE international symposium on software reliability engineering workshops (ISSREW), pp 106–111
Kabinna S, Bezemer CP, Shang W, Hassan AE (2016) Logging library migrations: A case study for the apache software foundation projects. In: Proceedings of the 2016 international conference on mining software repositories, pp 154–164
Kabinna S, Bezemer CP, Shang W, Syer MD, Hassan AE (2018) Examining the stability of logging statements. Empir Softw Eng 290–333
Kalamatianos T, Kontogiannis K (2014) Schema independent reduction of streaming log data. In: Proceedings of the 2014 international conference on advanced information systems engineering, pp 394–408
Keele S (2007) Guidelines for performing systematic literature reviews in software engineering
Kersten M, Murphy GC (2005) Mylar: a degree-of-interest model for ides. In: Proceedings of the 2005 international conference on aspect-oriented software development, pp 159–168
Khan ZA, Shin D, Bianculli D, Briand L (2022) Guidelines for assessing the accuracy of log message template identification techniques. In: Proceedings of the 2022 international conference on software engineering, pp 1095–1106
Kim T, Kim S, Park S, Park Y (2020) Automatic recommendation to appropriate log levels. Softw Pract Exp 189–209
Kim T, Kim S, Yoo CJ, Cho S, Park S (2018) An automatic approach to validating log levels in java. In: Proceedings of the 2018 Asia-pacific software engineering conference (APSEC), pp 623–627
King J, Pandita R, Williams L (2015) Enabling forensics by proposing heuristics to identify mandatory log events. In: Proceedings of the 2015 symposium and bootcamp on the science of security, pp 1–11
King J, Stallings J, Riaz M, Williams L (2017) To log, or not to log: Using heuristics to identify mandatory log events–a controlled experiment. Empir Softw Eng 2684–2717
Kiran D, Rao M (2022) Modelling auto-scalable big data enabled log analytic framework. In: Computer networks and inventive communication technologies: Proceedings of Fifth ICCNCT 2022, pp 857–870
Kobayashi S, Fukuda K, Esaki H (2014) Towards an nlp-based log template generation algorithm for system log analysis. In: Proceedings of the 2014 international conference on future internet technologies, pp 1–4
Kobayashi S, Yamashiro Y, Otomo K, Fukuda K (2022) Amulog: A general log analysis framework for comparison and combination of diverse template generation methods. Int J Netw Manag e2195
Korzeniowski Ł, Goczyła K (2022) Landscape of automated log analysis: A systematic literature review and mapping study. IEEE Access
Kratzke N (2022) Cloud-native observability: The many-faceted benefits of structured and unified logging-a multi-case study. Future Internet 274
Krippendorff K (2011) Computing krippendorff’s alpha-reliability
Kubacki M, Sosnowski J (2016) Multidimensional log analysis. In: Proceedings of the 2016 european dependable computing conference (EDCC), pp 193–196
Kubacki M, Sosnowski J (2017) Holistic processing and exploring event logs. In: Proceedings of the 2017 international workshop of software engineering for resilient systems, pp 184–200
Kurniawan K, Ekelhart A, Kiesling E, Winkler D, Quirchmayr G, Tjoa AM (2022) Vlograph: a virtual knowledge graph framework for distributed security log analysis. Mach Learn Know Extr
Lal S, Sardana N, Sureka A (2015) Two level empirical study of logging statements in open source java projects. Int J Open Source Softw Process (IJOSSP) 49–73
Lal S, Sardana N, Sureka A (2016) Logoptplus: Learning to optimize logging in catch and if programming constructs. In: Proceedings of the 2016 IEEE annual computer software and applications conference (COMPSAC), pp 215–220
Lal S, Sardana N, Sureka A (2017) Analysis and prediction of log statement in open source java projects. Buenos Aires, Argentina p 65
Lal S, Sardana N, Sureka A (2019) Three-level learning for improving cross-project logging prediction for if-blocks. J King Saud Univ Comput Inf Sci 481–496
Lal S, Sardana N, Sureka A (2020) Improving logging prediction on imbalanced datasets: A case study on open source java projects. In: Cognitive analytics: concepts, methodologies, tools, and applications, pp 740–772
Lal S, Sureka A (2016) Logopt: Static feature extraction from source code for automated catch block logging prediction. In: Proceedings of the 2016 india software engineering conference, pp 151–155
Landauer M, Wurzenberger M, Skopik F, Settanni G, Filzmoser P (2018) Dynamic log file analysis: An unsupervised cluster evolution approach for anomaly detection. Comput Secur 94–116
Lee KH, Zhang X, Xu D (2013) Loggc: Garbage collecting audit log. In: Proceedings of the 2013 ACM SIGSAC conference on computer & communications security, pp 1005–1016
Li Z, Chen TH, Shang W (2020) Where shall we log? studying and suggesting logging locations in code blocks. In: Proceedings of the 2020 IEEE/ACM international conference on automated software engineering, pp 361–372
Li H, Chen TH, Shang W, Hassan AE (2018) Studying software logging using topic models. Empir Softw Eng 2655–2694
Li Z, Chen TH, Yang J, Shang W (2016) Dlfinder: Characterizing and detecting duplicate logging code smells. In: Proceedings of the 2019 IEEE/ACM international conference on software engineering (ICSE), pp 877–887
Li Z, Chen TH, Yang J, Shang W (2021) Studying duplicate logging statements and their relationships with code clones. J Syst Softw 2476–2494
Li Y, Jiang Y, Gu J, Lu M, Yu M, Armstrong EM, Huang T, Moroni D, McGibbney LJ, Frank G, Yang C, et al (2019) A cloud-based framework for large-scale log mining through apache spark and elasticsearch. Appl Sci 1114
Li T, Jiang Y, Zeng C, Xia B, Liu Z, Zhou W, Zhu X, Wang W, Zhang L, Wu J, et al (2017) Flap: An end-to-end event log analysis platform for system management. In: Proceedings of the 2017 ACM SIGKDD international conference on knowledge discovery and data mining, pp 1547–1556
Li Z, Li H, Chen TH, Shang W (2021) Deeplv: Suggesting log levels using ordinal based neural networks. In: Proceedings of the 2021 IEEE/ACM international conference on software engineering (ICSE), pp 1461–1472
Li S, Niu X, Jia Z, Liao X, Wang J, Li T (2020) Guiding log revisions by learning from software evolution history. Empir Softw Eng 2302–2340
Li S, Niu X, Jia Z, Wang J, He H, Wang T (2018) Logtracker: Learning log revision behaviors proactively from software evolution history. In: Proceedings of the 2018 conference on program comprehension, pp 178–188
Lin X, Wang P, Wu B (2013) Log analysis in cloud computing environment with hadoop and spark. In: Proceedings of the 2013 IEEE international conference on broadband network & multimedia technology, pp 273–276
Lin Q, Zhang H, Lou JG, Zhang Y, Chen X (2016) Log clustering based problem identification for online service systems. In: Proceedings of the 2016 IEEE/ACM international conference on software engineering companion (ICSE-C), pp 102–111
Lin H, Zhou J, Yao B, Guo M, Li J (2015) Cowic: A column-wise independent compression for log stream analysis. In: Proceedings of the 2015 IEEE/ACM international symposium on cluster, cloud and grid computing, pp 21–30
Li H, Shang W, Adams B, Sayagh M, Hassan A (2020) A qualitative study of the benefits and costs of logging from developers’ perspectives. IEEE Trans Softw Eng
Li H, Shang W, Hassan AE (2017) Which log level should developers choose for a new logging statement? Empir Softw Eng 1684–1716
Li H, Shang W, Zou YE, Hassan A (2017) Towards just-in-time suggestions for log changes. Empir Softw Eng 1831–1865
Liu X, Jia T, Li Y, Yu H, Yue Y, Hou C (2020) Automatically generating descriptive texts in logging statements: How far are we? In: Proceedings of the 2020 programming languages and systems: asian symposium, pp 251–269
Liu Z, Xia X, Lo D, Xing Z, Hassan AE, Li S (2019) Which variables should i log? IEEE Trans Softw Eng 2012–2031
Liu Y, Zhang X, He S, Zhang H, Li L, Kang Y, Xu Y, Ma M, Lin Q, Dang Y, et al (2022) Uniparser: A unified log parser for heterogeneous log data. In: Proceedings of the 2022 ACM web conference, pp 1893–1901
Liu J, Zhu J, He S, He P, Zheng Z, Lyu MR (2019) Logzip: Extracting hidden structures via iterative clustering for log compression. In: Proceedings of the 2019 IEEE/ACM international conference on automated software engineering (ASE), pp 863–873
Li X, Wang Y, Feng H, Ke W (2018) A parallel host log analysis approach based on spark. In: Proceedings of the 2018 international conference on computational intelligence and security (CIS), pp 301–305
Li X, Wang T, Wang S (2021) Pattern-based deep learning method to extract information from the log dataset. J Circuits Syst Comput 2150296
Locke S, Li H, Chen TH, Shang W, Liu W (2021) Logassist: Assisting log analysis through log summarization. IEEE Trans Softw Eng
Lupton S, Washizaki H, Yoshioka N, Fukazawa Y (2021) Online log parsing: Preliminary literature review. In: Proceedings of the 2021 IEEE international symposium on software reliability engineering workshops (ISSREW), pp 304–305
Makanju A, Zincir-Heywood AN, Milios EE (2011) A lightweight algorithm for message type extraction in system application logs. J Syst Softw 1921–1936
Marjai P, Lehotay-Kéry P, Kiss A (2021) The use of template miners and encryption in log message compression. Computers 83
Marjai P, Lehotay-Kéry P, Kiss A (2022) A novel dictionary-based method to compress log files with different message frequency distributions. Appl Sci 2044
Marlaithong T, Barroso VC, Phunchongharn P (2021) A hyperparameter tuning approach for an online log parser. In: Proceedings of the 2021 international conference on electrical engineering/electronics, computer, telecommunications and information technology (ECTI-CON), pp 1036–1040
Mastropaolo A, Pascarella L, Bavota G (2022) Using deep learning to generate complete log statements. In: Proceedings of the 2022 international conference on software engineering, pp 2279–2290
Mavridis I, Karatza H (2017) Performance evaluation of cloud-based log file analysis with apache hadoop and apache spark. J Syst Softw 133–151
Mendes E, Petrillo F (2021) Log severity levels matter: A multivocal mapping. In: Proceedings of the 2021 IEEE international conference on software quality, reliability and security (QRS), pp 1002–1013
Meng W, Liu Y, Huang Y, Zhang S, Zaiter F, Chen B, Pei D (2020) A semantic-aware representation framework for online log analysis. In: Proceedings of the 2020 international conference on computer communications and networks (ICCCN), pp 1–7
Meng W, Liu Y, Zaiter F, Zhang S, Chen Y, Zhang Y, Zhu Y, Wang E, Zhang R, Tao S, et al (2020) Logparse: Making log parsing adaptive through word classification. In: Proceedings of the 2020 international conference on computer communications and networks (ICCCN), pp 1–9
Meng W, Zaiter F, Huang Y, Liu Y, Zhang S, Zhang Y, Zhu Y, Zhang T, Wang E, Ren Z, et al (2020) Summarizing unstructured logs in online services. arXiv:2012.08938
Meng W, Zaiter F, Zhang Y, Liu Y, Zhang S, Tao S, Zhu Y, Han T, Zhao Y, Wang E, et al (2023) Logsummary: Unstructured log summarization for software systems. IEEE Trans Netw Serv Manag
Messaoudi S, Panichella A, Bianculli D, Briand L, Sasnauskas R (2018) A search-based approach for accurate identification of log message formats. In: Proceedings of the 2018 IEEE/ACM international conference on program comprehension (ICPC), pp 167–16710
Miranskyy A, Hamou-Lhadj A, Cialini E, Larsson A (2016) Operational-log analysis for big data systems: Challenges and solutions. IEEE Softw 52–59
Mizouchi T, Shimari K, Ishio T, Inoue K (2019) Padla: a dynamic log level adapter using online phase detection. In: Proceedings of the 2019 IEEE/ACM international conference on program comprehension (ICPC), pp 135–138
Mizutani M (2013) Incremental mining of system log format. In: Proceedings of the 2013 IEEE international conference on services computing, pp 595–602
Nagappan M, Vouk MA (2017) Abstracting log lines to log event types for mining software system logs. In: Proceedings of the 2010 working conference on mining software repositories, pp 71–81
Narkhede S, Baraskar T (2013) Hmr log analyzer: Analyze web application logs over hadoop mapreduce. Int J UbiComp p 41
Nedelkoski S, Bogatinovski J, Acker A, Cardoso J, Kao O (2021) Self-supervised log parsing. In: Proceedings of the 2021 european conference on machine learning and knowledge discovery in databases, pp 122–138
Ning X, Jiang G, Chen H, Yoshihira K (2014) Hlaer: A system for heterogeneous log analysis. In: Proceedings of the 2014 SDM workshop on heterogeneous learning, 1
Obrębski D, Sosnowski J (2020) Log based analysis of software application operation. In: Proceedings of the 2020 international conference on dependability of computer systems, pp 371–382
Ouatiti YE, Sayagh M, Kerzazi N, Hassan AE (2022) An empirical study on log level prediction for multi-component systems. IEEE Trans Softw Eng 1–1
Patel K, Faccin J, Hamou-Lhadj A, Nunes I (2022) The sense of logging in the Linux kernel. Empir Softw Eng 153
Pecchia A, Cinque M, Carrozza G, Cotroneo D (2015) Industry practices and event logging: Assessment of a critical software development process. In: Proceedings of the 2015 IEEE/ACM international conference on software engineering, pp 169–178
Pi A, Chen W, Zeller W, Zhou X (2019) It can understand the logs, literally. In: Proceedings of the 2019 IEEE international parallel and distributed processing symposium workshops (IPDPSW), pp 446–451
Plaisted D, Xie M (2022) Dip: A log parser based on disagreement index token conditions. In: Proceedings of the 2022 ACM southeast conference, pp 113–122
Platini M, Ropars T, Pelletier B, De Palma N (2021) Logflow: Simplified log analysis for large scale systems. In: Proceedings of the 2021 international conference on distributed computing and networking, pp 116–125
Portillo-Dominguez AO, Ayala-Rivera V (2019) Towards an efficient log data protection in software systems through data minimization and anonymization. In: Proceedings of the 2019 international conference in software engineering research and innovation (CONISOFT), pp 107–115
Pourmajidi W, Zhang L, Steinbacher J, Erwin T, Miranskyy A (2021) Immutable log storage as a service on private and public blockchains. IEEE Trans Serv Comput
Prayurahong P, Phunchongharn P, Barroso VC (2022) A topic modeling for alice’s log messages using latent dirichlet allocation. In: Proceedings of the 2022 IEEE international conference on knowledge innovation and invention (ICKII), pp 75–82
Raffety J, Stone B, Svacina J, Woodahl C, Cerny T, Tisnovsky P (2021) Multi-source log clustering in distributed systems. In: Proceedings of the 2021 information science and applications ICISA, pp 31–41
Rand J, Miranskyy A (2021) On automatic parsing of log records. In: Proceedings of the 2021 IEEE/ACM international conference on software engineering: new ideas and emerging results (ICSE-NIER), pp 41–45
Raynal M, Buob MO, Quénot G (2022) A novel pattern-based edit distance for automatic log parsing. In: Proceedings of the 2022 international conference on pattern recognition (ICPR), pp 1236–1242
Rivera-Ortiz F (2022) Engineering forensic-ready software systems using automated logging. In: Proceedings of the 2022 REFSQ Workshops
Rivera-Ortiz F, Pasquale L (2020) Automated modelling of security incidents to represent logging requirements in software systems. In: Proceedings of the 2020 international conference on availability, reliability and security, pp 1–8
Rodrigues K, Luo Y, Yuan D (2021) Clp: Efficient and scalable search on compressed text logs. In: Proceedings of the 2021 OSDI, pp 183–198
Rong G, Gu S, Zhang H, Shao D, Liu W (2018) How is logging practice implemented in open source software projects? a preliminary exploration. In: Proceedings of the 2018 australasian software engineering conference (ASWEC), pp 171–180
Rong G, Xu Y, Gu S, Zhang H, Shao D (2020) Can you capture information as you intend to? a case study on logging practice in industry. In: Proceedings of the 2020 IEEE international conference on software maintenance and evolution (ICSME), pp 171–180
Rong G, Zhang Q, Liu X, Gu S (2017) A systematic review of logging practice in software engineering. In: Proceedings of the 2017 Asia-Pacific software engineering conference (APSEC), pp 534–539
Rosenberg CM, Moonen L (2018) Improving problem identification via automated log clustering using dimensionality reduction. In: Proceedings of the 2018 ACM/IEEE international symposium on empirical software engineering and measurement, pp 1–10
Rücker N, Maier A (2022) Flexparser-the adaptive log file parser for continuous results in a changing world. J Softw Evol Process e2426
Sadeghi MA, Parambath S, Lucas J, Meguebli Y, Toure M, Al Qahtani F, Yu T, Chawla S (2021) Log representation as an interface for log processing applications. J Inf Secur Appl 103021
Schipper D, Aniche M, van Deursen A (2019) Tracing back log data to its log statement: From research to practice. In: Proceedings of the 2019 IEEE/ACM international conference on mining software repositories (MSR), pp 545–549
Sedki I, Hamou-Lhadj A, Ait-Mohamed O, Shehab MA (2022) An effective approach for parsing large log files. In: Proceedings of the 2022 IEEE international conference on software maintenance and evolution (ICSME), pp 1–12
Serasinghe S, Shen H, Chen D (2017) ilse: An intelligent web-based system for log structuring and extraction. In: Proceedings of the 2017 asia-pacific software engineering conference (APSEC), pp 588–593
Setayeshfar O, Adkins C, Jones M, Lee KH, Doshi P (2021) Graalf: Supporting graphical analysis of audit logs for forensics. Softw Impacts 100068
Setianto F, Tsani E, Sadiq F, Domalis G, Tsakalidis D, Kostakos P (2021) Gpt-2c: A parser for honeypot logs using large pre-trained language models. In: Proceedings of the 2021 IEEE/ACM international conference on advances in social networks analysis and mining, pp 649–653
Shang W, Nagappan M, Hassan AE, Jiang ZM (2014) Understanding log lines using development knowledge. In: Proceedings of the 2014 IEEE international conference on software maintenance and evolution, pp 21–30
Shehu Y, Harper R (2022) Enhancements to language modeling techniques for adaptable log message classification. IEEE Trans Netw Serv Manag
Skopik F, Wurzenberger M, Landauer M (2021) Smart Log Data Analytics. Springer
Spillner J (2020) Comparison and model of compression techniques for smart cloud log file handling. In: Proceedings of the 2020 international conference on communications, computing, cybersecurity, and informatics (CCCI), pp 1–6
Sun J, Liu B, Hong Y (2020) Logbug: Generating adversarial system logs in real time. In: Proceedings of the 2020 ACM international conference on information & knowledge management, pp 2229–2232
Svacina J, Raffety J, Woodahl C, Stone B, Cerny T, Bures M, Shin D, Frajtak K, Tisnovsky P (2020) On vulnerability and security log analysis: A systematic literature review on recent trends. In: Proceedings of the 2020 international conference on research in adaptive and convergent systems, pp 175–180
Tak B, Han WS (2021) Lognroll: Discovering accurate log templates by iterative filtering. In: Proceedings of the 2021 international middleware conference, pp 273–285
Tang Y, Spektor A, Khatchadourian R, Bagherzadeh M (2022) A tool for rejuvenating feature logging levels via git histories and degree of interest. In: Proceedings of the 2022 ACM/IEEE international conference on software engineering: companion proceedings, pp 21–25
Tang Y, Spektor A, Khatchadourian R, Bagherzadeh M (2022) Automated evolution of feature logging statement levels using git histories and degree of interest. Sci Comput Program 102724
Tao S, Meng W, Cheng Y, Zhu Y, Liu Y, Du C, Han T, Zhao Y, Wang X, Yang H (2022) Logstamp: Automatic online log parsing based on sequence labelling. ACM SIGMETRICS Perform Eval Rev 93–98
Tian R, Diao Z, Jiang H, Xie G (2022) Logdac: A universal efficient parser-based log compression approach. In: ICC 2022-IEEE international conference on communications, pp 3679–3684
Tovarňák D (2019) An algorithm for message type discovery in unstructured log data. In: Proceedings of the 2019 ICSOFT, pp 665–676
Tovarňák D, Vašeková A, Novák S, Pitner T (2013) Structured and interoperable logging for the cloud computing era: The pitfalls and benefits. In: Proceedings of the 2013 IEEE/ACM international conference on utility and cloud computing, pp 91–98
Tschudin PS, Lawall J, Muller G (2015) 3l: Learning linux logging. In: Proceedings of the 2015 Belgian-netherlands software evolution seminar (BENEVOL 2015)
Vaarandi R, Pihelgas M (2015) Logcluster - a data clustering and pattern mining algorithm for event logs. In: Proceedings of the 2015 international conference on network and service management (CNSM), pp 1–7
Varanda A, Santos L, Costa RL, Oliveira A, Rabadão C (2021) Log pseudonymization: Privacy maintenance in practice. J Inf Secur Appl 103021
Varanda A, Santos L, Costa RL, Oliveira A, Rabadão C (2021) The general data protection regulation and log pseudonymization. In: Proceedings of the 2021 international conference on advanced information networking and applications (AINA-2021), pp 479–490
Vervaet A, Chiky R, Callau-Zori M (2021) Ustep: Unfixed search tree for efficient log parsing. In: Proceedings of the 2021 IEEE international conference on data mining (ICDM), pp 659–668
Wagner T, Schkufza E, Wieder U (2016) A sampling-based approach to accelerating queries in log management systems. In: Proceedings of the 2016 ACM SIGPLAN international conference on systems, programming, languages and applications: software for humanity, pp 37–38
Wang H, Yang D, Duan N, Guo Y, Zhang L (2018) Medusa: Blockchain powered log storage system. In: Proceedings of the 2018 IEEE international conference on software engineering and service science (ICSESS), pp 518–521
Wang Y, Zheng Q (2021) A logging overhead optimization method based on anomaly detection model. In: Proceedings of the 2021 human centered computing international conference, pp 349–359
Weibin M, Ying L, Yichen Z, Shenglin Z, Dan P, Yuqing L, Yihao C, Ruizhi Z, Shimin T, Pei S, et al (2019) Loganomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs. In: Proceedings of the 2019 international joint conference on artificial intelligence, pp 4739–4745
Wei J, Zhang G, Chen J, Wang Y, Zheng W, Sun T, Wu J, Jiang J (2023) Loggrep: Fast and cheap cloud log storage by exploiting both static and runtime patterns. IEEE Trans Softw Eng
Wei J, Zhang G, Wang Y, Liu Z, Zhu Z, Chen J, Sun T, Zhou Q (2021) On the feasibility of parser-based log compression in large-scale cloud systems. In: FAST, pp 249–262
Wen P, Zhang Z, Deng B (2020) Olmpt: research on online log parsing method based on prefix tree. In: Proceedings of the 2020 international conference on information technologies and electrical engineering, pp 55–59
Xiao T, Quan Z, Wang ZJ, Zhao K, Liao X (2020) Lpv: A log parser based on vectorization for offline and online log parsing. In: Proceedings of the 2020 IEEE international conference on data mining (ICDM), pp 1346–1351
Xie X, Wang Z, Xiao X, Lu Y, Huang S, Li T (2021) A confidence-guided evaluation for log parsers inner quality. Mobile Netw Appl 1638–1649
Xie Y, Yang K, Luo P (2021) Logm: Log analysis for multiple components of hadoop platform. IEEE Access 73522–73532
Xu Z, Kirk R, Yu L, Michael S, Ding Y, Yuanyuan Z (2017) The game of twenty questions: Do you know where to log? In: Proceedings of the 2017 workshop on hot topics in operating systems, pp 125–131
Xu N, Shanshan L, Zhouyang J, Shulin Z, Wang L, Xiangke L (2018) Understanding the similarity of log revision behaviors in open source software. J Circuits Syst Comput 1887
Xu W, Huang L, Fox A, Patterson D, Jordan MI (2009) Detecting large-scale system problems by mining console logs. In: Proceedings of the 2009 symposium on operating systems principles, pp 117–132
Yang N, Cuijpers P, Hendriks D, Schiffelers R, Lukkien J, Serebrenik A (2023) An interview study about the use of logs in embedded software engineering. Empir Softw Eng 43
Yang S, Park SJ, Ousterhout J (2018) Nanolog: A nanosecond scale logging system. In: Proceedings of the 2018 USENIX annual technical conference (USENIX ATC 18), pp 335–350
Yang R, Qu D, Qian Y, Dai Y, Zhu S (2019) An online log template extraction method based on hierarchical clustering. EURASIP J Wirel Commun Netw 1–12
Yang J, Zhang Y, Zhang S, He D (2013) Mass flow logs analysis system based on hadoop. In: Proceedings of the 2013 IEEE international conference on broadband network & multimedia technology, pp 115–118
Yao K, Li H, Shang W, Hassan AE (2020) A study of the performance of general compressors on log files. Empir Softw Eng 3043–3085
Yao K, Sayagh M, Shang W, Hassan AE (2021) Improving state-of-the-art compression techniques for log management tools. IEEE Trans Softw Eng
Yen S, Moh M (2021) Intelligent log analysis using machine and deep learning. In: Research anthology on artificial intelligence applications in security, pp 1154–1182
Yuan D, Park S, Huang P, Liu Y, Lee MM, Tang X, Zhou Y, Savage S (2012) Be conservative: Enhancing failure diagnosis with proactive logging. In: Proceedings of the 2012 USENIX symposium on operating systems design and implementation (OSDI 12), pp 293–306
Zawoad S, Dutta AK, Hasan R (2013) Seclaas: Secure logging-as-a-service for cloud forensics. In: Proceedings of the 2013 ACM SIGSAC symposium on Information, computer and communications security, pp 219–230
Zeng Y, Chen J, Shang W, Chen TH (2019) Studying the characteristics of logging practices in mobile apps: A case study on f-droid. Empir Softw Eng 3394–3434
Zhang J, Li Z, Zhang X, Lin F, Wang C, Cai X (2022) Posbert: Log classification via modified bert based on part-of-speech weight. In: Proceedings of the 2022 international conference on pattern recognition and artificial intelligence (PRAI), pp 979–983
Zhang H, Tang Y, Lamothe M, Li H, Shang W (2022) Studying logging practice in test code. Empir Softw Eng 83
Zhang S, Wu G (2021) Efficient online log parsing with log punctuations signature. Appl Sci 11974
Zhang L, Xie X, Xie K, Wang Z, Lu Y, Zhang Y (2019) An efficient log parsing algorithm based on heuristic rules. In: Proceedings of the 2019 advanced parallel processing technologies: international symposium, pp 123–134
Zhao X, Rodrigues K, Luo Y, Stumm M, Yuan D, Zhou Y (2017) Log20: Fully automated optimal placement of log printing statements under specified overhead threshold. In: Proceedings of the 2017 symposium on operating systems principles, pp 565–581
Zhao Z, Wang C, Rao W (2018) Slop: Towards an efficient and universal streaming log parser. In: Proceedings of the 2018 international conference on information and communications security, pp 325–341
Zhao Y, Wang X, Xiao H, Chi X (2018) Improvement of the log pattern extracting algorithm using text similarity. In: Proceedings of the 2018 IEEE international parallel and distributed processing symposium workshops (IPDPSW), pp 507–514
Zhi C, Deng S, Han J, Yin J (2022) Towards automatic detection and prioritization of pre-logging overhead: A case study of hadoop ecosystem. Autom Softw Eng 11
Zhi C, Yin J, Deng S, Ye M, Fu M, Xie T (2019) An exploratory study of logging configuration practice in java. In: Proceedings of the 2019 IEEE international conference on software maintenance and evolution (ICSME), pp 459–469
Zhi C, Yin J, Han J, Deng S (2020) A preliminary study on sensitive information exposure through logging. In: Proceedings of the 2020 Asia-Pacific software engineering conference (APSEC), pp 470–474
Zhong Y, Guo Y, Liu C (2018) Flp: a feature-based method for log parsing. Electron Lett 1334–1336
Zhou R, Hamdaqa M, Cai H, Hamou-Lhadj A (2020) Mobilogleak: A preliminary study on data leakage caused by poor logging practices. In: Proceedings of the 2020 IEEE international conference on software analysis, evolution and reengineering (SANER), pp 577–581
Zhu YQ, Deng JY, Pu JC, Wang P, Liang S, Wang W (2022) Ml-parser: An efficient and accurate online log parser. J Comput Sci Technol 1412–1426
Zhu J, He P, Fu Q, Zhang H, Lyu MR, Zhang D (2015) Learning to log: Helping developers make informed logging decisions. In: Proceedings of the 2015 IEEE/ACM international conference on software engineering, pp 415–425
Zhu J, He S, Liu J, He P, Xie Q, Zheng Z, Lyu MR (2019) Tools and benchmarks for automated log parsing. In: Proceedings of the 2019 IEEE/ACM international conference on software engineering: software engineering in practice (ICSE-SEIP), pp 121–130
Zhu J, Rong G, Huang G, Gu S, Zhang H, Shao D (2019) Jllar: A logging recommendation plug-in tool for java. In: Proceedings of the 2019 asia-pacific symposium on internetware, pp 1–6
Zou F, Chen X, Luo Y, Huang T, Liao Z, Song K (2022) Spray: Streaming log parser for real-time analysis. Secur Commun Netw
Zuo Y, Zhu X, Qin J, Yao W (2021) Temporal relations extraction and analysis of log events for micro-service framework. In: Proceedings of the 2021 chinese control conference (CCC), pp 3391–3396
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by: Xin Peng.
About this article
Cite this article
Batoun, M.A., Sayagh, M., Aghili, R. et al. A literature review and existing challenges on software logging practices. Empir Software Eng 29, 103 (2024). https://doi.org/10.1007/s10664-024-10452-w
DOI: https://doi.org/10.1007/s10664-024-10452-w