research-article

PLQ: An Efficient Approach to Processing Pattern-Based Log Queries

Published: 01 October 2022

Abstract

As software systems grow more complex, many techniques have been proposed for analyzing log data to gain insight into system status. However, log analysis still requires tedious manual effort to search a huge volume of log data for interesting or informative log patterns; such searches are called pattern-based queries. Although existing log management tools and database management systems can support pattern-based queries, they do so inefficiently. To address this problem, we propose a novel approach named PLQ (Pattern-based Log Query). First, PLQ organizes logs into disjoint chunks and builds chunk-wise bitmap indexes for log types and attribute values. Then, using the bitmap indexes, PLQ finds candidate logs with a set of efficient bit-wise operations. Finally, PLQ fetches the candidate logs and validates them against the queried pattern. Extensive experiments on real-life datasets show that, compared with existing log management systems, PLQ queries log patterns more efficiently and achieves a higher pruning rate when filtering irrelevant logs. Moreover, because the ratio of index size to data size does not exceed 2.5% for log datasets of different sizes, PLQ is highly scalable.
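
The three steps described in the abstract (chunk-wise bitmap indexing, bit-wise candidate search, validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes, purely for illustration, that a log is a dict with a "type" and an "attrs" mapping, that a pattern is an ordered list of log types restricted to one attribute value, and that a bitmap is a plain Python integer. The names build_chunk_indexes, find_candidates, validate and CHUNK_SIZE are hypothetical.

```python
from collections import defaultdict

CHUNK_SIZE = 4  # hypothetical chunk size, kept small for the demo


def build_chunk_indexes(logs, chunk_size=CHUNK_SIZE):
    """Split logs into disjoint chunks and build bitmap indexes per chunk.

    Each bitmap is a Python int: bit i is set when the i-th log of the
    chunk has the given log type or the given (attribute, value) pair.
    """
    chunks = []
    for start in range(0, len(logs), chunk_size):
        type_bits = defaultdict(int)
        attr_bits = defaultdict(int)
        for offset, log in enumerate(logs[start:start + chunk_size]):
            type_bits[log["type"]] |= 1 << offset
            for key, value in log["attrs"].items():
                attr_bits[(key, value)] |= 1 << offset
        chunks.append((start, dict(type_bits), dict(attr_bits)))
    return chunks


def find_candidates(chunks, pattern_types, attr_filter):
    """Return global positions of logs that may take part in a match."""
    candidates = []
    for start, type_bits, attr_bits in chunks:
        wanted = 0
        for log_type in set(pattern_types):
            wanted |= type_bits.get(log_type, 0)   # any queried log type
        wanted &= attr_bits.get(attr_filter, 0)    # and the attribute value
        offset = 0
        while wanted:                              # decode the set bits
            if wanted & 1:
                candidates.append(start + offset)
            wanted >>= 1
            offset += 1
    return candidates


def validate(logs, candidates, pattern_types):
    """Fetch candidate logs and check the types occur in the queried order."""
    matched = []
    next_idx = 0
    for pos in candidates:
        if next_idx < len(pattern_types) and logs[pos]["type"] == pattern_types[next_idx]:
            matched.append(pos)
            next_idx += 1
    return matched if next_idx == len(pattern_types) else []


# Made-up demo data, not taken from the paper's datasets.
logs = [
    {"type": "open", "attrs": {"user": "a"}},
    {"type": "open", "attrs": {"user": "b"}},
    {"type": "read", "attrs": {"user": "b"}},
    {"type": "read", "attrs": {"user": "a"}},
    {"type": "error", "attrs": {"user": "b"}},
    {"type": "close", "attrs": {"user": "a"}},
    {"type": "close", "attrs": {"user": "b"}},
]
chunks = build_chunk_indexes(logs)
pattern = ["open", "read", "close"]
candidates = find_candidates(chunks, pattern, ("user", "a"))
print(validate(logs, candidates, pattern))  # prints [0, 3, 5]
```

In this sketch the bitmaps prune every log with the wrong type or attribute value before any log record is read, so the validation step only touches the three surviving candidates. A production index would more likely use compressed bitmaps than raw integers; that choice is outside what the abstract states.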


Information & Contributors

Information

Published In

Journal of Computer Science and Technology, Volume 37, Issue 5
Oct 2022
252 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 October 2022
Accepted: 30 November 2020
Received: 21 May 2020

Author Tags

  1. pattern query
  2. log analysis
  3. bitmap index
  4. log pattern

Qualifiers

  • Research-article
