DOI: 10.1145/1353343.1353420
Research article

BI batch manager: a system for managing batch workloads on enterprise data-warehouses

Published: 25 March 2008

Abstract

Modern enterprise data warehouses have complex workloads that are notoriously difficult to manage. An important problem in workload management is to run these complex workloads 'optimally'. Traditionally this problem has been studied in the OLTP (Online Transaction Processing) context where MPL (Multi Programming Level) is used as a knob to achieve optimality. However, MPL is a tricky knob in a BI (Business Intelligence) scenario, since a low MPL can easily result in underload and a high MPL can easily result in overload and 'thrashing'.
In this work we present BI Batch Manager, a workload management system that runs batches of queries 'optimally' on an Enterprise Data Warehouse (EDW). It comprises three components: an admission control component, a scheduler, and an execution control component. To avoid underload and overload automatically, we introduce a novel execution control mechanism, PGM (Priority Gradient Multiprogramming). In PGM, a priority gradient is created for the workload, with each query running at a distinct priority level. We demonstrate that this stabilizes the execution of a workload across a wide operating range. We use memory as the controlling factor for our admission control policy, admitting batches of queries such that their memory requirement equals the available memory on the system. Our scheduling policy, which gives the query with the largest memory requirement the highest priority, further stabilizes the execution.
We validate our BI Batch Manager using varying workloads on a commercial, enterprise class DBMS. We show that it effectively avoids underload and overload (thrashing) and can automatically run BI workloads with 'optimal' performance.
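
The abstract names two policies: admit a batch whose memory requirement matches the available memory, and run each admitted query at a distinct priority with the largest-memory query highest. The sketch below is only an illustration of that idea, not the paper's algorithm. The `Query` type, the `memory_mb` estimates, the greedy largest-first admission rule, and the integer priority scale are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    name: str
    memory_mb: int  # hypothetical per-query memory estimate


def admit_batch(pending, available_mb):
    """Greedily admit queries, largest first, while their estimates still fit.

    Assumption: a simple largest-first fit stands in for the admission policy.
    """
    admitted, remaining = [], available_mb
    for q in sorted(pending, key=lambda q: q.memory_mb, reverse=True):
        if q.memory_mb <= remaining:
            admitted.append(q)
            remaining -= q.memory_mb
    return admitted


def assign_priority_gradient(batch):
    """Give every query a distinct priority; the largest-memory query ranks highest."""
    ordered = sorted(batch, key=lambda q: q.memory_mb, reverse=True)
    top = len(ordered)
    # Higher number means higher priority in this sketch.
    return {q.name: top - rank for rank, q in enumerate(ordered)}


pending = [Query("q1", 400), Query("q2", 700), Query("q3", 300), Query("q4", 250)]
batch = admit_batch(pending, available_mb=1000)
priorities = assign_priority_gradient(batch)
print([q.name for q in batch], priorities)
```

With 1000 MB available, this example admits `q2` and `q3`, whose estimates sum to exactly 1000 MB, and assigns `q2` the higher priority. A real system would also need to map these ranks onto whatever priority levels the DBMS or operating system exposes.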

      Published In

      EDBT '08: Proceedings of the 11th international conference on Extending database technology: Advances in database technology
      March 2008
      762 pages
      ISBN:9781595939265
      DOI:10.1145/1353343
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Conference

      EDBT '08

      Acceptance Rates

      Overall Acceptance Rate 7 of 10 submissions, 70%

      Article Metrics

      • Downloads (last 12 months): 25
      • Downloads (last 6 weeks): 1
      Reflects downloads up to 20 Dec 2024

      Cited By

      • (2024) DB Workload Management Through Characterization and Idleness Detection. 2024 26th International Conference on Advanced Communications Technology (ICACT). DOI: 10.23919/ICACT60172.2024.10471766 (226-231). Online publication date: 4-Feb-2024
      • (2020) Autonomic performance prediction framework for data warehouse queries using lazy learning approach. Applied Soft Computing. DOI: 10.1016/j.asoc.2020.106216 (106216). Online publication date: Mar-2020
      • (2016) Multi-core column-store parallelization under concurrent workload. Proceedings of the 12th International Workshop on Data Management on New Hardware. DOI: 10.1145/2933349.2933350 (1-10). Online publication date: 26-Jun-2016
      • (2015) Performance Prediction for Concurrent Workloads in Distributed Database Systems. Algorithms and Architectures for Parallel Processing. DOI: 10.1007/978-3-319-27140-8_43 (626-639). Online publication date: 16-Dec-2015
      • (2013) Performance and resource modeling in highly-concurrent OLTP workloads. Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. DOI: 10.1145/2463676.2467800 (301-312). Online publication date: 22-Jun-2013
      • (2012) Sort-aware query scheduling in database management systems. Proceedings of the 2012 Conference of the Center for Advanced Studies on Collaborative Research. DOI: 10.5555/2399776.2399778 (2-10). Online publication date: 5-Nov-2012
      • (2011) Performance prediction for concurrent database workloads. Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. DOI: 10.1145/1989323.1989359 (337-348). Online publication date: 12-Jun-2011
      • (2011) Predicting completion times of batch query workloads using interaction-aware models and simulation. Proceedings of the 14th International Conference on Extending Database Technology. DOI: 10.1145/1951365.1951419 (449-460). Online publication date: 21-Mar-2011
      • (2011) Interaction-aware scheduling of report-generation workloads. The VLDB Journal - The International Journal on Very Large Data Bases 20:4. DOI: 10.1007/s00778-011-0217-y (589-615). Online publication date: 1-Aug-2011
      • (2008) Modeling and exploiting query interactions in database systems. Proceedings of the 17th ACM conference on Information and knowledge management. DOI: 10.1145/1458082.1458109 (183-192). Online publication date: 26-Oct-2008
