DOI: 10.1145/3318464.3386140
research-article

Database Workload Capacity Planning using Time Series Analysis and Machine Learning

Published: 31 May 2020

Abstract

When procuring or administering any I.T. system, or a component of one, it is crucial to understand the computational resources required to run the critical business functions that are governed by Service Level Agreements. Predicting the resources needed for future consumption is like looking into the proverbial crystal ball. In this paper we look at the forecasting techniques in use today and evaluate whether those techniques are applicable to the deeper layers of the technology stack, such as clustered database instances, applications, and the groups of transactions that make up the database workload. Our approach uses supervised machine learning to identify traits that the workloads exhibit, such as recurring patterns, shocks, and trends, and accounts for those traits in the forecast. An experimental evaluation shows that the approach reduces the complexity of performing a forecast and produces accurate predictions for complex workloads.
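To make the idea of forecasting a workload with a trend and a repeating pattern concrete, the following is a minimal sketch of additive Holt-Winters exponential smoothing, one classical time series technique. It is not the method proposed in the paper, which the abstract does not specify in detail. The function name, the smoothing parameters, and the sample data are all illustrative assumptions.

```python
from typing import List, Sequence


def holt_winters_additive(
    series: Sequence[float],
    period: int,
    alpha: float = 0.5,   # level smoothing factor (assumed value)
    beta: float = 0.1,    # trend smoothing factor (assumed value)
    gamma: float = 0.3,   # seasonal smoothing factor (assumed value)
    horizon: int = 1,
) -> List[float]:
    """Forecast `horizon` steps ahead with additive Holt-Winters smoothing."""
    if period < 1 or len(series) < 2 * period:
        raise ValueError("need at least two full seasonal periods of data")

    # Simple initialisation from the first two periods:
    # level = mean of period one, trend = average per-step change between
    # the two periods, seasonal = deviation of each sample from the level.
    first = series[:period]
    second = series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / (period * period)
    seasonal = [y - level for y in first]

    # One pass over the history (including the samples used for
    # initialisation, a simplification) updating the three components.
    for t, y in enumerate(series):
        s = seasonal[t % period]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y - level) + (1 - gamma) * s

    # Project the level and trend forward and re-apply the matching
    # seasonal component for each future step.
    n = len(series)
    return [
        level + h * trend + seasonal[(n + h - 1) % period]
        for h in range(1, horizon + 1)
    ]


if __name__ == "__main__":
    # Hypothetical utilisation samples with a repeating 4-step cycle.
    history = [10.0, 20.0, 30.0, 20.0] * 6
    print(holt_winters_additive(history, period=4, horizon=4))
```

A sketch like this handles a single seasonal period and a smooth trend. Workloads with several overlapping cycles or sudden shocks, the traits the abstract highlights, would need richer models than this one.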

Supplementary Material

MP4 File (3318464.3386140.mp4)
Presentation Video

Published In

SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
June 2020
2925 pages
ISBN: 9781450367356
DOI: 10.1145/3318464

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. DBAAS
  2. database capacity planning
  3. forecasting
  4. machine learning
  5. supervised learning
  6. time series analysis

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '20

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 103
  • Downloads (last 6 weeks): 13
Reflects downloads up to 10 Dec 2024

Cited By

  • (2024) Flux: Decoupled Auto-Scaling for Heterogeneous Query Workload in Alibaba AnalyticDB. Companion of the 2024 International Conference on Management of Data, 255-268. DOI: 10.1145/3626246.3653381. Online publication date: 9-Jun-2024
  • (2024) One Teacher is Enough: A Server-Clueless Federated Learning With Knowledge Distillation. IEEE Transactions on Services Computing 17:5, 2704-2718. DOI: 10.1109/TSC.2024.3414372. Online publication date: Sep-2024
  • (2024) TPGraph: A Spatial-Temporal Graph Learning Framework for Accurate Traffic Prediction on Arterial Roads. IEEE Transactions on Intelligent Transportation Systems 25:5, 3911-3926. DOI: 10.1109/TITS.2023.3334558. Online publication date: May-2024
  • (2024) Robust Auto-Scaling with Probabilistic Workload Forecasting for Cloud Databases. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 4016-4029. DOI: 10.1109/ICDE60146.2024.00308. Online publication date: 13-May-2024
  • (2024) Log Replaying for Real-Time HTAP: An Adaptive Epoch-Based Two-Stage Framework. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 2096-2108. DOI: 10.1109/ICDE60146.2024.00167. Online publication date: 13-May-2024
  • (2024) Towards Exploratory Query Optimization for Template-Based SQL Workloads. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 151-164. DOI: 10.1109/ICDE60146.2024.00019. Online publication date: 13-May-2024
  • (2024) Hammer: A General Blockchain Evaluation Framework. 2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS), 391-402. DOI: 10.1109/ICDCS60910.2024.00044. Online publication date: 23-Jul-2024
  • (2023) MLGNet: A Multi-Period Local and Global Temporal Dynamic Pattern Integration Network for Long-Term Forecasting. 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 4028-4033. DOI: 10.1109/SMC53992.2023.10394530. Online publication date: 1-Oct-2023
  • (2023) Deep Learning-based Workload Prediction in Cloud Computing to Enhance the Performance. 2023 Third International Conference on Secure Cyber Computing and Communication (ICSCCC), 635-640. DOI: 10.1109/ICSCCC58608.2023.10176790. Online publication date: 26-May-2023
  • (2023) DBAugur: An Adversarial-based Trend Forecasting System for Diversified Workloads. 2023 IEEE 39th International Conference on Data Engineering (ICDE), 27-39. DOI: 10.1109/ICDE55515.2023.00385. Online publication date: Apr-2023
