Abstract
Integrating sophisticated statistical methods into database management systems is attracting growing attention in research and industry as a way to cope with increasing data volumes and increasingly complex analytical algorithms. One important statistical method is time series forecasting, which is crucial for decision-making processes in many domains. Deeply integrating time series forecasting into a DBMS provides additional advanced functionality. More importantly, however, it enables optimizations that improve the efficiency, consistency, and transparency of the overall forecasting process. To enable efficient integrated forecasting, we propose to extend the traditional three-layer ANSI/SPARC architecture of a DBMS with forecasting functionality. This article gives a general overview of our proposed enhancements and shows how forecast queries can be processed, using an example from the energy data management domain. We conclude with open research topics and challenges that arise in this area.
Additional information
Matthias Boehm is currently visiting IBM Almaden Research Center, San Jose, CA, USA.
About this article
Cite this article
Fischer, U., Dannecker, L., Siksnys, L. et al. Towards Integrated Data Analytics: Time Series Forecasting in DBMS. Datenbank Spektrum 13, 45–53 (2013). https://doi.org/10.1007/s13222-012-0108-4
DOI: https://doi.org/10.1007/s13222-012-0108-4