Abstract
All parallel query processing frameworks need to determine the optimality norms for column materialization. We investigate the performance trade-offs of alternative column materialization strategies. We propose a common parallel query processing approach that encapsulates varying column materialization strategies within exchange nodes in query execution plans. Our experimental observations confirm the theoretically deduced trade-offs, which suggest that the optimality norms depend on the scale of the cluster, the data transmissions required for a query, and the predicate selectivities involved. Lastly, we apply a probit statistical model to the experimental data to establish a system-dependent, ad hoc performance estimation method that can be used to select the optimal materialization strategy at runtime.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ku, C., Liu, Y., Mortazavi, M., Cao, F., Chen, M., Shi, G. (2014). Optimization Strategies for Column Materialization in Parallel Execution of Queries. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (eds) Database and Expert Systems Applications. DEXA 2014. Lecture Notes in Computer Science, vol 8645. Springer, Cham. https://doi.org/10.1007/978-3-319-10085-2_17
DOI: https://doi.org/10.1007/978-3-319-10085-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10084-5
Online ISBN: 978-3-319-10085-2
eBook Packages: Computer Science, Computer Science (R0)