research-article

Compression aware physical database design

Published: 01 July 2011

Abstract

Modern RDBMSs support the ability to compress data using methods such as null suppression and dictionary encoding. Data compression offers the promise of significantly reducing storage requirements and improving I/O performance for decision support queries. However, compression can also slow down update and query performance due to the CPU costs of compression and decompression. In this paper, we study how data compression affects the choice of an appropriate physical database design, such as indexes, for a given workload. We observe that approaches that decouple the decision of whether or not to choose an index from whether or not to compress the index can result in poor solutions. Thus, we focus on the novel problem of integrating compression into physical database design in a scalable manner. We have implemented our techniques by modifying Microsoft SQL Server and the Database Engine Tuning Advisor (DTA) physical design tool. Our techniques are general and are potentially applicable to DBMSs that support other compression methods. Our experimental results on real-world as well as TPC-H benchmark workloads demonstrate the effectiveness of our techniques.
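
As an illustration of the two compression methods named in the abstract, here is a minimal Python sketch. It is not taken from the paper and does not reflect how SQL Server stores data; the function names, the one-byte length prefix, and the 8-byte integer width are all assumptions chosen for clarity. Dictionary encoding replaces repeated values with small integer codes, and null suppression drops the leading zero bytes of a fixed-width value.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each value with an integer code into a list of distinct values."""
    dictionary: List[str] = []      # distinct values, in first-seen order
    code_of: Dict[str, int] = {}    # value -> position in `dictionary`
    codes: List[int] = []
    for v in values:
        if v not in code_of:
            code_of[v] = len(dictionary)
            dictionary.append(v)
        codes.append(code_of[v])
    return dictionary, codes


def dictionary_decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """Recover the original column from the dictionary and the codes."""
    return [dictionary[c] for c in codes]


def null_suppress(value: int, width: int = 8) -> bytes:
    """Store a non-negative integer without its leading zero bytes.

    Hypothetical layout: one length byte, then the significant bytes.
    """
    raw = value.to_bytes(width, "big").lstrip(b"\x00")
    return bytes([len(raw)]) + raw


def null_restore(encoded: bytes) -> int:
    """Inverse of null_suppress."""
    length = encoded[0]
    return int.from_bytes(encoded[1:1 + length], "big")


if __name__ == "__main__":
    column = ["DE", "US", "DE", "FR", "US"]
    dictionary, codes = dictionary_encode(column)
    print(dictionary, codes)
    print(len(null_suppress(300)), "bytes instead of 8")
```

Both encoders save space only when the data cooperates (few distinct values, or many small numbers), and every read pays a decoding step. That is the storage and I/O benefit versus CPU cost trade-off the abstract describes.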

Published In

Proceedings of the VLDB Endowment, Volume 4, Issue 10
July 2011
95 pages

Publisher

VLDB Endowment

Cited By

  • (2024) Self-tuning Database Systems: A Systematic Literature Review of Automatic Database Schema Design and Tuning. ACM Computing Surveys 56:11 (1-37). DOI: 10.1145/3665323. Online publication date: 17-May-2024
  • (2023) AWARE: Workload-aware, Redundancy-exploiting Linear Algebra. Proceedings of the ACM on Management of Data 1:1 (1-28). DOI: 10.1145/3588682. Online publication date: 30-May-2023
  • (2022) Budget-Conscious Fine-Grained Configuration Optimization for Spatio-Temporal Applications. Proceedings of the VLDB Endowment 15:13 (4079-4092). DOI: 10.14778/3565838.3565858. Online publication date: 1-Sep-2022
  • (2022) Robust and budget-constrained encoding configurations for in-memory database systems. Proceedings of the VLDB Endowment 15:4 (780-793). DOI: 10.14778/3503585.3503588. Online publication date: 14-Apr-2022
  • (2021) Good to the Last Bit: Data-Driven Encoding with CodecDB. Proceedings of the 2021 International Conference on Management of Data (843-856). DOI: 10.1145/3448016.3457283. Online publication date: 9-Jun-2021
  • (2020) FPGA-Accelerated compression of integer vectors. Proceedings of the 16th International Workshop on Data Management on New Hardware (1-10). DOI: 10.1145/3399666.3399932. Online publication date: 15-Jun-2020
  • (2018) Compressed linear algebra for large-scale machine learning. The VLDB Journal — The International Journal on Very Large Data Bases 27:5 (719-744). DOI: 10.1007/s00778-017-0478-1. Online publication date: 1-Oct-2018
  • (2016) Compressed linear algebra for large-scale machine learning. Proceedings of the VLDB Endowment 9:12 (960-971). DOI: 10.14778/2994509.2994515. Online publication date: 1-Aug-2016
  • (2015) Efficient Compression and Storage of XML OLAP Cubes. International Journal of Data Warehousing and Mining 11:3 (1-25). DOI: 10.5555/2795630.2795631. Online publication date: 1-Jul-2015
  • (2015) Resource Elasticity for Large-Scale Machine Learning. Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (137-152). DOI: 10.1145/2723372.2749432. Online publication date: 27-May-2015
