Abstract
Modern cloud-native OLAP databases adopt a storage-disaggregation architecture that manages computation and storage separately. A major bottleneck in such an architecture is the network connecting the computation and storage layers. Computation pushdown is a promising way to address this bottleneck: it offloads some computation tasks to the storage layer to reduce network traffic. This paper presents FlexPushdownDB (FPDB), in which we revisit the design of computation pushdown in a storage-disaggregation architecture and introduce several optimizations that further accelerate query processing. First, FPDB supports hybrid query execution, which combines local computation on cached data with computation pushdown to cloud storage at a fine granularity. Within the cache, FPDB uses a novel Weighted-LFU cache replacement policy that takes the cost of pushdown computation into account. Second, we design adaptive pushdown, a new mechanism that avoids throttling the storage-layer computation during pushdown by pushing a request back to the computation layer at runtime when storage-layer computational resources are insufficient. Finally, we derive a general principle for identifying pushdown-amenable computational tasks by summarizing common patterns of pushdown capabilities in existing systems, and we propose two new pushdown operators: selection bitmap and distributed data shuffle. Evaluation on SSB and TPC-H shows that these three optimizations can improve performance by 2.2×, 1.9×, and 3×, respectively.
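To make the idea of a cost-aware LFU policy concrete, the following is a minimal sketch and not FPDB's actual implementation. It assumes that each access adds a weight to a per-segment score, that the weight stands in for the estimated cost of serving the segment by pushdown instead of from the cache, and that the lowest-scoring segment is evicted when the cache is full. The class name `WeightedLFUCache`, the `pushdown_cost` parameter, and the scoring rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WeightedLFUCache:
    """Toy weighted-LFU cache (illustrative assumption, not FPDB's policy)."""

    capacity: int
    # Maps a cached segment id to its accumulated weighted frequency.
    scores: dict = field(default_factory=dict)

    def access(self, segment: str, pushdown_cost: float) -> bool:
        """Record one access to `segment`; return True on a cache hit.

        `pushdown_cost` is a hypothetical estimate of how expensive it
        would be to answer this access through pushdown instead.
        """
        hit = segment in self.scores
        if not hit and len(self.scores) >= self.capacity:
            # Evict the cached segment with the smallest weighted frequency.
            victim = min(self.scores, key=self.scores.get)
            del self.scores[victim]
        # Plain LFU would add 1 here; the weighted variant adds the cost.
        self.scores[segment] = self.scores.get(segment, 0.0) + pushdown_cost
        return hit
```

With a capacity of two, accessing segment `a` twice at cost 5.0 and segment `b` once at cost 1.0 leaves `b` as the eviction victim when a third segment arrives, because `b` has the lowest accumulated weight. Plain LFU is the special case in which every access has the same weight.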
About this article
Cite this article
Yang, Y., Yu, X., Serafini, M. et al. FlexpushdownDB: rethinking computation pushdown for cloud OLAP DBMSs. The VLDB Journal 33, 1643–1670 (2024). https://doi.org/10.1007/s00778-024-00867-8
DOI: https://doi.org/10.1007/s00778-024-00867-8