DOI: 10.1145/3626246.3653381
research-article

Flux: Decoupled Auto-Scaling for Heterogeneous Query Workload in Alibaba AnalyticDB

Published: 09 June 2024 Publication History

Abstract

Modern cloud data warehouses are integral to processing heterogeneous query workloads, which range from quick online transactions to intensive ad-hoc queries and extract, transform, load (ETL) processes. The synchronization of heterogeneous workloads, particularly the blend of short and long-running queries, often degrades performance due to intricate concurrency controls and cooperative multi-tasking execution models. Additionally, the auto-scaling mechanisms for mixed workloads can lead to spikes in demand and underutilized resources, impacting both performance and cost-efficiency. This paper introduces Flux, a cloud-native workload auto-scaling platform designed for Alibaba AnalyticDB, which implements a pioneering decoupled auto-scaling architecture. By separating the scaling mechanisms for short and long-running queries, Flux not only resolves performance bottlenecks but also harnesses the elasticity of serverless container instances for on-demand resource provisioning. Our extensive evaluations demonstrate Flux's superiority over traditional scaling methods, with up to a 75% reduction in query response time (RT), a 19.0% increase in resource utilization ratio, and a 77.8% decrease in cost overhead.
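
The decoupling idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the runtime threshold, the `Query` and `DecoupledRouter` names, and the assumption that a runtime estimate is available for each query are all hypothetical, chosen only to show short and long-running queries being sent to separately scaled resource pools.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff between "short" and "long-running" queries (an assumption,
# not a value taken from the paper).
SHORT_QUERY_LIMIT_S = 5.0


@dataclass
class Query:
    query_id: str
    estimated_runtime_s: float  # assumed to come from some cost estimator


@dataclass
class DecoupledRouter:
    # Short queries share an always-on pool that is scaled on its own.
    resident_queue: list = field(default_factory=list)
    # Long-running queries are handed to on-demand (serverless-style) instances.
    on_demand_jobs: list = field(default_factory=list)

    def route(self, query: Query) -> str:
        """Place a query in the pool that matches its estimated runtime."""
        if query.estimated_runtime_s <= SHORT_QUERY_LIMIT_S:
            self.resident_queue.append(query)
            return "resident"
        self.on_demand_jobs.append(query)
        return "on_demand"


router = DecoupledRouter()
workload = [Query("q1", 0.2), Query("q2", 1800.0), Query("q3", 4.9)]
placements = [router.route(q) for q in workload]
print(placements)
```

Because the two pools are tracked separately, each can in principle grow or shrink according to its own demand signal, which is the property the abstract attributes to a decoupled architecture.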


      Published In

      SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data
      June 2024
      694 pages
      ISBN:9798400704222
      DOI:10.1145/3626246
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. auto-scaling
      2. cloud data warehouse
      3. heterogeneous workloads

      Qualifiers

      • Research-article

      Conference

      SIGMOD/PODS '24

      Acceptance Rates

      Overall Acceptance Rate 785 of 4,003 submissions, 20%
