
CN117076106A - Elastic telescoping method and system for cloud server resource management - Google Patents


Info

Publication number
CN117076106A
Authority
CN
China
Prior art keywords
load
strategy
scaling
scaling strategy
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310920219.2A
Other languages
Chinese (zh)
Inventor
温林峰
徐敏贤
叶可江
须成忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202310920219.2A
Publication of CN117076106A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to the field of cloud computing resource scheduling, and in particular to an elastic scaling method and system for cloud server resource management. In the method, a load pressure line is generated by means of stochastic gradient descent and offset, the load state of the load data is judged based on the load pressure line, and an applicable elastic scaling strategy is selected according to the judgment result.

Description

An elastic scaling method and system for cloud server resource management

Technical field

The present invention relates to the field of cloud computing resource scheduling, and specifically to an elastic scaling method and system for cloud server resource management.

Background

With the development and popularization of cloud-native technology, the elastic scaling strategy of a cloud server has become one of the important indicators of its load capacity. A good elastic scaling strategy should balance resource conservation and service performance, that is, guarantee the user experience of cloud server users while using the fewest resources, and the system should also remain highly robust under highly variable loads. At present, mainstream cloud server elastic scaling strategies fall into two broad categories: passive elastic scaling strategies and active elastic scaling strategies.

Passive elastic scaling strategy: for cloud server online applications without an obvious periodic pattern, it is usually difficult to predict the load of the next period a priori. A passive elastic scaling strategy mainly collects the load data of microservices over the most recent period and sets appropriate target thresholds; once certain indicators exceed the set thresholds, scale-out or scale-in operations are triggered immediately to keep resource utilization stable. Currently, most cloud data centers use passive elastic scaling strategies, for example using Kubernetes (a container orchestration engine open-sourced by Google) to automatically scale microservices.

Active elastic scaling strategy: for cloud server online applications with an obvious periodic pattern, historical load data is collected to build mathematical models and profiles of the load, the load at the next moment is predicted, resource demand is analyzed, and the corresponding resources are allocated or reclaimed in a timely manner, which effectively mitigates the latency problem of scaling. Existing active elastic scaling strategies are divided into prediction models based on machine learning, deep learning, and reinforcement learning, such as LightGBM, DNN, and Q-learning.

Adopting a passive elastic scaling strategy generally leads to low resource utilization and serious redundancy. For example, Kubernetes, favored by cloud service vendors at home and abroad, has an overly simple horizontal scaling design: it can only compute the required number of replicas by comparing the load value perceived in real time against a predefined resource watermark threshold. This approach lacks a risk-control mechanism and is not suitable for industrial production. A passive elastic scaling strategy can only respond in real time and cannot predict future service resource demand, so the workload of a cloud server usually changes earlier than its scaling adjustment; even vertical scaling with a fast response time inevitably incurs some delay, and this delay leads to degraded quality of service, a higher SLA violation rate, and other problems.
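For reference, the threshold comparison described here is exactly how the Kubernetes horizontal pod autoscaler computes its replica count: scale by the ratio of the observed metric to its target, rounding up. A minimal sketch of that calculation:

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Kubernetes HPA-style replica calculation: scale the current replica
    count by the ratio of the observed metric to its target, rounding up."""
    return math.ceil(current_replicas * current_metric / target_metric)

# 4 replicas averaging 90% CPU against a 60% target -> scale out to 6
print(desired_replicas(4, 90.0, 60.0))
```

Because the formula only reacts to the metric already observed, any scale-out it triggers necessarily lags the load change, which is the delay problem discussed above.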

Using an active elastic scaling strategy for automatic scaling inevitably introduces prediction errors. If the prediction is too low, it causes insufficient scale-out, which in turn degrades service quality and raises the users' SLA violation rate; moreover, a prediction-based elastic scaling strategy does not respond to burst traffic as quickly as a passive one. In the early stage of policy execution, the lack of a large amount of historical load data for mathematical analysis also makes efficient resource management difficult.

Therefore, the existing technology still has shortcomings and needs further improvement.

Summary of the invention

Embodiments of the present invention provide an elastic scaling method and system for cloud server resource management, to at least solve the technical problem that a passive elastic scaling strategy leads to low resource utilization while an active elastic scaling strategy suffers from prediction errors.

According to an embodiment of the present invention, an elastic scaling method for cloud server resource management is provided, including the following steps:

collecting historical workload data;

generating a load pressure line by means of stochastic gradient descent and offset, and judging the load state of the load data based on the load pressure line, the load state including a stable state and an unstable state;

selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy;

based on the allocated resources, executing the specific horizontal or vertical scaling strategy through the cloud server to implement resource allocation management.

Further, before generating the load pressure line by stochastic gradient descent and offset and judging the load state of the load data based on the load pressure line, the load state including a stable state and an unstable state, the method also includes:

performing data preprocessing on the load data, the data preprocessing including deleting abnormal data and calculating the average value of each parameter with the same timestamp;

performing a supervised learning conversion on the load data, the supervised learning conversion using a time window to convert the load data into labeled supervised learning sequences.
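The time-window conversion described above can be sketched as follows; the window size is illustrative, not taken from the patent:

```python
def to_supervised(series, window=5):
    """Slide a fixed-size window over a load series to build (features, label)
    pairs: each window of past observations becomes the features for
    predicting the next value, which becomes the label."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return X, y

X, y = to_supervised([10, 12, 11, 13, 15, 14, 16], window=3)
print(X[0], y[0])  # first sample: the window [10, 12, 11] labeled with 13
```

Each labeled pair can then be fed to any supervised regressor for next-step load prediction.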

Further, selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy, includes:

performing resource allocation using an item-by-item differential vertical scaling strategy;

the resource allocation formula of the item-by-item differential vertical scaling strategy is specifically:

where C_N is the resource allocation amount, the five coefficients α, β, γ, δ, ε are the difference coefficients of the five resource usage samples C_L1, C_L2, C_L3, C_L4, C_L5 collected in the previous time window, and ρ represents the margin of resource configuration in vertical scaling;

the above five difference coefficients satisfy the relation:

α+4λ = β+3λ = γ+2λ = δ+λ = ε

where λ is the difference value set by the cloud server administrator as required; setting λ to 0 yields the Kubernetes resource allocation formula, and λ is not negative.
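The allocation formula itself is rendered as an image in the source and does not survive extraction. A plausible reading, consistent with the coefficient relation above (weights differ by λ, so more recent samples are weighted more heavily, and all weights are equal when λ = 0), is a weighted sum of the five samples scaled by the margin ρ. The sketch below assumes that form and additionally normalizes the weights to sum to 1; both are assumptions, not the patent's literal formula.

```python
def differential_allocation(samples, lam=0.1, rho=1.2):
    """Hypothetical item-by-item differential allocation: five usage samples
    C_L1..C_L5 (oldest first) are combined with arithmetically increasing
    weights (differing by lam, so all equal when lam == 0), normalized to
    sum to 1, then scaled by the margin rho."""
    assert len(samples) == 5 and lam >= 0
    raw = [1.0 + k * lam for k in range(5)]   # alpha..epsilon up to a common scale
    total = sum(raw)
    weights = [w / total for w in raw]        # normalize
    return rho * sum(w * c for w, c in zip(weights, samples))

# lam = 0 degenerates to rho times the plain average of the window
print(round(differential_allocation([100, 100, 100, 100, 100], lam=0.0, rho=1.2), 2))
```

With λ > 0 a load spike in the most recent sample raises the allocation more than the same spike earlier in the window, which matches the stated intent of the difference coefficients.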

Further, selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy, includes:

adopting a vertical scaling strategy based on load data prediction;

the vertical scaling strategy based on load data prediction is specifically:

collecting historical workload data, and performing mathematical modeling and profiling of the load data;

predicting the load data at the next moment based on the mathematical modeling and profiling;

analyzing resource demand based on the load data at the next moment, and allocating or reclaiming the corresponding resources in a timely manner.

Further, predicting the load data at the next moment based on mathematical modeling and profiling is specifically:

using LightGBM, a distributed gradient boosting framework based on decision tree algorithms, to predict the load data at the next moment.
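The patent names LightGBM for this step. As a dependency-free illustration of the underlying idea only (gradient boosting over regression trees, here reduced to depth-1 stumps fitted to residuals), a minimal sketch, not the patent's implementation:

```python
def fit_stump(X, residuals):
    """Fit a depth-1 regression stump on a single feature: choose the
    threshold minimizing the squared error of the two leaf means."""
    best = None
    for t in sorted(set(X)):
        left = [r for x, r in zip(X, residuals) if x <= t]
        right = [r for x, r in zip(X, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]  # (threshold, left_mean, right_mean)

def boost(X, y, rounds=20, lr=0.5):
    """Gradient boosting for squared loss: start from the mean prediction,
    then repeatedly fit stumps to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        t, lm, rm = fit_stump(X, resid)
        stumps.append((t, lm, rm))
        pred = [p + lr * (lm if x <= t else rm) for x, p in zip(X, pred)]
    def predict(x):
        return base + sum(lr * (lm if x <= t else rm) for t, lm, rm in stumps)
    return predict

# toy series: learn the next-step load from the current load level
predict = boost([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
```

In practice the windowed sequences from the preprocessing step would serve as multi-dimensional features, and LightGBM would replace this toy booster.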

Further, selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy, includes:

using a horizontal scaling strategy in coordination with vertical scaling for fine-grained load data management;

using a horizontal scaling strategy in coordination with vertical scaling for fine-grained load data management is specifically:

adopting a pre-configured horizontal scaling strategy combined with vertical scaling for fine-grained resource management; after pre-configuration in the horizontal direction, resources are allocated in the vertical direction at a preset decay rate, balancing the flexibility of resource adjustment against cost savings;

performing resource allocation based on an exponential backoff method, as follows:

where C_N is the resource allocation amount, C_A is the total resource allocation after horizontal scale-out, φ is the base of the exponential backoff algorithm, n is the number of rounds to back off, and t denotes different moments.
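The backoff formula itself appears as an image in the source and is not reproduced here. A plausible reading of the variable definitions, namely that the vertical allocation decays from the post-scale-out total C_A by a factor of φ per backoff round, is sketched below as an assumption rather than the patent's literal formula:

```python
def backoff_allocation(c_a, phi=2.0, rounds=4):
    """Hypothetical exponential backoff: starting from the post-scale-out
    total C_A, shrink the vertical allocation by a factor of phi on each
    successive round, returning the allocation per round 0..rounds."""
    assert phi > 1 and rounds >= 0
    return [c_a / phi ** n for n in range(rounds + 1)]

# C_A = 800 with base phi = 2 halves the allocation every round
print(backoff_allocation(800, phi=2.0, rounds=4))
```

The geometric decay gives most of the reclaimed capacity back early while keeping a shrinking safety buffer, which matches the stated trade-off between adjustment flexibility and cost savings.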

Further, predicting the load data at the next moment based on mathematical modeling and profiling includes:

performing load data prediction through the gradient boosting framework method and, according to the prediction result, selecting a vertical scaling strategy or a horizontal scaling strategy for resource allocation.

An elastic scaling system for cloud server resource management, including:

a data collection module for collecting historical workload data;

a load state judgment module for generating a load pressure line by stochastic gradient descent and offset and judging the load state of the load data based on the load pressure line, the load state including a stable state and an unstable state;

a strategy selection module for selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy;

a resource scheduler for executing the specific horizontal or vertical scaling strategy on the cloud server based on the allocated resources, to implement resource allocation management.

A computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps in any of the above elastic scaling methods for cloud server resource management.

A terminal device, including a processor, a memory, and a communication bus; the memory stores a computer-readable program executable by the processor;

the communication bus implements connection and communication between the processor and the memory;

when the processor executes the computer-readable program, the steps in any of the above elastic scaling methods for cloud server resource management are implemented.

In the elastic scaling method and system for cloud server resource management in the embodiments of the present invention, a load pressure line is generated by stochastic gradient descent and offset, the load state of the load data is judged based on the load pressure line, and an applicable elastic scaling method is selected according to the judgment result. Compared with traditional microservice elastic scaling strategies, this method solves the problems of low prediction accuracy and weak applicability of active elastic scaling strategies, and improves resource utilization while guaranteeing no loss of QoS (Quality of Service).

Description of the drawings

The drawings described here are used to provide a further understanding of the present invention and constitute a part of this application; the illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not constitute an improper limitation of it. In the drawings:

Figure 1 is a flow chart of the elastic scaling method for cloud server resource management of the present invention;

Figure 2 is a flow chart of a specific embodiment of the elastic scaling method for cloud server resource management of the present invention;

Figure 3 shows the code of a key part of this application;

Figure 4 shows result one of testing a randomly intercepted segment of data in a Kubernetes cluster according to the present invention;

Figure 5 shows result two of testing a randomly intercepted segment of data in a Kubernetes cluster according to the present invention;

Figure 6 shows result three of testing a randomly intercepted segment of data in a Kubernetes cluster according to the present invention;

Figure 7 shows result four of testing a randomly intercepted segment of data in a Kubernetes cluster according to the present invention;

Figure 8 shows the response time of each method;

Figure 9 shows the resource utilization of each method;

Figure 10 shows quantitative performance scoring using a dynamic time adjustment algorithm according to the present invention;

Figure 11 is a schematic diagram of the elastic scaling system for cloud server resource management of the invention;

Figure 12 is a diagram of a specific embodiment of the elastic scaling system for cloud server resource management of the invention;

Figure 13 is a diagram of the terminal device of the invention.

Detailed description

In order to enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the scope of protection of the present invention.

It should be noted that the terms "first", "second", etc. in the description, claims, and drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described here can be practiced in sequences other than those illustrated or described. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units need not be limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to the process, method, product, or device.

Example 1

According to an embodiment of the present invention, an elastic scaling method for cloud server resource management is provided. Referring to Figure 1, it includes the following steps:

S100: collecting historical workload data;

S200: generating a load pressure line by stochastic gradient descent and offset, and judging the load state of the load data based on the load pressure line, the load state including a stable state and an unstable state;

S300: selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy;

S400: based on the allocated resources, executing the specific horizontal or vertical scaling strategy through the cloud server to implement resource allocation management.
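One pass of the S100-S400 loop can be sketched as follows; the function names, the strategy labels, and the rule "unstable load takes the conservative reactive path, stable load takes the predictive path" are illustrative assumptions drawn from the strategy discussion later in the description, not literal patent text:

```python
def autoscale_step(history, pressure_line, predictor):
    """One illustrative pass of the S100-S400 loop: compare the latest load
    sample against the pressure line (S200); if the load is above the line
    (unstable state), fall back to a conservative reactive allocation,
    otherwise use the predictive allocation (S300)."""
    current = history[-1]
    if current > pressure_line(len(history) - 1):      # S200: unstable state
        return ("vertical-differential", current)       # S300: conservative path
    return ("vertical-predictive", predictor(history))  # S300: predictive path

# stable load well under a flat pressure line at 60 -> predictive path
plan = autoscale_step([40, 42, 41], lambda t: 60.0, lambda h: sum(h) / len(h))
print(plan)  # ('vertical-predictive', 41.0)
```

S400 would then hand the chosen strategy and amount to the resource scheduler for execution.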

In the elastic scaling method for cloud server resource management in the embodiments of the present invention, a load pressure line is generated by stochastic gradient descent and offset, the load state of the load data is judged based on the load pressure line, and an applicable elastic scaling method is selected according to the judgment result. Compared with traditional microservice elastic scaling strategies, this method solves the problems of low prediction accuracy and weak applicability of active elastic scaling strategies, and improves resource utilization while guaranteeing no loss of QoS (Quality of Service).

Specifically, based on machine learning algorithms, this application proposes an efficient and intelligent elastic scaling strategy that adjusts microservice resources in real time. It mainly adopts a state-detection method to analyze and judge different microservices and executes different elastic scaling strategies according to the analysis results. The purpose of this application is to use this intelligent microservice elastic scaling strategy to solve the problem of resource management in cloud servers.

Specifically, the load state of the load data is judged according to the load pressure line. The pressure line in this embodiment can adapt to the load through parameter adjustment; one representation of the pressure line function provided in this embodiment is as follows:

The pressure line is a linear function, and its general form in this embodiment is:

f(t) = kt + b + αc_v

where k is the slope of the pressure line, t is the time variable, b is the constant term of the pressure line, and c_v is the coefficient of variation, the ratio of the standard deviation σ of the data to its corresponding mean μ, which characterizes the degree of dispersion within the sample interval and also serves as the margin reserved by the pressure line: the greater the dispersion, the larger the allocated buffer space, and conversely, the smaller the buffer space. α is the adjustment parameter of the pressure line margin.
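With the quantities defined above, the pressure line can be computed directly; a small sketch, with illustrative parameter values:

```python
import statistics

def pressure_line(k, b, alpha, samples):
    """Build the pressure line f(t) = k*t + b + alpha*c_v, where c_v is the
    coefficient of variation (population stdev / mean) of the observed
    load samples: more dispersed load reserves a wider buffer."""
    c_v = statistics.pstdev(samples) / statistics.fmean(samples)
    return lambda t: k * t + b + alpha * c_v

f = pressure_line(k=0.5, b=50.0, alpha=30.0, samples=[40, 50, 60])
```

A load sample is then judged unstable whenever it exceeds f(t) at its timestamp.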

The slope and constant term of the pressure line can be determined by polynomial fitting. First, a set of fitting data is given:

(t_i, Load_i), where i = 0, 1, 2, ..., m-1

A linear fitting function b + kt is constructed, which turns the problem into minimizing the mean square error ∈: if some set of fitting coefficients minimizes ∈, that set is considered optimal. The mean square error is calculated as follows:

Then the two partial derivatives of ∈ below are taken and each set to 0; solving the resulting system of simultaneous equations yields k and b. The system of simultaneous equations is:
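The error formula and the system of equations are rendered as images in the source; for an ordinary linear least-squares fit of b + kt they take the standard form (reconstructed here from the surrounding definitions, not copied from the patent):

```latex
\epsilon = \sum_{i=0}^{m-1}\left(k t_i + b - \mathrm{Load}_i\right)^2,
\qquad
\frac{\partial \epsilon}{\partial k} = 2\sum_{i=0}^{m-1} t_i\left(k t_i + b - \mathrm{Load}_i\right) = 0,
\qquad
\frac{\partial \epsilon}{\partial b} = 2\sum_{i=0}^{m-1}\left(k t_i + b - \mathrm{Load}_i\right) = 0 .
```

The two normal equations are linear in k and b, so solving them simultaneously gives the fitted slope and constant term.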

In addition, the value of the pressure line adjustment parameter needs to be determined. The problem is defined as follows:

Assume no load state judge is used: there will then be Π moments at which the load is underestimated, causing performance degradation, and the goal is to find these Π moments of load underestimation. If a load state judge is used, there are Ρ moments at which the load is unstable, i.e., potential load underestimation points. Ideally, the Π moments correspond one-to-one with the Ρ moments. Assuming that M of the Ρ moments correspond one-to-one with M points in Π, the judgment accuracy of the load state judge is M/Ρ and the load underestimation point discovery rate is M/Π. The judgment accuracy represents the precision of the load state judge: if it is low, the system judges many unnecessary load points as potential underestimation points, wasting resources. The discovery rate represents the probability of finding true load underestimation points: if it is too low, the system misses most load underestimation points, degrading performance. The goal of the load state judge is to keep both the load underestimation point discovery rate M/Π and the judgment accuracy M/Ρ high at the same time.
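The two rates defined above form a precision/recall analogue over sets of time indices and can be computed directly (the example point sets are illustrative):

```python
def judge_metrics(true_points, flagged_points):
    """Accuracy (M/P) and discovery rate (M/Pi) of the load-state judge:
    M is the overlap between the truly underestimated moments (Pi of them)
    and the moments flagged as unstable (P of them)."""
    m = len(set(true_points) & set(flagged_points))
    accuracy = m / len(flagged_points)     # M / P  (precision analogue)
    discovery_rate = m / len(true_points)  # M / Pi (recall analogue)
    return accuracy, discovery_rate

print(judge_metrics({3, 7, 12, 20}, {3, 7, 9, 12, 15}))  # (0.6, 0.75)
```

Sweeping the margin parameter α trades one rate against the other, which is exactly the tension the next paragraph discusses.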

The choice of the parameter has a great influence on the results, so a suitable value must be determined. A large number of experiments show that the discovery rate and the accuracy follow almost opposite trends, and it is difficult to keep both high at the same time, so a trade-off must be made. The smaller the parameter, the higher the accuracy and the lower the discovery rate; this suits an aggressive resource allocation system that sacrifices some performance for higher resource utilization. The larger the parameter, the lower the accuracy and the higher the discovery rate; this suits a conservative resource allocation system that consumes more resources for higher performance. Both of these situations are extremes; preferably, the parameter α takes an intermediate value (20 < α < 40). This range balances the impact on performance against resource usage, and in particular achieves a balance between performance impact and resource usage when the discovery rate equals the judgment accuracy.

Step S100 specifically includes:

For the historical workload information, Metrics Server, the resource-usage collector built into Kubernetes, is used to collect the historical workload data. Metrics Server is a Kubernetes cluster monitoring and performance-analysis tool that collects metric data from the nodes.

Before step S200, the method further includes:

S201: preprocessing the load data, the preprocessing including deleting abnormal data and computing the average of each parameter that shares the same timestamp;

S202: performing a supervised-learning conversion on the load data, the conversion using a time window to turn the load data into labelled supervised-learning sequences.

Removing the abnormal data and then applying the supervised-learning conversion, which uses a time window to turn the data into labelled supervised-learning sequences, improves prediction accuracy.
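The time-window conversion of step S202 can be sketched as follows; the function name and the univariate-series interface are assumptions for illustration:

```python
def to_supervised(series, window):
    """Turn a univariate load series into labelled supervised-learning samples:
    each sample is `window` consecutive observations (the features),
    labelled with the value that immediately follows them (the target)."""
    samples = []
    for i in range(len(series) - window):
        samples.append((series[i:i + window], series[i + window]))
    return samples
```

Each resulting pair can then be fed to a regression model such as LightGBM, with the window contents as features and the next observation as the label.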

The vertical scaling strategy has the advantage of fast response: a sudden load surge can be handled promptly by vertical scaling, reducing the impact of degraded service performance. Vertical scaling also allows finer-grained resource management of cloud servers. The vertical scaling strategies include an item-by-item differencing vertical scaling strategy and a load-prediction-based vertical scaling strategy.

If the microservice load is relatively stable, a more aggressive resource-management method is chosen to save resources, such as the load-prediction machine-learning method LightGBM (Light Gradient Boosting Machine). Conversely, if the microservice load is relatively unstable, a more conservative resource-management method is adopted to ensure that SLA (Service-Level Agreement) rules are not violated, such as the reactive elastic scaling strategy based on item-by-item differencing.

All of the above applies to vertical scaling. If the resources in the vertical direction cannot support the current load (a specific threshold is exceeded), the horizontal scaling strategy is triggered to keep resource utilization within a certain range. To avoid thrashing, a cooling period is also set to prevent frequent horizontal scaling from harming service performance. Horizontal scaling changes capacity in large steps; typically the service settles into smooth operation after running for a while, or the peak passes, leaving surplus resources. Therefore, after a horizontal scale-out, fine-grained vertical rollback is performed: the proposed exponential backoff method gradually lowers the vertical resource quota and reclaims unused resources.
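A minimal sketch of this strategy choice, combining the stable/unstable decision with the horizontal trigger; the strategy names and the concrete utilization threshold are illustrative, not values fixed by the patent:

```python
def select_strategy(is_stable, utilization, horizontal_threshold=0.9):
    """Map the judged load state and current utilization to a scaling action.

    is_stable: result of the load-state judge (pressure-line based).
    utilization: current vertical resource utilization in [0, 1].
    horizontal_threshold: illustrative cutoff above which vertical
    resources are considered unable to carry the load.
    """
    if utilization > horizontal_threshold:
        return "horizontal-scale-out"   # vertical resources exhausted
    if is_stable:
        return "vertical-predictive"    # aggressive, LightGBM-based
    return "vertical-reactive"          # conservative, item-by-item differencing
```

In a full system the "horizontal-scale-out" branch would also respect the cooling period and be followed by the exponential vertical rollback described above.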

Step S300 specifically includes:

Allocating resources with the item-by-item differencing vertical scaling strategy.

The item-by-item differencing vertical scaling strategy targets cloud-server online applications without an obvious periodic pattern, whose next-period load is usually hard to predict a priori. This reactive elastic scaling strategy mainly collects the microservice's load data over the most recent period and sets suitable target thresholds; as soon as some metric exceeds its threshold, a scaling operation is triggered immediately to keep resource utilization in a stable state.

Existing cloud data centers use popular tools and frameworks, such as Kubernetes (developed by Google), to automatically scale microservices. Kubernetes monitors the historical resource usage CL in the past time window and then sets the resource allocation CN for the next time window to:

CN = CL(1 + ρ)

where ρ is a safety factor that the cloud-server administrator sets flexibly according to demand.

Clearly, the scaling policy of Kubernetes is too simplistic: it can only compare the load value sensed in real time against predefined resource watermark thresholds to compute the required resources, which generally leads to low resource utilization and heavy redundancy. Moreover, this approach lacks a risk-control mechanism and is unsuitable for industrial production.

In this application, the resource-allocation formula of the item-by-item differencing vertical scaling strategy is used, which mitigates to some extent the impact of both under-provisioning and over-provisioning. For example, with the sliding time window set to five, the resource-allocation formula is:

CN = (1 + ρ)(αCL1 + βCL2 + γCL3 + δCL4 + εCL5)

where CN is the resource allocation, and the five coefficients α, β, γ, δ, ε are the differencing coefficients of the five resource-usage samples CL1, CL2, CL3, CL4, CL5 collected in the previous time window. The five differencing coefficients satisfy the relation:

α + 4λ = β + 3λ = γ + 2λ = δ + λ = ε

where λ is a differencing number set by the cloud-server administrator according to demand; setting λ to 0 recovers the Kubernetes allocation formula. λ should not be set too large, otherwise the allocation becomes excessively correlated with CL5; nor should λ be negative, otherwise the allocation becomes strongly correlated with CL1.
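Assuming the five differencing coefficients are normalised to sum to one, so that λ = 0 reduces to the Kubernetes rule CN = CL(1 + ρ) applied to the window average (this normalisation is an assumption, since the original formula image is not reproduced in the text), the allocation can be sketched as:

```python
def differential_allocation(usage, lam, rho):
    """Item-by-item differencing vertical allocation over a sliding window.

    usage: resource-usage samples [C_L1 .. C_L5], oldest first.
    lam:   the differencing number lambda; coefficients satisfy
           beta = alpha + lam, ..., epsilon = alpha + 4*lam.
    rho:   the safety factor / margin of the vertical resource configuration.
    The coefficients are assumed to sum to 1, so lam = 0 degenerates to
    the Kubernetes rule applied to the mean of the window.
    """
    n = len(usage)
    alpha = (1 - lam * n * (n - 1) / 2) / n  # makes the coefficients sum to 1
    coeffs = [alpha + i * lam for i in range(n)]
    return (1 + rho) * sum(c * u for c, u in zip(coeffs, usage))
```

With λ > 0 the most recent sample receives the largest weight, which is what gives the reactive strategy its partial prediction effect.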

Step S300 specifically includes:

Adopting a vertical scaling strategy based on load-data prediction.

The vertical scaling strategy based on load-data prediction is specifically:

S301: collecting historical workload data and building a mathematical model and profile of the load data;

S302: predicting the load data at the next moment based on the mathematical model and profile;

S3021: predicting the load data with a gradient-boosting framework and, according to the prediction results, selecting the vertical or horizontal scaling strategy for resource allocation;

S303: analysing resource demand from the load data at the next moment and allocating or reclaiming the corresponding resources in time.

The load-prediction-based vertical scaling strategy targets cloud-server online applications whose load shows a clear periodic pattern. By collecting historical load data, modelling and profiling the load mathematically, predicting the load at the next moment and analysing the resource demand, it allocates or reclaims the corresponding resources in time and thus largely solves the latency problem of scaling. This application uses LightGBM (Light Gradient Boosting Machine), a distributed gradient-boosting framework based on decision-tree algorithms, for load-prediction analysis. LightGBM is an open-source framework for gradient boosting, one of the frameworks implementing the GBDT algorithm, and supports efficient parallel training.

GBDT (Gradient Boosting Decision Tree) is a machine-learning model whose main idea is to train weak classifiers (decision trees) iteratively to obtain an optimal model; it trains well and is not prone to overfitting. However, GBDT must traverse the entire training set multiple times in every iteration. Loading the whole training set into memory limits its size, while repeatedly reading and writing it from disk costs a great deal of time. Especially for industrial-scale massive data, the ordinary GBDT algorithm cannot meet the demand.

XGBoost (an optimized distributed gradient-boosting library), an existing GBDT tool, is a decision-tree algorithm based on pre-sorting. The drawbacks of building decision trees this way are obvious: first, it consumes much space; second, it also costs considerable time; finally, it is unfriendly to cache optimization. To avoid these drawbacks of XGBoost and to speed up GBDT training without sacrificing accuracy, LightGBM makes the following optimizations on top of the traditional GBDT algorithm: gradient-based one-side sampling, exclusive feature bundling, a depth-limited leaf-wise growth strategy, direct support for categorical features, efficient parallelism, and cache-hit-rate optimization. After these optimizations, LightGBM can train a model with very little time overhead while achieving high prediction accuracy, letting GBDT be used better and faster in cloud load-prediction practice.

After this application uses LightGBM for load prediction, resources can be configured in advance according to the predicted load, and the LightGBM model is maintained and updated periodically.

Step S300 specifically includes:

Using the horizontal scaling strategy in concert with vertical scaling for fine-grained load-data management.

Specifically, the horizontal scaling strategy is simple and effective and is therefore widely used in production, but the state of the art in horizontal scaling systems still has design shortcomings. First, horizontal scaling adds or removes service replicas to reach a target resource-utilization state over some future period; when the load bursts or fluctuates, this becomes inefficient and may lead to over- or under-provisioning. Second, horizontal scaling takes time to execute, so some cloud resource-management systems must try to scale out before a load burst, which is extremely difficult: on the one hand, scaling out early on a wrong judgment leads to low resource utilization and poor economics; on the other hand, failing to scale out in time degrades service performance or even makes the service unavailable.

To solve the above problems, this application introduces a cooling period into the autoscaling system: no autoscaling operation is executed for a period after the most recent operation, reducing the harm of system thrashing and jitter. Second, the horizontal scaling strategy in this application cooperates with the vertical scaling strategy for fine-grained resource management, mainly through pre-provisioning combined with vertical scaling: after pre-provisioning in the horizontal direction, the vertical direction scales down at a fixed or preset decay rate, balancing flexible resource adjustment against cost savings.

Resources are allocated with the exponential backoff method, which takes the form:

CN(t) = CA · φ^(−t), t = 1, 2, …, n

where CN is the resource allocation, CA is the total resource allocation after the horizontal scale-out, φ is the base of the exponential backoff algorithm, n is the number of backoff rounds required, and t indexes the successive moments.
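A sketch of the exponential backoff of the vertical quota. The closed form CA · φ^(−t) per round is a reconstruction from the variable definitions above, since the original formula image is not reproduced in the text:

```python
def backoff_quota(c_a, phi, rounds):
    """Exponentially decaying vertical quota after a horizontal scale-out.

    c_a:    total allocation right after the horizontal expansion (C_A).
    phi:    backoff base (> 1), so each round shrinks the quota by 1/phi.
    rounds: number of backoff rounds n.
    Returns the quota at each moment t = 1..n.
    """
    return [c_a * phi ** (-t) for t in range(1, rounds + 1)]
```

For example, with CA = 8 units and φ = 2, three rounds of backoff yield quotas of 4, 2 and 1 units, gradually reclaiming the surplus left after the peak passes.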

The technical solution of this application is elaborated as follows:

Referring to Fig. 2, the main flow of the patented method is:

Step 1, collection of historical workload information or data: Metrics Server, the resource-usage collector built into Kubernetes, is used to collect the data.

Step 2, data preprocessing: workload preprocessing, including deleting abnormal data, computing the average of each parameter with the same timestamp, and so on.

Step 3, supervised-learning conversion: a time window is used to turn the data into labelled supervised-learning sequences, improving prediction accuracy.

Step 4, load-data state determination: a load pressure line is generated by stochastic gradient descent and offsetting, and is used to judge whether the load data is in a stable state.
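The pressure-line idea of step 4 can be sketched as follows. The patent fits the line iteratively by stochastic gradient descent; for a compact deterministic sketch the closed-form least-squares line (to which SGD converges) is used, and the offset handling is illustrative:

```python
def pressure_line(loads, offset):
    """Fit a straight 'pressure line' y = w*t + b to a load series
    (closed-form least squares stands in here for the patent's SGD fit),
    shift it up by `offset`, and flag the time indices whose load rises
    above the shifted line as potentially unstable."""
    n = len(loads)
    ts = range(n)
    mt = sum(ts) / n
    my = sum(loads) / n
    w = sum((t - mt) * (y - my) for t, y in zip(ts, loads)) / \
        sum((t - mt) ** 2 for t in ts)
    b = my - w * mt
    unstable = [t for t, y in enumerate(loads) if y > w * t + b + offset]
    return (w, b), unstable
```

On a series that climbs steadily and then spikes, only the spike ends up above the offset line, which is exactly the "potential underestimation point" the judge is meant to catch.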

Step 5, strategy selector: according to the current load state, a suitable elastic scaling strategy is selected, including the vertical and horizontal scaling strategies. The main idea is shown in the strategy-selector pseudocode of Fig. 3.

Load prediction: the load is predicted with the machine-learning method LightGBM, and resources are configured according to the prediction results.

Step 6, reactive scaling: resources are allocated with the item-by-item differencing vertical-scaling allocation formula, improving the performance of the reactive elastic scaling strategy.

Step 7, optimizer: the concrete horizontal and vertical scaling strategies are executed on the cloud servers.

This application proposes a cloud-server elastic scaling strategy based on machine learning and state detection, which effectively solves the resource-management problem of microservices in cloud-server clusters and improves resource-allocation utilization while guaranteeing no loss of QoS.

To this end, the method proposed in this application performs state detection on the load of the cloud-server cluster and then, according to the detection result, chooses between two scheduling methods, a proactive strategy and a reactive strategy, and employs two scaling modes: horizontal scaling and vertical scaling.

Further, this application proposes an item-by-item differencing allocation formula for reactive resource allocation, making sensible use of the time labels of the time-series data and assigning different weights to historical load data from different moments, so that the reactive elastic scaling strategy carries a partial prediction effect. In addition, this application proposes a horizontal pre-provisioning method which, combined with vertical scaling that shrinks resources at a certain decay rate, achieves finer-grained resource management.

Compared with the prior art, this application adopts an elastic scaling strategy based on machine learning and microservice state detection. The algorithm exploits the intrinsic features of the microservice load time series for state detection, judges whether the load at the current moment is hard to predict, and then selects a suitable elastic scaling method according to the result. It integrates proactive and reactive elastic scaling strategies well: compared with traditional microservice elastic scaling strategies, the method solves the low prediction accuracy and weak applicability of most proactive strategies, and also overcomes the drawback that reactive strategies can only respond in real time and cannot predict future cluster resource demand. To a certain extent it improves resource utilization while guaranteeing no loss of QoS.

The experiments in this application use workload datasets from the Alibaba cloud data center. A load generator based on Locust (an open-source load-testing tool) and a cloud-server cluster scheduler based on machine-learning models are used, and several scheduling algorithms commonly used in the scheduling field are compared.

A purely predictive elastic scaling strategy is hard to make accurate when the load varies sharply: a large buffer wastes resources, while a small buffer leads to frequent violations of SLA (Service-Level Agreement) rules. The load pressure line generated by stochastic gradient descent can judge from the load features whether the current load is hard to predict and switch the elastic scaling strategy before an underestimation point arrives; in tests, the hit accuracy for load underestimation points reaches up to 71.03%.

Figs. 4 to 7 show a randomly chosen segment of the Alibaba dataset tested in a Kubernetes cluster. Under the same load, four different methods are executed, and the resources the system allocates to the microservice are compared with the load handled per second, in order to analyse how well each of the four methods fits the load. Fig. 7 shows the experimental results of the method of this application; the vertical scaling performance of this method is compared with Hyscale, Showar and XGBoost, with the results shown in Figs. 4 to 7.

Fig. 8 shows the response time of each method. Under the same load, the maximum and average response times of the four methods are compared: the average response time of this application's method is 18.5% lower than Hyscale, 21.42% lower than Showar, and 15.38% lower than XGBoost.

Fig. 9 shows the resource utilization of each method, comparing the CPU utilization of the four methods under the same load. The method of this application improves slightly over the reactive strategy, but because it adopts a conservative allocation strategy when the load is unstable, its resource utilization is somewhat lower than that of the proactive strategy.

Fig. 10 quantitatively compares the goodness of load fit in Figs. 4 to 7 using the DTW algorithm. Dynamic time warping is used in Fig. 10 for a quantified performance score (the lower the score, the better the method); scoring the similarity of the curves makes the performance differences visible. The points on the two curves do not correspond one-to-one, there is some offset, and the numbers of points usually differ, so the usual Euclidean distance does not work, whereas the dynamic time warping (DTW) algorithm is well suited and performs better. DTW is commonly used to measure the similarity of two utterances: because each phoneme is pronounced with a different length in each utterance, the two signals never align exactly, and DTW stretches or compresses them so that they align as closely as possible. The scores are: 41.6 for the method of this application, 84.0 for Hyscale, 70.3 for Showar, and 55.3 for XGBoost. These results show that this patent outperforms existing methods in the field of cloud-server elastic scaling.
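A minimal sketch of the DTW scoring used for Fig. 10; the quadratic dynamic-programming formulation with absolute-value point cost is the textbook version and is assumed here, since the patent does not spell out its cost function:

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-time-warping distance with
    absolute-value point cost. Lower means the two curves are more similar,
    which is how an allocation curve is scored against the load curve."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping moves
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Unlike Euclidean distance, this tolerates a repeated or shifted point: a series compared against a time-stretched copy of itself still scores zero.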

Embodiment 2

According to another embodiment of the present invention, an elastic scaling system for cloud-server resource management is provided; referring to Fig. 11, it includes:

a data collection module 100 for collecting historical workload data;

a load-state judgment module 200 for generating a load pressure line by stochastic gradient descent and offsetting and judging, based on the load pressure line, the load state of the load data, the load state including a stable state and an unstable state;

a strategy selection module 300 for selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategies including a vertical scaling strategy and a horizontal scaling strategy;

a resource scheduler 400 for executing the concrete horizontal or vertical scaling strategy on the cloud servers based on the allocated resources, realizing resource-allocation management.

Specifically, the design goal of this application is to provide an efficient and stable resource-management system that service providers can apply to configure resources sensibly. Fig. 12 shows the main components of this application, and Fig. 3 shows the key code of this patent.

As shown in Fig. 12, the system model of this patent comprises the following components: a load generator, a workload analyser, and a cluster scheduler.

Load generator: the load generator simulates with real load data and can generate a series of HTTP requests to verify the effectiveness of the system. Specifically, part of the Alibaba workload set is selected and, after data preprocessing and conversion, stress-tested with Locust.

Workload analyser: the workload analyser analyses load characteristics so that the optimal resource-scheduling strategy can be adopted. It first mines the features of inaccurately predicted points from the load-prediction plot, then judges the current load state by means of the load pressure line. The pressure line is computed by regression and offsetting, and the relevant parameters can be adjusted dynamically and adaptively.

Cluster scheduler: the cluster scheduler receives the resource-scheduling instructions of the workload analyser and acts on the cloud-server cluster to make the resource configuration more reasonable. The scheduling strategies are mainly vertical scaling and horizontal scaling.

The elastic scaling system for cloud-server resource management in this embodiment of the invention generates a load pressure line by stochastic gradient descent and offsetting, judges from the pressure line the load state of the load data, and selects a suitable elastic scaling method according to the result. Compared with traditional microservice elastic scaling strategies, this method solves the low prediction accuracy and weak applicability of proactive elastic scaling strategies, and improves resource utilization while guaranteeing no loss of QoS (Quality of Service).

Embodiment 3

Based on the above elastic scaling method for cloud-server resource management, this embodiment provides a computer-readable storage medium storing one or more programs, which can be executed by one or more processors to implement the steps of the elastic scaling method for cloud-server resource management of the above embodiments.

Embodiment 4

A terminal device, comprising: a processor, a memory and a communication bus; the memory stores a computer-readable program executable by the processor; the communication bus realizes connection and communication between the processor and the memory; and the processor, when executing the computer-readable program, implements the steps of the above elastic scaling method for cloud-server resource management.

Based on the above elastic scaling method for cloud-server resource management, this application provides a terminal device which, as shown in Fig. 13, includes at least one processor 20, a display screen 21 and a memory 22, and may further include a communication interface 23 and a bus 24. The processor 20, display screen 21, memory 22 and communication interface 23 can communicate with one another through the bus 24. The display screen 21 is configured to display the user-guidance interface preset in the initial setup mode. The communication interface 23 can transmit information. The processor 20 can call the logic instructions in the memory 22 to execute the methods of the above embodiments.

Moreover, the above logic instructions in the memory 22 may be implemented as software functional units and, when sold or used as an independent product, stored on a computer-readable storage medium.

As a computer-readable storage medium, the memory 22 can be configured to store software programs and computer-executable programs, such as the program instructions or modules corresponding to the methods of the embodiments of the present disclosure. The processor 20 runs the software programs, instructions or modules stored in the memory 22 to perform functional applications and data processing, i.e. to implement the methods of the above embodiments.

The memory 22 may include a program storage area and a data storage area, the program storage area storing the operating system and the application programs required for at least one function, and the data storage area storing data created according to the use of the terminal device, and so on. In addition, the memory 22 may include high-speed random-access memory and may also include non-volatile memory, for example a USB drive, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, an optical disk, or other media that can store program code; it may also be a transient storage medium.

In addition, the specific processes by which the multiple instructions in the above storage medium and terminal device are loaded and executed by the processor have been described in detail in the above method and will not be restated here one by one.

The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. An elastic scaling method for cloud server resource management, characterized by comprising the following steps:
collecting historical workload data;
generating a load pressure line by means of stochastic gradient descent and offsetting, and judging the load state of the load data based on the load pressure line, the load state including a stable state and an unstable state;
selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy;
based on the allocated resources, executing the specific horizontal scaling strategy or vertical scaling strategy through the cloud server to realize resource allocation management.

2. The elastic scaling method according to claim 1, characterized in that, before generating the load pressure line by means of stochastic gradient descent and offsetting and judging the load state of the load data based on the load pressure line, the load state including a stable state and an unstable state, the method further comprises:
performing data preprocessing on the load data, the data preprocessing including deleting abnormal data and calculating the average value of each parameter with the same timestamp;
performing a supervised learning conversion on the load data, the supervised learning conversion using a time window to transform the load data into a labeled supervised learning sequence.

3.
The elastic scaling method according to claim 1, characterized in that selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy, comprises:
allocating resources using the vertical scaling strategy with item-by-item differencing;
the resource allocation formula of the vertical scaling strategy with item-by-item differencing is specifically:
where C_N is the resource allocation amount, the five coefficients α, β, γ, δ, and ε are respectively the differencing coefficients of the five resource usage readings C_L1, C_L2, C_L3, C_L4, and C_L5 collected in the previous time window, and ρ denotes the margin of resource configuration in vertical scaling;
the above five differencing coefficients satisfy the relation:
α + 4λ = β + 3λ = γ + 2λ = δ + λ = ε;
where λ is a differencing step set by the cloud server administrator as required; setting λ to 0 yields the Kubernetes resource allocation formula, and λ is not negative.

4.
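The allocation formula of claim 3 appears only as an image in the source, so the sketch below is an assumed reconstruction: a weighted sum of the five window samples whose coefficients form the arithmetic progression stated in the claim (step λ, normalized here so they sum to 1), inflated by the margin ρ. The function and parameter names are illustrative and not from the patent.

```python
def vertical_allocation(samples, lam=0.05, rho=0.1):
    """Assumed form of claim 3's item-by-item differencing allocation C_N."""
    # samples: the five usage readings C_L1..C_L5 from the last time window
    assert len(samples) == 5 and lam >= 0
    # alpha+4*lam = beta+3*lam = gamma+2*lam = delta+lam = epsilon, i.e. an
    # arithmetic progression with step lam. epsilon is chosen here so the
    # coefficients sum to 1; lam = 0 then degenerates to a plain average,
    # matching the claim's remark that lam = 0 gives Kubernetes' formula.
    epsilon = 0.2 + 2 * lam
    coeffs = [epsilon - (4 - i) * lam for i in range(5)]  # alpha..epsilon
    # Weighted sum of the samples, inflated by the configuration margin rho
    return (1 + rho) * sum(c * s for c, s in zip(coeffs, samples))
```

With this normalization, a larger λ weights the most recent reading C_L5 more heavily, so the allocation tracks the latest trend rather than the window average.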
The elastic scaling method according to claim 1, characterized in that selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy, comprises:
adopting a vertical scaling strategy based on load data prediction;
the vertical scaling strategy based on load data prediction is specifically:
mathematically modeling and profiling the load data by collecting the historical workload data;
predicting the load data at the next moment based on the mathematical modeling and profiling;
analyzing the resource demand based on the load data at the next moment, and allocating or reclaiming the corresponding resources in a timely manner.

5. The elastic scaling method according to claim 4, characterized in that predicting the load data at the next moment based on the mathematical modeling and profiling is specifically:
using LightGBM, a distributed gradient boosting framework based on decision tree algorithms, to predict the load data at the next moment.

6.
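Claims 4-5 predict the next-step load from history, with claim 2 supplying the time-window conversion that turns the raw series into labeled training pairs. A minimal sketch of that conversion is below; the names are illustrative, and the patent then feeds such pairs to a LightGBM regressor (e.g. `lightgbm.LGBMRegressor().fit(X, y)`).

```python
def to_supervised(series, window=5):
    # Slide a fixed-size window over the load series: each run of `window`
    # past readings becomes a feature vector, and the reading immediately
    # after it becomes the label (claim 2's supervised-learning conversion).
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return X, y
```

For a 7-point series with `window=5`, this yields two labeled pairs, each pairing five consecutive readings with the reading that follows them.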
The elastic scaling method according to claim 4, characterized in that selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy, comprises:
using the horizontal scaling strategy in coordination with the vertical scaling for fine-grained load data management;
using the horizontal scaling strategy in coordination with the vertical scaling for fine-grained load data management is specifically:
adopting the preconfigured horizontal scaling strategy combined with the vertical scaling for fine-grained resource management: after preconfiguration in the horizontal direction, resources are allocated in the vertical direction at a preset decay rate, balancing the flexibility of resource adjustment against cost savings;
performing the resource allocation based on an exponential backoff method, which is as follows:
where C_N is the resource allocation amount, C_A is the total resource allocation after horizontal expansion, φ is the base of the exponential backoff algorithm, n is the number of rounds to back off, and t denotes different moments.

7. The elastic scaling method according to claim 4, characterized in that predicting the load data at the next moment based on the mathematical modeling and profiling comprises:
predicting the load data through the gradient boosting framework method, and selecting the vertical scaling strategy or the horizontal scaling strategy for resource allocation according to the prediction results.
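The exponential-backoff formula of claim 6 is likewise elided (rendered as an image) in the source; from the symbol definitions, one plausible reading is C_N = C_A · φ^(−n), i.e. the grant decays geometrically from the post-scale-out total. A sketch under that assumption, with illustrative names:

```python
def backoff_allocation(c_a, phi=2.0, n=0):
    # c_a: total allocation after horizontal expansion; phi: backoff base;
    # n: number of backoff rounds elapsed. Assumed form: geometric decay
    # of the vertical grant from the horizontally provisioned total.
    assert phi > 1 and n >= 0
    return c_a / (phi ** n)
```

With c_a = 16 and φ = 2, successive rounds grant 16, 8, 4, 2, ..., keeping headroom right after scale-out while steadily reclaiming cost, in line with the claim's flexibility-versus-savings trade-off.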
8. An elastic scaling system for cloud server resource management, characterized by comprising:
a data collection module for collecting historical workload data;
a load state judgment module for generating a load pressure line by means of stochastic gradient descent and offsetting, and judging the load state of the load data based on the load pressure line, the load state including a stable state and an unstable state;
a strategy selection module for selecting an elastic scaling strategy for resource allocation according to the current load state, the elastic scaling strategy including a vertical scaling strategy and a horizontal scaling strategy;
a resource scheduler for executing the specific horizontal scaling strategy or vertical scaling strategy on the cloud server based on the allocated resources, to realize resource allocation management.

9. A computer-readable medium, characterized in that the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps in the elastic scaling method according to any one of claims 1-7.

10.
A terminal device, characterized by comprising: a processor, a memory, and a communication bus; the memory stores a computer-readable program executable by the processor;
the communication bus realizes connection and communication between the processor and the memory;
when the processor executes the computer-readable program, the steps in the elastic scaling method according to any one of claims 1-7 are implemented.
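Claim 1's load pressure line is generated "by stochastic gradient descent and offset"; a minimal interpretation is an SGD-fitted linear trend shifted upward by a fixed offset, with loads at or under the line classed as the stable state. The linear model and all names below are assumptions for illustration, not the patent's implementation.

```python
def fit_pressure_line(ts, loads, lr=0.01, epochs=200, offset=0.0):
    # Fit load ~ a*t + b by stochastic gradient descent on squared error,
    # then shift the fitted line up by `offset` to form the pressure line.
    a, b = 0.0, 0.0
    for _ in range(epochs):
        for t, y in zip(ts, loads):
            err = (a * t + b) - y
            a -= lr * err * t
            b -= lr * err
    return lambda t: a * t + (b + offset)

def is_stable(load, pressure):
    # At or below the pressure line -> stable state; above -> unstable,
    # which then drives the choice between scaling strategies.
    return load <= pressure
```

In the claimed method, the stable/unstable verdict from such a line is what selects between the vertical and horizontal scaling strategies.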
CN202310920219.2A 2023-07-25 2023-07-25 Elastic telescoping method and system for cloud server resource management Pending CN117076106A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310920219.2A CN117076106A (en) 2023-07-25 2023-07-25 Elastic telescoping method and system for cloud server resource management

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310920219.2A CN117076106A (en) 2023-07-25 2023-07-25 Elastic telescoping method and system for cloud server resource management

Publications (1)

Publication Number Publication Date
CN117076106A true CN117076106A (en) 2023-11-17

Family

ID=88707046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310920219.2A Pending CN117076106A (en) 2023-07-25 2023-07-25 Elastic telescoping method and system for cloud server resource management

Country Status (1)

Country Link
CN (1) CN117076106A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118445063A (en) * 2024-04-17 2024-08-06 广州卫讯科技有限公司 Server management system and method


Similar Documents

Publication Publication Date Title
CN110096349B (en) Job scheduling method based on cluster node load state prediction
CN108829494B (en) Container cloud platform intelligent resource optimization method based on load prediction
WO2021179462A1 (en) Improved quantum ant colony algorithm-based spark platform task scheduling method
CN109324875B (en) Data center server power consumption management and optimization method based on reinforcement learning
CN111414070A (en) Case power consumption management method and system, electronic device and storage medium
Ramamoorthi AI-driven cloud resource optimization framework for real-time allocation
CN118502918A (en) Workflow intelligent management system based on artificial intelligence
CN119232741B (en) Distributed network resource optimal scheduling method and system under load balancing strategy
CN105022823B (en) A kind of cloud service performance early warning event generation method based on data mining
CN119383675B (en) Dynamic bandwidth allocation method, system, electronic equipment and medium based on large model
Ma et al. Virtual machine migration techniques for optimizing energy consumption in cloud data centers
CN107566535A (en) Adaptive load balancing strategy based on user concurrent access timing planning in a kind of web map service
CN115913967A (en) A Microservice Elastic Scaling Method Based on Resource Demand Prediction in Cloud Environment
CN118233522A (en) Service request optimization system and method based on digital twin
Sayadnavard et al. Toward an enhanced dynamic VM consolidation approach for cloud datacenters using continuous time Markov chain
Shen et al. A Dynamic Resource Allocation Strategy for Cloud-Native Applications Leveraging Markov Properties
CN117076106A (en) Elastic telescoping method and system for cloud server resource management
CN118158092A (en) Computing power network scheduling method and device and electronic equipment
CN118233469A (en) Computing task scheduling method and system based on delay perception and load balancing
CN112000460A (en) A method and related equipment for service expansion and contraction based on improved Bayesian algorithm
CN110535894B (en) Dynamic allocation method and system for container resources based on load feedback
Chen et al. A combined trend virtual machine consolidation strategy for cloud data centers
CN118612065A (en) Edge gateway resource optimization management method based on artificial intelligence
CN116980316A (en) Micro-service elastic flexible scheduling method and system for time delay and resource utilization rate
CN112187894B (en) Container dynamic scheduling method based on load correlation prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination