
CN117687760A - LVC simulation-oriented intelligent scheduling method for container cloud resources - Google Patents

Info

Publication number
CN117687760A
CN117687760A
Authority
CN
China
Prior art keywords
pod
task
model
priority
simulation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311788359.5A
Other languages
Chinese (zh)
Inventor
胡建刚
李俊杰
叶梦雅
毛余琨
万张博
陈励
卫榆松
朱一飞
彭玉怀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhiyuan Research Institute Co ltd
Original Assignee
Hangzhou Zhiyuan Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhiyuan Research Institute Co ltd filed Critical Hangzhou Zhiyuan Research Institute Co ltd
Priority to CN202311788359.5A priority Critical patent/CN117687760A/en
Publication of CN117687760A publication Critical patent/CN117687760A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5038 Allocation of resources to service a request, considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G06F 9/505 Allocation of resources to service a request, considering the load
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/02 Neural networks
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06F 2209/484 Indexing scheme relating to G06F9/48: Precedence
    • G06F 2209/5021 Indexing scheme relating to G06F9/50: Priority

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an intelligent scheduling method of container cloud resources for LVC simulation, which comprises the following steps: clustering tasks with frequent interaction through a deep clustering algorithm and combining them into the same Pod after containerization; predicting the dynamically changing load with a load prediction model based on a combined model; determining the scheduling order of the Pods with a preemptive scheduling scheme based on dynamic priority and, in combination with preemption rules, ensuring that high-priority Pods are deployed and run first while low-priority Pods are still scheduled normally; and finally completing the elastic scaling of cloud simulation resources in combination with the HPA strategy.

Description

LVC simulation-oriented intelligent scheduling method for container cloud resources
Technical Field
The invention belongs to the field of automation technology, and relates to an intelligent scheduling method of container cloud resources for LVC simulation.
Background
With the accelerated evolution of computer and simulation technology, low-cost, high-efficiency LVC (Live, Virtual, Constructive) distributed cloud simulation has attracted intense attention from modeling and simulation experts. An LVC cloud simulation system contains complex tasks with different running requirements, such as hard real-time, super real-time, and non-real-time tasks. In a virtual-real fusion cloud simulation scene, the demands on the granularity, real-time performance, resource utilization, and elastic scaling of resource scheduling in cloud simulation resource management have risen, from fine-grained real-time perception and rendering of environment details such as light, shadow, and sound to the guarantee of intensive and frequent large-scale communication and interoperation between virtual and real simulation agents, and conventional cloud simulation resource scheduling strategies can no longer meet these requirements. A fine-grained, containerized intelligent scheduling method of container cloud resources for LVC simulation is therefore needed.
Currently, most containerized cloud simulation resource scheduling schemes apply simple policy optimizations to the native Kubernetes scheduling scheme: a threshold-based responsive method or a prediction-based scheduling method is combined with the HPA (Horizontal Pod Autoscaler) strategy, and the system load condition is used to complete the scheduling of cloud simulation resources and the horizontal elastic scaling of the number of Pods (the minimum scheduling unit of Kubernetes). Other schemes combine a statistical model or a neural network prediction method with a scheduling method and the VPA (Vertical Pod Autoscaler) strategy to dynamically adjust Pod resource specifications in the vertical direction, finally completing the elastic scheduling of cloud resources.
In an LVC-oriented cloud simulation scene, virtual-real interaction generates a large number of dynamically changing simulation tasks of differing importance and real-time requirements. The scheduling service of existing threshold-based responsive strategies often lags behind the load change and can hardly meet the scheduling demand; scheduling strategies based on resource demand and load prediction do not fully exploit the data characteristics, processing only one-sided linear or nonlinear data, so the prediction model is not accurate enough. Meanwhile, in the preferred (scoring) stage of the scheduling strategy, most schemes use only two key resource indexes, which is one-sided, and the existing HPA-based elastic scaling methods cannot meet the differentiated demands of simulation training on resource utilization.
Disclosure of Invention
In order to solve the problems that an LVC cloud simulation system has high-frequency virtual-real interaction demands and large communication costs, that resource scheduling cannot meet real-time requirements, and that existing cloud simulation resource scheduling schemes can hardly satisfy simulation demands such as many entities, high concurrency, high network I/O, and high throughput, the invention adopts the following technical scheme: an intelligent scheduling method of container cloud resources for LVC simulation, comprising the following steps:
clustering tasks with frequent interaction through a deep clustering algorithm and combining them into the same Pod after containerization;
predicting the dynamically changing load with a load prediction model based on a combined model;
determining the scheduling order of the Pods with a preemptive scheduling scheme based on dynamic priority and, in combination with preemption rules, ensuring that high-priority Pods are deployed and run first while low-priority Pods are still scheduled normally;
and finally completing the elastic scaling of the cloud simulation resources in combination with the HPA strategy.
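Read as a control loop, the four steps compose as in the following minimal Python sketch. All function names and the toy logic here are hypothetical stand-ins for the modules detailed below (SpectralNet clustering, ARIMA+GRU prediction, dynamic-priority preemption, predictive HPA):

```python
from statistics import mean

def cluster_tasks_into_pods(interaction):
    # S1 stand-in: greedily pair tasks whose interaction frequency is high
    # (the method itself uses SpectralNet-based deep clustering).
    pods, assigned = [], set()
    for (a, b), freq in sorted(interaction.items(), key=lambda kv: -kv[1]):
        if freq > 0.5 and a not in assigned and b not in assigned:
            pods.append({a, b}); assigned |= {a, b}
    rest = {t for pair in interaction for t in pair} - assigned
    return pods + [{t} for t in rest]

def predict_load(history):
    # S2 stand-in: the method superposes ARIMA (linear part) and an
    # attention-GRU (nonlinear residual part); here, a moving average.
    return mean(history[-3:])

def schedule(pods, load, capacity):
    # S3 stand-in: place Pods in descending priority; the real priority
    # factor P is far richer (see steps 5-1 to 5-10 below).
    placed = []
    for pod in sorted(pods, key=lambda p: load[frozenset(p)], reverse=True):
        if load[frozenset(pod)] <= capacity:
            capacity -= load[frozenset(pod)]; placed.append(pod)
    return placed              # S4 (HPA scaling) would then resize replicas

interaction = {("t1", "t2"): 0.9, ("t3", "t4"): 0.2}
pods = cluster_tasks_into_pods(interaction)
load = {frozenset(p): predict_load([1.0, 2.0, 3.0]) for p in pods}
print(schedule(pods, load, capacity=5.0))
```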
Further: the process of clustering tasks with frequent interaction and combining them into the same Pod after containerization through the deep clustering algorithm is as follows:
establishing a cloud simulation task dependency model;
learning the dependency of the cloud simulation tasks based on the twin neural network;
learning the map $F_\theta$ based on SpectralNet and clustering the tasks, so that frequently interacting tasks form the same Pod after containerization.
Further: the process of establishing the cloud simulation task dependency model is as follows:
Step 1-1: define the cloud simulation task relationship matrix.
Assume that in a certain simulation process of LVC cloud simulation there exist $n$ simulation tasks $T_i\ (i = 1, 2, 3, \dots, n)$. The relationship between the simulation tasks is expressed by an $n \times n$ relationship matrix, any row or column of which represents a task; the matrix is denoted TCM:

$$TCM = [C_{ij}]_{n \times n}$$

Step 1-2: define the cloud simulation task dependency strength $C_{ij}$.
The main diagonal elements of the TCM matrix represent the tasks themselves and have no practical significance, so $C_{ii} = 0$. $C_{ij}$ is the dependency strength of tasks $T_i$ and $T_j$, defined as:

$$C_{ij} = r_{ij} + I_{ij} + O_{ij} \quad (i, j = 1, 2, \dots, n)$$

where $r_{ij}$ is the information interaction frequency of tasks $T_i$ and $T_j$, and $I_{ij}$, $O_{ij}$ are the input and output similarity of tasks $T_i$ and $T_j$, respectively.
Step 1-3: compute the cloud simulation task information interaction frequency $r_{ij}$.
A task obtains information of different amounts, including 0, from the other tasks. Since the information a task passes to itself is zero, the $n$ information receptions of task $T_i$, i.e. its information demand data, are recorded as:

$$I'_{i1}, I'_{i2}, \dots, I'_{ii}, \dots, I'_{in} \quad (i = 1, 2, \dots, n), \quad I'_{ii} = 0$$

where $I'$ denotes an information amount and $I'_{in}$ denotes the amount of information task $T_i$ obtains from task $T_n$.
Each task may send messages to several tasks, and the amount of information it sends to itself is zero; the $n$ data corresponding to task $T_k$ are recorded as:

$$I'_{1k}, I'_{2k}, \dots, I'_{kk}, \dots, I'_{nk} \quad (k = 1, 2, \dots, n), \quad I'_{kk} = 0$$

where $I'_{nk}$ denotes the amount of information task $T_k$ passes to task $T_n$.
The average received information quantity $\bar{I}'_k$ of a task is then calculated, where $N_k$ denotes the number of messages task $T_k$ sends to the outside and $I_{mk}$ denotes the average information content of the output messages. Applying the Euclidean distance method, the information interaction frequency $r_{ij}$ of subtasks $T_i$ and $T_j$ is determined.
Step 1-4: calculate the input and output similarity of the cloud simulation tasks.
The similarity between rows is calculated with the Jaccard similarity coefficient; $I_{ij}$ denotes the input support degree of the task in row $j$ of the TCM matrix for the task in row $i$, i.e. the input similarity of the two tasks $T_i$ and $T_j$. The output similarity $O_{ij}$ of the two tasks $T_i$ and $T_j$ is obtained in the same way.
Step 1-5: calculate the dependency relationship of the cloud simulation tasks.
The dependency relationship of the tasks is:

$$C_{ij} = r_{ij} + I_{ij} + O_{ij}, \quad i, j \in [1, n]$$

The task dependency relationship matrix is the $n$-order matrix TCM formed by the $C_{ij}$.
Further: the process of learning the dependency degree of cloud simulation tasks based on the twin neural network is as follows:
Step 2-1: data input.
The historical task information and feature data, namely the task information interaction frequency, the input/output similarity, and other data sets, are input; the task data sets are fed as training data in time-series order.
Step 2-2: construct the twin neural network model.
The twin neural network structure is built from two three-layer simple RNNs (recurrent neural networks) with identical structure and shared weights. The input of each network is a task sample feature data sequence $x_{1:T} = (x_1, x_2, \dots, x_t, \dots, x_T)$, where $t$ denotes the time step. With input $x_t$ at time $t$, the hidden layer state $h_t$ is updated as:

$$h_t = f(W_{ih} x_t + b_{ih} + W_{hh} h_{t-1} + b_{hh})$$

where $W_*$ denotes a learnable weight matrix and $b_*$ a learnable bias vector, $* \in \{ih, hh\}$; $h_{t-1}$ denotes the hidden layer state at time $t-1$ (for $t > 1$) or the initial hidden layer state (for $t = 1$), with typically $h_0 = 0$; and $f(\cdot)$ denotes the nonlinear activation function ReLU.
Step 2-3: improve the loss function.
A loss function for the twin network is constructed in which the dependency degree of tasks replaces the Euclidean distance between data points: task pairs with larger dependency form positive pairs, task pairs with smaller dependency form negative pairs, and the twin network is trained with the goal of learning an adaptive nearest-neighbor dependency metric. The twin neural network maps each data point $x_i$ into an embedding in some space, and the loss function $L$ takes the contrastive form:

$$L = \frac{1}{2}\left[\, y\, d^2 + (1 - y) \max(0,\ \mathrm{margin} - d)^2 \,\right]$$

where $d = C_{ij}$ denotes the dependency degree of the two task samples, $y$ is the label indicating whether the two samples match ($y = 1$ means the two samples are similar or matched, $y = 0$ means they do not match), and margin is a set threshold.
Further: the specific steps of the SpectralNet learning mapping are as follows:
Step 3-1: set the parameters of the SpectralNet network. Set the SpectralNet input as the task samples $x_1, \dots, x_n \in \mathbb{R}^d$, the number of clusters $k$, and the neural network batch size $m$. The SpectralNet loss function $L_{SpectralNet}$ is set as:

$$L_{SpectralNet}(\theta) = \frac{1}{m^2} \sum_{i,j=1}^{m} W_{i,j} \left\| y_i - y_j \right\|^2 = \frac{2}{m^2}\, \mathrm{tr}\!\left( Y^T (D - W)\, Y \right)$$

where $D$ is the $m \times m$ diagonal matrix of the task samples with $D_{i,i} = \sum_j W_{i,j}$, $W$ is the task dependency matrix TCM, $Y = \{y_1, y_2, \dots, y_n\} \in \mathbb{R}^{k \times n}$, $y_i = F_\theta(x_i)$ denotes the feature vector computed for sample $x_i$ by the deep neural network model $F$, and $\theta$ are the parameters of the deep neural network model.
Step 3-2: construct the neural network, implementing the map $F_\theta$ as a four-layer BP neural network whose last layer performs an orthogonalization operation. The loss can be minimized trivially by mapping all points to the same output vector; to prevent the objective function from reaching this trivial solution, $y$ is required to satisfy the orthogonality condition

$$E[y y^T] = I_{k \times k}$$

In each iteration a small batch of $m$ samples is randomly sampled and organized in an $m \times d$ matrix $X$, so the orthogonality requirement within the small batch is:

$$\frac{1}{m}\, Y^T Y = I_{k \times k}$$

Step 3-3: sample a random small batch $X$ of size $m$, propagate $X$ forward, and compute the input $\tilde{Y}$ of the orthogonalization layer.
Step 3-4: compute the Cholesky decomposition $\tilde{L} \tilde{L}^T = \tilde{Y}^T \tilde{Y}$ and set the weight of the orthogonalization layer to $\sqrt{m}\, (\tilde{L}^{-1})^T$; the final layer of the neural network thereby enforces the orthogonality constraint, the weight of the last layer being set according to the QR decomposition.
Step 3-5: perform the gradient computation step: sample a random small batch $x_1, \dots, x_m$ and compute the $m \times m$ similarity matrix $W$ with the twin network, $W$ being computed as the task dependency matrix TCM; propagate $x_1, \dots, x_m$ forward to obtain $y_1, \dots, y_m$.
Step 3-6: compute the loss $L_{SpectralNet}(\theta)$ and use $L_{SpectralNet}(\theta)$ to adjust all weights of $F_\theta$ except those of the output layer.
Step 3-7: judge whether $L_{SpectralNet}(\theta)$ has converged; if not, return to step 3-3.
Step 3-8: propagate $x_1, \dots, x_m$ forward and obtain the embedding layer feature output $y_1, \dots, y_m$ of $F_\theta$; run the k-means clustering algorithm on $y_1, \dots, y_m$ to obtain the classification result $c_1, \dots, c_n$, $c_i \in \{1, \dots, k\}$.
Further: the process of predicting the dynamically changing load with the load prediction model based on the combined model is as follows:
Step 4-1: denote the linear component of the assumed time series by $L_t$ and the original data by $y_t$; the time series data can then be expressed as:

$$y_t = L_t + E_t$$

where $L_t$ is regarded as the linear relationship in the original data sequence and $E_t$ as the nonlinear relationship in the original data sequence.
Step 4-2: establish an ARIMA(p, d, q) autoregressive integrated moving average model.
The ARIMA(p, d, q) model is the combination of a d-th order difference applied to a non-stationary sequence and an autoregressive moving average model ARMA(p, q), whose structure is:

$$x_t = \phi_0 + \phi_1 x_{t-1} + \dots + \phi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \dots - \theta_q \varepsilon_{t-q}$$

where $x_t$ denotes the time series of the observed object, $\varepsilon_t$ and $\varepsilon_{t-i}$ are the interference values, $x_{t-i}$ are the sequence values, $\phi$ denotes the autoregressive coefficients and $\theta$ the moving average coefficients, $p$ is the autoregressive model order, and $q$ is the moving average model order; when $\phi_0 = 0$ the model is a centralized ARMA(p, q) model.
Step 4-3: perform stationarity processing on the input data of the ARIMA model.
Step 4-4: identify the ARIMA model and determine its order.
p and q are determined from the ACF and PACF of the time series. The basic rules of model identification are: if the partial autocorrelation function cuts off and the autocorrelation function decays as a trend, the model is an AR autoregressive model; if the partial autocorrelation function decays as a trend and the autocorrelation function cuts off, the model is an MA moving average model; when both the partial autocorrelation function and the autocorrelation function decay, the model is an ARMA autoregressive moving average model. After the type of the stochastic model is determined, the model order is observed by drawing the autocorrelation diagram and the partial autocorrelation diagram, and the AIC criterion is used to determine the optimal model order.
Step 4-5: check the ARIMA model.
Fitting detection and residual analysis are used to check the availability of the model. If the residual passes the white noise test, the model is valid; if it fails the test, return to step 4-4 and continue selecting among the models meeting the precision requirement until one passes the model checking standard. The checked model is used as the fitting model of the change rule of the observed object: the acquired time series is predicted, the development trend of the observed object is analyzed from the predicted values, and the prediction error is calculated; if the error is small, the prediction is complete. The checked optimal model is obtained through the above steps, and the integrated time series can be input into ARIMA for fitting prediction.
Step 4-6: perform linear prediction with the ARIMA model.
The ARIMA model is used to model the linear relationship $L_t$ in the original data sequence, obtaining the predicted value sequence $L_t'$.
Step 4-7: after the predicted value sequence $L_t'$ is obtained, the prediction results are compared with the actual observed results and extracted into the ARIMA residual value sequence, which is taken as the input of the nonlinear part, the GRU model.
Step 4-8: establish and train a GRU load prediction model based on an attention mechanism.
Step 4-9: predict with the GRU model based on the attention mechanism.
First, given the time series of data residual values $\{x_t\}$, the total length $n$, and the window size $w$, the time series data $\{x_t\}$ are divided into blocks of the window size. Then suitable GRU model parameters are set for training, and an output vector is obtained after each training round. The output vectors are put into the attention function for similarity calculation and normalized to obtain the weight parameters of the corresponding vectors, and the obtained weight parameters are then summed in a weighted manner with the set of output vectors trained by the GRU. Finally the prediction result $\{E_t'\}$ is obtained; the final error value prediction sequence is:

$$E_t' = f(e_{t-1}, e_{t-2}, \dots, e_{t-n}) + \xi_t$$

where $\xi_t$ denotes the random error value of the GRU neural network and $E_t'$ is the prediction expression of the GRU network for the nonlinear relationship $E_t$.
Step 4-10: superpose the prediction results to obtain the final prediction result. The ARIMA model and GRU neural network prediction results are superposed to obtain the prediction result of the resource usage, whose sequence expression is:

$$y_t' = L_t' + E_t'$$
Further: the process of determining the scheduling order of the Pods with the preemptive scheduling scheme based on dynamic priority and, in combination with preemption rules, ensuring that high-priority Pods are deployed and run first while low-priority Pods are still scheduled normally, is as follows:
Step 5-1: establish the dynamic priority gradient model;
Step 5-2: calculate the priority factor of each Pod;
Step 5-3: set the dynamic change rule of the priority factor P;
Step 5-4: set the preemptive scheduling rules;
Step 5-5: select the preemption object;
Step 5-6: optimize the scoring function of the scheduling preferred stage;
Step 5-7: calculate the availability of resource k in Node i;
Step 5-8: calculate the priority factor of the availability of resource k in Node i within the cluster;
Step 5-9: the new evaluation function of a Node in the preferred stage represents the sum of the priorities of the Node's available resources; the idle condition of the Node's resources is obtained from the Pod and the actual state of the Node, and a higher evaluation score indicates a higher degree of fit between the idle resources of the Node and the Pod;
Step 5-10: select the Pod object to be scheduled according to the gradient priority of the Pods, select the deployment object according to the priority of the Nodes, and perform preemptive scheduling in combination with step 5-4.
Further: the process of selecting the preemption object is as follows:
The Kubernetes scheduling strategy is divided into a pre-selection stage and a preferred stage: the pre-selection stage filters out the nodes that do not meet the conditions according to a set of screening rules, the preferred stage scores the nodes that meet the conditions, and the Node with the highest score is then selected as the running Node of the Pod. The pre-selection stage selects the preempted Pod object according to the difference ΔP of the priority factors P. If no Node meeting the resource requirements of the Pod exists in the current state of the cluster, the Pod keeps waiting until some Node meets the conditions, which severely violates the SLA for strongly real-time/real-time tasks. ΔP is calculated as:

$$\Delta P = P_i - P_j, \quad (i, j = 1, 2, \dots, n)$$

The Pod with the largest ΔP value is selected as the preempted object; the selected Pod object is evicted and its execution site information is stored in etcd to provide execution information for the subsequent re-creation of the Pod.
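As an illustration, a minimal sketch of this victim selection, assuming a reduced Pod record with only a priority factor and a gradient level (the same-gradient exclusion anticipates the preemption rules set below):

```python
from dataclasses import dataclass

@dataclass
class Pod:                    # hypothetical minimal Pod record
    name: str
    priority: float           # the priority factor P
    gradient: int             # priority gradient level, e.g. L1/L2/L3 -> 1/2/3

def select_preemption_victim(pending: Pod, running: list[Pod]) -> Pod | None:
    """Select the preempted object by the largest dP = P_pending - P_victim;
    Pods in the same gradient never preempt each other (anti-jitter rule)."""
    candidates = [v for v in running
                  if v.gradient != pending.gradient and pending.priority > v.priority]
    return max(candidates, key=lambda v: pending.priority - v.priority, default=None)

# the evicted Pod's execution-site information would then be persisted to etcd
victim = select_preemption_victim(Pod("p-high", 9.0, 1),
                                  [Pod("p-low", 2.0, 3), Pod("p-mid", 5.0, 2)])
print(victim.name if victim else "wait")   # -> p-low (largest dP)
```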
Further: the process of calculating the priority factor of a Pod is as follows:
The Pods awaiting scheduling at the same moment form a queue. The priority factor P is set according to the demands of the tasks in each Pod for the different resources, the queue of Pods to be scheduled is sorted in descending order of P, and after sorting the Scheduler schedules the Pods in that order. The priority factor P of a Pod to be scheduled is calculated from the following quantities: M, the Pod's demand for memory resources; C, the CPU demand; G, the GPU demand; MIO, the disk I/O demand; NIO, the network I/O demand; NCC, the communication resource demand; TW, the task waiting time; OC, the number of times the Pod has been preempted and evicted by higher-priority Pods; and DL, the remaining time before the task deadline; the settings of these demands come from the load prediction module. L denotes the priority gradient coefficient; $\lambda_a, \lambda_b, \lambda_c, \lambda_d, \lambda_e, \lambda_f$ are the weight coefficients of the resource demand indexes, which can be set according to the task characteristics and demands and cannot be changed after the initial setting; $\varepsilon, \alpha, \beta$ are variable weight coefficients.
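The exact weighting formula for P is given in the patent figure; the following sketch is only one plausible reading of it, combining the listed indexes as a weighted sum (the composition itself, not just the parameter values, is an assumption here):

```python
def priority_factor(pod, L, lam, eps, alpha, beta):
    """Assumed composition of the priority factor P: a gradient coefficient L
    scaling fixed-weight resource demands, plus variable-weight dynamic terms
    (waiting time TW, eviction count OC, and a shrinking deadline DL raise P)."""
    la, lb, lc, ld, le, lf = lam        # fixed weights for M, C, G, MIO, NIO, NCC
    resource_term = (la * pod["M"] + lb * pod["C"] + lc * pod["G"]
                     + ld * pod["MIO"] + le * pod["NIO"] + lf * pod["NCC"])
    dynamic_term = alpha * pod["TW"] + beta * pod["OC"] + eps / max(pod["DL"], 1e-6)
    return L * resource_term + dynamic_term

pod = {"M": 0.4, "C": 0.6, "G": 0.0, "MIO": 0.2, "NIO": 0.3, "NCC": 0.1,
       "TW": 12.0, "OC": 1, "DL": 30.0}
print(round(priority_factor(pod, L=2.0, lam=(1, 1, 1, 1, 1, 1),
                            eps=1.0, alpha=0.05, beta=0.5), 3))
```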
Further: the process of setting the dynamic change rule of the priority factor P is as follows:
For Pods of the L1 and L2 gradients, the α and β weights increase linearly as TW and OC increase, and the β weight increases linearly as DL decreases. Because Pods are preempted according to the difference ΔP of P, for Pods of the L3 level the ε, α, and β weights increase exponentially, so as to guarantee the smooth execution of low-priority L3 tasks; their priority factor can break through the gradient level limitation up to the L2 level, while the priority factor of an L2-level Pod cannot break through its gradient level limitation.
The preemptive scheduling rules are set as follows:
When the resources of a working Node are sufficient, no Pod preemption operation is needed and the Pod deployment operation is scheduled directly. When the Node resources are insufficient, preemption is performed according to the maximum ΔP, and low-priority Pods are continually evicted and released until the resource deployment requirements of the high-priority Pod are met. Pods within the same gradient do not preempt each other, to avoid the system jitter caused by frequent preemption; Pods with the same gradient and the same priority factor are scheduled on a first-come-first-served basis, as sketched below.
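A minimal sketch of this rule set as a single decision step, assuming a smaller gradient number means a higher priority level and that the Pod record carries an aggregate resource demand (all fields hypothetical):

```python
from dataclasses import dataclass

@dataclass
class SchedPod:               # hypothetical record; smaller gradient = higher level
    name: str
    priority: float           # priority factor P
    gradient: int             # L1/L2/L3 -> 1/2/3
    demand: float             # aggregate resource demand

def scheduling_decision(pending: SchedPod, node_free: float, running: list[SchedPod]):
    """Deploy directly when the Node has room; otherwise evict lower-gradient
    Pods in order of largest dP until the pending Pod fits. Same-gradient Pods
    are never evicted (anti-jitter), so the Pod waits if nothing qualifies."""
    if pending.demand <= node_free:
        return "deploy", []
    evicted = []
    for v in sorted((p for p in running if p.gradient > pending.gradient),
                    key=lambda p: p.priority):          # smallest P = largest dP
        evicted.append(v); node_free += v.demand
        if pending.demand <= node_free:
            return "preempt-then-deploy", evicted
    return "wait", []

print(scheduling_decision(SchedPod("hi", 9.0, 1, 2.0), 0.5,
                          [SchedPod("lo", 2.0, 3, 1.0), SchedPod("lo2", 2.5, 3, 1.0)]))
```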
Further: the process of calculating the availability of resource k in Node i is as follows:
Let there be N Node nodes in the Kubernetes cluster, denoted $N = \{1, 2, \dots, n\}$, with k resources on each Node, denoted $R_T = \{1, 2, \dots, k\}$; each Pod may request m resources, denoted $R_R = \{1, 2, \dots, m\}$, and the set of Pods to be scheduled is recorded likewise. The priority weight value of the current Node is obtained from the actual resource utilization of the six resource evaluation indexes in the cluster: CPU, GPU, memory, disk I/O, network I/O, and communication resources.
The availability of resource k in Node i is then calculated as the ratio of the amount of resource k still available in Node i, $A(i, k)$, to the total amount $T(k)$ of resource k in Node i.
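A minimal sketch of this availability computation and of a weighted multi-resource Node score of the kind step 5-9 describes; the weighted-sum form and all numbers here are assumptions:

```python
def availability(node, k):
    """Fraction of resource k still free on the node: available / total."""
    return (node["total"][k] - node["used"][k]) / node["total"][k]

def node_score(node, weights):
    """Preferred-stage score as a weighted sum of per-resource availability
    over the six indexes; a higher score means the Node's idle resources
    fit the Pod better."""
    return sum(w * availability(node, k) for k, w in weights.items())

node = {"total": {"cpu": 32, "gpu": 4, "mem": 128, "mio": 100, "nio": 100, "ncc": 10},
        "used":  {"cpu": 20, "gpu": 1, "mem": 64,  "mio": 30,  "nio": 50,  "ncc": 2}}
weights = {"cpu": 0.3, "gpu": 0.2, "mem": 0.2, "mio": 0.1, "nio": 0.1, "ncc": 0.1}
print(round(node_score(node, weights), 3))
```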
According to the intelligent scheduling method of container cloud resources for LVC simulation, the relationships of the cloud simulation tasks are analyzed, tasks with frequent communication and interaction are containerized into Pods by modularized clustering, the resource load change of each Pod is predicted by a combined prediction model based on ARIMA and an improved GRU (gated recurrent unit), and the prediction result together with the current load state is used as the input of the preemptive scheduling algorithm, realizing the elastic scheduling of cloud simulation resources in the horizontal direction according to the priority set for each Pod.
The scheme reduces the delay and communication overhead caused by high-frequency virtual-real interaction and communication in cloud simulation, overcomes the problems of existing schemes that the priority execution of important tasks cannot be guaranteed and that resource scaling methods suffer low resource utilization and high delay, improves the scheduling efficiency of cloud simulation resources and the task response time under QoS constraints, and achieves fine-grained, efficient scheduling of the complex heterogeneous resources with different real-time requirements in an LVC cloud simulation system.
This efficient container cloud simulation resource scheduling technique can reduce communication overhead, accurately predict load changes, meet different real-time simulation demands, and guarantee the priority execution of important tasks.
Aiming at the problems that existing scheduling schemes ignore the complex correlations among cloud simulation tasks and directly execute the scheduling flow, producing a large number of cross-node communications and high network communication overhead, a clustering combination algorithm based on an improved SpectralNet deep neural network is provided.
Existing load prediction schemes do not fully utilize the linear and nonlinear characteristics of simulation data, so the accuracy of the load change prediction model is insufficient, which affects the effects of resource scheduling and elastic scaling and leads to low resource utilization.
In order to guarantee the priority execution of important and highly real-time tasks in an LVC-oriented cloud simulation system, a preemptive scheduling strategy based on dynamic priority gradients is provided, setting corresponding priority gradients for tasks of different real-time requirements; meanwhile, since the preferred stage of the Kubernetes scheduling strategy scores and screens only the CPU and memory, neglecting indexes important to node selection in LVC cloud simulation such as GPU, network I/O, disk I/O, and bandwidth utilization, it is improved with a scoring function based on multiple resource indexes.
Aiming at the problems that the native threshold-responsive Kubernetes scaling strategy lags behind system load changes and cannot meet the highly dynamic, real-time requirements of cloud simulation resource scheduling, an HPA strategy based on load change prediction is designed: different scaling cooldown times are set for tasks of different real-time requirements, and the resource usage predicted by the combined model is used for advance planning and scheduling, realizing seamless, low-delay elastic resource scaling and reducing system cost.
The invention provides a distributed cloud simulation resource scheduling method for LVC that can effectively cope with the variability brought by virtual-real combined LVC cloud simulation. The method comprises a task clustering method based on the SpectralNet deep neural network, a load prediction model based on a combined model, and a preemptive scheduling and dynamic scaling method based on dynamic priority.
In the task clustering combination method based on the SpectralNet deep neural network, a cloud simulation task dependency model and an improved SpectralNet neural network model are provided. The dependency model of the cloud simulation tasks defines the correlation degree of tasks from the information interaction frequency and the input-output similarity between simulation tasks, and can effectively measure how closely tasks depend on each other. The improved SpectralNet neural network model optimizes the original network loss function, changing what the original SpectralNet learns from the similarity of data points to the dependency among tasks, forming modularized combinations with high intra-cluster dependency and low inter-cluster dependency, which greatly reduces cross-node container communication and lowers communication delay and cost.
In the load prediction module based on the combined model, the advantages of ARIMA and the improved GRU model are combined into a combined prediction model that comprehensively utilizes the linear and nonlinear characteristics of the data, improving the precision and speed of load change prediction; meanwhile, the output of the model serves as the input of dynamic resource scaling, effectively solving the hysteresis problem of the native responsive scaling strategy.
In the preemptive scheduling and dynamic scaling module based on dynamic priority, a dynamic priority gradient model is provided, setting different priority gradients for tasks of different real-time requirements and setting preemptive scheduling rules, which effectively guarantees that high-priority important tasks in the cloud simulation scene execute first; an HPA strategy based on load change prediction is designed, setting corresponding scaling cooldown times for tasks, shortening the delay of elastic resource scaling, improving resource utilization, and reducing system cost.
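A minimal sketch of such a prediction-driven HPA step: it mirrors the shape of the native Kubernetes HPA rule desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), but feeds it the predicted load and gates it with a per-task-class cooldown; the function, parameter names, and values are illustrative assumptions:

```python
import math

def desired_replicas(current, predicted_load, target_per_replica,
                     last_scale_ts, now, cooldown_s):
    """Size replicas against the predicted (not current) load, gated by a
    cooldown so that scaling is planned ahead instead of reacting late."""
    if now - last_scale_ts < cooldown_s:      # still cooling down: hold steady
        return current
    return max(1, math.ceil(current * predicted_load / target_per_replica))

# a strongly real-time task class might get a short cooldown, a non-real-time
# class a longer one; the concrete values here are assumed
print(desired_replicas(current=3, predicted_load=1.6, target_per_replica=1.0,
                       last_scale_ts=0, now=60, cooldown_s=30))   # -> 5
```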
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort to a person skilled in the art.
FIG. 1 is a general architecture diagram of the method;
FIG. 2 is a graph of a twin neural network based learning cloud simulation task dependency matrix architecture;
FIG. 3 is a schematic diagram of a combined model-based load prediction model;
FIG. 4 is a block diagram of a GRU model;
FIG. 5 is an overall block diagram of a schedule;
FIG. 6 is a dynamic priority gradient model diagram;
FIG. 7 is a preemptive dispatch flow chart;
FIG. 8 is a flow chart of HPA elastic scaling for real-time tasks.
Detailed Description
It should be noted that, without conflict, the embodiments of the present invention and features in the embodiments may be combined with each other, and the present invention will be described in detail below with reference to the drawings and the embodiments.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
The relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise. Meanwhile, it should be clear that the dimensions of the respective parts shown in the drawings are not drawn in actual scale for convenience of description. Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate. In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
In the description of the present invention, it should be understood that the azimuth or positional relationships indicated by the azimuth terms such as "front, rear, upper, lower, left, right", "lateral, vertical, horizontal", and "top, bottom", etc., are generally based on the azimuth or positional relationships shown in the drawings, merely to facilitate description of the present invention and simplify the description, and these azimuth terms do not indicate and imply that the apparatus or elements referred to must have a specific azimuth or be constructed and operated in a specific azimuth, and thus should not be construed as limiting the scope of protection of the present invention: the orientation word "inner and outer" refers to inner and outer relative to the contour of the respective component itself.
Spatially relative terms, such as "above … …," "above … …," "upper surface at … …," "above," and the like, may be used herein for ease of description to describe one device or feature's spatial location relative to another device or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "above" or "over" other devices or structures would then be oriented "below" or "beneath" the other devices or structures. Thus, the exemplary term "above … …" may include both orientations of "above … …" and "below … …". The device may also be positioned in other different ways (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
In addition, the terms "first", "second", etc. are used to define the components, and are only for convenience of distinguishing the corresponding components, and the terms have no special meaning unless otherwise stated, and therefore should not be construed as limiting the scope of the present invention.
An intelligent scheduling method of container cloud resources for LVC simulation comprises the following steps:
S1, clustering tasks with frequent interaction through a deep clustering algorithm and combining them into the same Pod after containerization;
S2, predicting the dynamically changing load with a load prediction model based on the combined model;
S3, determining the scheduling order of the Pods with a preemptive scheduling scheme based on dynamic priority and, in combination with preemption rules, ensuring that high-priority Pods are deployed and run first while low-priority Pods are still scheduled normally;
S4, finally completing the elastic scaling of the cloud simulation resources in combination with the HPA strategy.
Steps S1/S2/S3/S4 are executed sequentially.
The scheme of the invention is divided into a task clustering combination algorithm based on the SpectralNet deep neural network, a load prediction model based on a combined model, and a preemptive scheduling and dynamic scaling method based on dynamic priority. The overall architecture diagram is shown in FIG. 1.
The process of clustering the frequently interacting tasks and combining them into the same Pod after containerization through the deep clustering algorithm is as follows:
The design of the clustering combination module mainly comprises three parts: establishing the cloud simulation task dependency model, learning the cloud simulation task dependency based on the twin neural network, and learning the mapping $F_\theta$ based on SpectralNet. The model can rapidly cluster and combine the tasks with strong dependency, i.e. close communication, in LVC simulation, so that they are subsequently packaged into the same Pod after containerization, reducing the communication overhead and communication delay generated by the interaction of task execution units in the system.
(1) Cloud simulation task dependency model establishment
Before twin network learning, the information interaction frequency of two tasks is analyzed with information theory and the Euclidean distance method, the similarity degree of task inputs and outputs is measured with the Jaccard similarity coefficient, and the dependency relationship model of the tasks is defined by combining the two. The specific steps for establishing the cloud simulation task dependency model are as follows:
Step 1-1: define the cloud simulation task relationship matrix.
Assume that in a certain simulation process of LVC cloud simulation there exist $n$ simulation tasks $T_i\ (i = 1, 2, 3, \dots, n)$. The relationship between the simulation tasks can be expressed by an $n \times n$ relationship matrix, any row (or column) of which represents a task; the matrix is denoted TCM:

$$TCM = [C_{ij}]_{n \times n}$$

Step 1-2: define the cloud simulation task dependency strength $C_{ij}$.
The main diagonal elements of the TCM matrix represent the tasks themselves and have no practical significance, so $C_{ii} = 0$. $C_{ij}$ is the dependency strength of tasks $T_i$ and $T_j$, defined as:

$$C_{ij} = r_{ij} + I_{ij} + O_{ij} \quad (i, j = 1, 2, \dots, n)$$

where $r_{ij}$ is the information interaction frequency of tasks $T_i$ and $T_j$, and $I_{ij}$, $O_{ij}$ are the input and output similarity of tasks $T_i$ and $T_j$, respectively.
Step 1-3: compute the cloud simulation task information interaction frequency $r_{ij}$.
A task acquires information of different amounts (including 0) from the other tasks. Since the information a task passes to itself is zero, the $n$ information reception (i.e. information demand) data of task $T_i$ can be recorded as:

$$I'_{i1}, I'_{i2}, \dots, I'_{ii}, \dots, I'_{in} \quad (i = 1, 2, \dots, n), \quad I'_{ii} = 0$$

where $I'$ denotes an information amount and $I'_{in}$ denotes the amount of information task $T_i$ obtains from task $T_n$.
Each task may send messages to several tasks, and the amount of information it sends to itself is zero; the $n$ data corresponding to task $T_k$ are recorded as:

$$I'_{1k}, I'_{2k}, \dots, I'_{kk}, \dots, I'_{nk} \quad (k = 1, 2, \dots, n), \quad I'_{kk} = 0$$

where $I'_{nk}$ denotes the amount of information task $T_k$ passes to task $T_n$.
The average received information quantity $\bar{I}'_k$ of a task is then calculated, where $N_k$ denotes the number of messages task $T_k$ sends to the outside and $I_{mk}$ denotes the average information content of the output messages. The information interaction frequency $r_{ij}$ of subtasks $T_i$ and $T_j$ is determined with the Euclidean distance method.
Step 1-4: calculate the input-output similarity of the cloud simulation tasks.
The similarity between rows is calculated with the Jaccard similarity coefficient; $I_{ij}$ denotes the input support degree of the task in row $j$ of the TCM matrix for the task in row $i$, i.e. the input similarity of the two tasks $T_i$ and $T_j$. The output similarity $O_{ij}$ of the two tasks $T_i$ and $T_j$ is obtained in the same way.
Step 1-5: calculate the dependency relationship of the cloud simulation tasks.
The dependency relationship of the tasks is:

$$C_{ij} = r_{ij} + I_{ij} + O_{ij}, \quad i, j \in [1, n]$$

The task dependency relationship matrix is the $n$-order matrix TCM formed by the $C_{ij}$.
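To make the construction concrete, here is a minimal sketch that assembles TCM from per-task interaction profiles. Where the patent text elides the exact formulas (the Euclidean-distance mapping behind $r_{ij}$ and the set definition behind the Jaccard similarities), the concrete choices below are assumptions:

```python
import numpy as np

def dependency_matrix(I_recv, in_sets, out_sets):
    """Build TCM = [C_ij] with C_ij = r_ij + I_ij + O_ij.

    I_recv[i][j]: amount of information task i receives from task j
    (diagonal zero). r_ij is taken here as a Euclidean-distance-based
    closeness of the two tasks' information-exchange profiles, and
    I_ij / O_ij as Jaccard similarities of the tasks' input / output
    sets; both concrete choices are assumptions."""
    n = I_recv.shape[0]

    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0

    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue                      # C_ii = 0 by definition
            dist = np.linalg.norm(I_recv[i] - I_recv[j])
            r = 1.0 / (1.0 + dist)            # assumed distance-to-frequency mapping
            C[i, j] = (r + jaccard(in_sets[i], in_sets[j])
                         + jaccard(out_sets[i], out_sets[j]))
    return C

I_recv = np.array([[0, 5, 1], [4, 0, 0], [1, 0, 0]], dtype=float)
ins = [{"pose", "cmd"}, {"pose"}, {"radar"}]
outs = [{"state"}, {"state"}, {"track"}]
print(dependency_matrix(I_recv, ins, outs).round(2))
```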
(2) Cloud simulation task dependency learning based on the twin neural network
The structure for learning the cloud simulation task dependency matrix based on the twin neural network is shown in FIG. 2. In the LVC cloud simulation system, the relevant information and feature data of the various cloud simulation tasks generated by system operation and by users serve as the input of the twin neural network; by designing the twin network architecture and the loss function and using historical task data as the training data set, the network is trained. The cloud simulation task dependency matrix TCM computed by the trained twin neural network is used as the similarity matrix W of SpectralNet, and the dependency relationships among the cloud simulation tasks are learned and evaluated in real time. The specific steps for learning the cloud simulation task dependency based on the twin neural network are as follows:
Step 2-1: data input.
The historical task information and feature data, namely the task information interaction frequency, input-output similarity, and other data sets described in the previous section, are input; the task data sets are fed as training data in time-series order.
Step 2-2: construct the twin neural network model.
The twin neural network structure is built from two three-layer simple RNN recurrent neural networks with shared weights and identical structure. The input of each network is a task sample feature data sequence $x_{1:T} = (x_1, x_2, \dots, x_t, \dots, x_T)$, where $t$ denotes the time step. With input $x_t$ at time $t$, the hidden layer state $h_t$ is updated as:

$$h_t = f(W_{ih} x_t + b_{ih} + W_{hh} h_{t-1} + b_{hh})$$

where $W_*$ denotes a learnable weight matrix and $b_*$ a learnable bias vector, $* \in \{ih, hh\}$; $h_{t-1}$ denotes the hidden layer state at time $t-1$ (for $t > 1$) or the initial hidden layer state (for $t = 1$), with typically $h_0 = 0$; and $f(\cdot)$ denotes the nonlinear activation function ReLU.
Step 2-3: improve the loss function.
A loss function of the twin network is constructed in which the dependency degree of tasks replaces the Euclidean distance between data points: task pairs with larger dependency form positive pairs, tasks with smaller dependency form negative pairs, and the twin network is trained with the goal of learning an adaptive nearest-neighbor dependency metric. The twin neural network maps each data point $x_i$ into an embedding in some space, and the loss function takes the contrastive form:

$$L = \frac{1}{2}\left[\, y\, d^2 + (1 - y) \max(0,\ \mathrm{margin} - d)^2 \,\right]$$

where $d = C_{ij}$ denotes the dependency degree of the two task samples, $y$ is the label indicating whether the two samples match ($y = 1$ means similar or matched, $y = 0$ means unmatched), and margin is a set threshold.
(3) SpectralNet-based learning of the map $F_\theta$
In the LVC cloud simulation scene, task execution units with frequent interaction are clustered and combined into one cluster by the task cluster combination model based on the SpectralNet deep neural network, forming a division and combination with high intra-cluster dependency and low inter-cluster dependency and thereby reducing the communication overhead and delay generated by frequent virtual-real interaction. The specific steps of learning the map $F_\theta$ based on SpectralNet are as follows:
Step 3-1: set the parameters of the SpectralNet network. Set the SpectralNet input as the task samples $x_1, \dots, x_n \in \mathbb{R}^d$, the number of clusters $k$, and the neural network batch size $m$; the loss function of SpectralNet is set as:

$$L_{SpectralNet}(\theta) = \frac{1}{m^2} \sum_{i,j=1}^{m} W_{i,j} \left\| y_i - y_j \right\|^2 = \frac{2}{m^2}\, \mathrm{tr}\!\left( Y^T (D - W)\, Y \right)$$

where $D$ is the $m \times m$ diagonal matrix of the task samples with $D_{i,i} = \sum_j W_{i,j}$, $W$ is the task dependency matrix TCM, $Y = \{y_1, y_2, \dots, y_n\} \in \mathbb{R}^{k \times n}$, $y_i = F_\theta(x_i)$ denotes the feature vector computed for sample $x_i$ by the deep neural network model $F$, and $\theta$ are the parameters of the deep neural network model.
Step 3-2: construct the neural network, implementing the map $F_\theta$ as a four-layer BP neural network that performs an orthogonalization operation in its last layer. The loss can be minimized trivially by mapping all points to the same output vector; to prevent the objective function from reaching this trivial solution, $y$ is required to satisfy the orthogonality condition

$$E[y y^T] = I_{k \times k}$$

In each iteration a small batch of $m$ samples is randomly sampled and organized in an $m \times d$ matrix $X$; the orthogonality requirement within the small batch is therefore:

$$\frac{1}{m}\, Y^T Y = I_{k \times k}$$

Step 3-3: sample a random small batch $X$ of size $m$, propagate $X$ forward, and compute the input $\tilde{Y}$ of the orthogonalization layer.
Step 3-4: compute the Cholesky decomposition $\tilde{L} \tilde{L}^T = \tilde{Y}^T \tilde{Y}$ and set the weight of the orthogonalization layer to $\sqrt{m}\, (\tilde{L}^{-1})^T$; the last layer of the neural network enforces the orthogonality constraint, its weight being set according to the QR decomposition.
Step 3-5: perform the gradient computation step: sample a random small batch $x_1, \dots, x_m$ and compute the $m \times m$ similarity matrix $W$ with the twin network, $W$ being computed as the task dependency matrix TCM; propagate $x_1, \dots, x_m$ forward to obtain $y_1, \dots, y_m$.
Step 3-6: compute the loss $L_{SpectralNet}(\theta)$ and use $L_{SpectralNet}(\theta)$ to adjust all weights of $F_\theta$ except those of the output layer.
Step 3-7: judge whether $L_{SpectralNet}(\theta)$ has converged; if not, return to step 3-3.
Step 3-8: propagate $x_1, \dots, x_m$ forward and obtain the embedding layer feature output $y_1, \dots, y_m$ of $F_\theta$; run the k-means clustering algorithm on $y_1, \dots, y_m$ to obtain the classification result $c_1, \dots, c_n$, $c_i \in \{1, \dots, k\}$.
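A minimal PyTorch sketch of the two numerical pieces of these steps: the Cholesky-based orthogonalization (steps 3-3 and 3-4) and the mini-batch loss (step 3-6); the batch size, output dimension, and random W used here are illustrative:

```python
import torch

def orthonorm(y_tilde):
    """Orthogonalization layer: with Cholesky L L^T = Y~^T Y~, multiplying by
    the weight sqrt(m) (L^-1)^T makes (1/m) Y^T Y = I hold in the batch."""
    m = y_tilde.shape[0]
    L = torch.linalg.cholesky(y_tilde.T @ y_tilde)
    return (m ** 0.5) * y_tilde @ torch.linalg.inv(L).T

def spectralnet_loss(Y, W):
    """L = (1/m^2) sum_ij W_ij ||y_i - y_j||^2 over the mini-batch, with W
    taken from the task dependency matrix TCM."""
    m = Y.shape[0]
    return (W * torch.cdist(Y, Y) ** 2).sum() / m**2

m, k = 64, 4
Y = orthonorm(torch.randn(m, k))
print(torch.allclose(Y.T @ Y / m, torch.eye(k), atol=1e-4))  # orthogonality holds
W = torch.rand(m, m); W = (W + W.T) / 2                      # stand-in for TCM
print(float(spectralnet_loss(Y, W)))
```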
The load prediction model based on the combined model predicts the load change of a Pod in the next moment or time period from the historical usage data and the current load condition, using a combined prediction model that addresses both linear and nonlinear data.
Since the various resource usage data of Kubernetes in the simulation scene contain both linear and nonlinear components, an ARIMA time series model is adopted to predict the linear part of the sequence. Comparing the ARIMA prediction with the actual resource usage values yields the residual value sequence of the linear part. This residual, a white-noise-like sequence, hides the nonlinear relationship; it is mined by a GRU model based on an attention mechanism, which takes the n inputs of the nonlinear sequence through network training and continually adjusts the network parameters to obtain the predicted value. The prediction result of the combined model is formed by superposing the two model results; the schematic diagram of this module design is shown in FIG. 3.
The specific design steps are as follows:
step 4-1: using a data set of hypothetical time series as L t The original data is y t Then the formula for time series data can be expressed as:
y t =L t +E t
Since the original data are linear data and nonlinear data, L is t Seen as a linear relationship in the original data sequence, E t Treated as a non-linear relationship in the original data sequence.
Step 4-2: an ARIMA (p, d, q) autoregressive differential moving average model is established.
The ARIMA (p, d, q) model is a combination of d-th order differential and autoregressive moving average ARMA (p, q) on a jerky sequence. Wherein the ARMA (p, q) has the model structure:
x in the formula t Representing the time series of the observed object, the interference value epsilon t And epsilon t-i Sequence value X t-i And θ represents the moving average coefficient, p is the autoregressive model order, q is the moving average model order, and φ is the sum of 0 At=0, the model is a centralised ARMA (p, q) model.
Step 4-3: and carrying out stable processing on the input data of the ARIMA model.
And judging the data stability. And checking and judging the stability of the data according to a scatter diagram drawn by the data, an autocorrelation coefficient ACF, a partial autocorrelation coefficient PACF and a unit root. And carrying out stabilization treatment on the non-stable sequence, and leading the original non-stable sequence to be weakly stable through difference. The parameter d in ARIMA (p, d, q) can be determined by the number of differences.
Step 4-4: the ARIMA model is identified and ranked.
P and g are determined by ACF and PACF in time series. The basic rules of model identification are: if the partial autocorrelation function is truncated and the autocorrelation function is trend decay, the model is an AR autoregressive model: if the trend of the partial correlation function is attenuated and the autocorrelation function is truncated, the model is an MA moving average model, and if the partial autocorrelation function and the autocorrelation function are both attenuated, the model is an ARMA autoregressive moving average model. After determining the type of the stochastic model, the order of the model is observed by drawing an autocorrelation diagram and a partial autocorrelation diagram, and the AIC criterion is used for determining the optimal model order.
Step 4-5: and (5) ARIMA model inspection.
Fitting detection and residual analysis are used to verify the availability of the model. If the residual error passes the white noise test, the model is valid, and if the residual error does not pass the test, the method goes to the step 4-4 to continue to select from the models meeting the precision requirement until the model passes the model test standard. And using the checked model as a fitting model to observe the change rule of the object. And predicting the acquired time sequence, analyzing the development trend of the observed object according to the predicted value result, and calculating a prediction error if the error is smaller, so as to finish the prediction. The optimal model after the inspection is obtained through the steps, and the time sequence integration can be input into ARIMA for fitting prediction.
Step 4-6: linear prediction was performed using ARIMA model.
Using ARIMA model for linear relationship L in original data sequence t Modeling to obtain a predicted value sequence L t ′。
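Steps 4-4 to 4-6 can be sketched with statsmodels' ARIMA; here the order (p, q) is chosen by an AIC grid search, a common programmatic substitute for reading ACF/PACF plots, and all names are illustrative:

    import itertools
    from statsmodels.tsa.arima.model import ARIMA

    def fit_best_arima(y, d: int, max_p: int = 3, max_q: int = 3):
        # Grid-search (p, q) by AIC for the differencing order d found earlier.
        best_aic, best_fit = float("inf"), None
        for p, q in itertools.product(range(max_p + 1), range(max_q + 1)):
            try:
                fit = ARIMA(y, order=(p, d, q)).fit()
            except Exception:
                continue                      # skip orders that fail to converge
            if fit.aic < best_aic:
                best_aic, best_fit = fit.aic, fit
        return best_fit

    # model = fit_best_arima(y, d)
    # linear_pred = model.predict(start=0, end=len(y) - 1)   # L_t'
    # residuals = y - linear_pred                            # E_t, input to the GRU part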
Step 4-7: obtaining a predicted value sequence L t After' the predicted result and the actual expected comparison are taken as inputs to the nonlinear partial GRU model, with the sequence of residual values of ARIMA extracted.
Step 4-8: and establishing and training a GRU load prediction model based on an attention mechanism, wherein a GRU model structure diagram is shown in fig. 4.
The attention mechanism is applied after the GRU layer. Raw data from the monitoring module are preprocessed: gaps are filled, the data are normalized, and a formal data format is generated. Next, important parameters such as the number of GRU hidden layers and the number of units per hidden layer are set. A sequence of suitable window length is then selected from the preprocessed input data as the input vector and fed into the GRU network for training; the parameters are adjusted to obtain an initial output vector, which serves as the input vector of the attention layer. The weight parameter corresponding to the GRU unit's output vector at time t is computed from the input information, and a weighted summation of the weight-parameter vector and the input vector yields the final predicted value.
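A minimal PyTorch sketch of the attention-augmented GRU described above (layer sizes and names are illustrative, not the patent's exact configuration): the attention layer scores each time step's GRU output, normalizes the scores with softmax, and feeds the weighted sum to the output layer:

    import torch
    import torch.nn as nn

    class AttentionGRU(nn.Module):
        def __init__(self, n_features: int, hidden: int = 64, layers: int = 2):
            super().__init__()
            self.gru = nn.GRU(n_features, hidden, num_layers=layers, batch_first=True)
            self.score = nn.Linear(hidden, 1)   # attention scoring function
            self.out = nn.Linear(hidden, 1)     # one-step load prediction

        def forward(self, x):                   # x: (batch, window, n_features)
            h, _ = self.gru(x)                  # h: (batch, window, hidden)
            w = torch.softmax(self.score(h), dim=1)  # weight per time step
            context = (w * h).sum(dim=1)        # weighted sum of GRU outputs
            return self.out(context)            # predicted residual value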
Step 4-9: predictions are made using a GRU model based on the attention mechanism. First, time sequence { x } of data residual values t Total length n and window size w, time series data { x } t The window size is divided into window size blocks. Then, proper GRU model parameters are set for training, and an output vector is obtained after each training round. And putting the output vector into a attention function for similarity calculation, normalizing to obtain a weight parameter of the corresponding vector, and then carrying out weighted sum summation on the obtained weight parameter and an output vector set trained by the GRU. Finally, a prediction result { E } t '}. The final error value prediction sequence is:
E_t′ = f(e_{t-1}, e_{t-2}, …, e_{t-n}) + ξ_t
where ξ_t denotes the random error of the GRU neural network and E_t′ is the GRU network's prediction of the nonlinear relationship E_t.
Step 4-10: and superposing the prediction results to obtain a final prediction result. Overlapping the ARIMA model and the GRU neural network prediction result to obtain a prediction result of the resource usage, wherein the prediction result sequence expression is:
y t ′=L t ′+E t
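As a sketch, the superposition of step 4-10 is a simple element-wise sum of the two components (names reused from the hypothetical snippets above):

    def combined_forecast(linear_pred, residual_pred):
        # y_t' = L_t' + E_t': ARIMA covers the linear component,
        # the attention-based GRU covers the nonlinear residual component.
        return linear_pred + residual_pred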
Further, the dynamic-priority-based preemptive scheduling and dynamic scaling mainly comprise a preemptive scheduling scheme based on a dynamic priority gradient model and elastic scaling of resources; the overall structure of the scheduling is shown in fig. 5.
In the background of this application, the simulation system runs a variety of tasks with different demands on each resource. The scheduling scheme is set as follows: when the resources of the simulated computing unit Node are sufficient, the Pod is scheduled for deployment directly, referring to the result of the prediction module; when Node resources are insufficient, the resources of lower-gradient-priority Pods can be preempted according to the priority of the Pod currently awaiting scheduling, guaranteeing that the Pod hosting a high-priority task runs first. Meanwhile, to avoid frequent preemption jitter and deadlock, Pods within the same priority gradient are set not to preempt one another; and to avoid the Pods of low-priority tasks being preempted indefinitely and never scheduled, their priority gradient is dynamically raised according to the number of times they have been preempted and their waiting time. The specific design steps are as follows:
Step 5-1: establish a dynamic priority gradient model.
Tasks are classified from high to low by completion deadline into four classes: strong real-time, real-time, super real-time, and non-real-time. According to the task characteristics of cloud simulation, combined with key indexes of cloud resource scheduling such as CPU utilization, memory, the network I/O that is most critical in this context, disk I/O, network bandwidth, and communication overhead, strong real-time tasks are defined as L1 first-gradient priority tasks, real-time tasks as L2 second-gradient priority tasks, and super real-time and non-real-time tasks as L3 third-gradient priority tasks; the priority of a task changes dynamically with the Pod's state and the number of times it has been killed/preempted. The dynamic priority gradient model is shown in fig. 6.
Step 5-2: and calculating the priority factor of the Pod.
And setting priority factors P according to the demands of tasks in the Pod on different resources, wherein the queues to be scheduled in the Pod are ordered in descending order according to the size of P, and a Schduler scheduler schedules the Pod according to the sequence after the ordering is finished. The specific calculation of the priority factor P of the Pod to be scheduled is as follows:
wherein M refers to the demand of Pod on memory resources, C represents CPU, G represents GPU demand, MIO represents disk I/O demand, NIO represents network I/O demand, NCC represents communication resource demand, TW represents task waiting time demand, OC represents the number of times Pod is preempted and evicted by high-priority Pod, DL represents task deadline residual time, and the settings of the demands are derived from the load prediction module; l represents a priority gradient coefficient; lambda (lambda) a ,λ b ,λ c ,λ d ,λ e ,λ f The weight coefficient of the resource demand index can be set according to the task characteristics and the demand, and the weight coefficient cannot be changed after the initial setting; epsilon, alpha, beta are variable weight coefficients.
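The exact priority-factor formula appears only as a figure in the source; purely as a hypothetical illustration of how the symbols above could combine, one plausible weighted form is sketched below (the combination itself and all default weights are assumptions, not the patent's formula):

    from dataclasses import dataclass

    @dataclass
    class PodDemand:                 # hypothetical container for the symbols above
        M: float; C: float; G: float
        MIO: float; NIO: float; NCC: float
        TW: float                    # task waiting time
        OC: int                      # times preempted/evicted by higher-priority Pods
        DL: float                    # remaining time before the task deadline

    def priority_factor(d: PodDemand, L: float,
                        lam=(1, 1, 1, 1, 1, 1),
                        eps: float = 1.0, alpha: float = 1.0, beta: float = 1.0) -> float:
        # Assumed form: gradient coefficient L scales the weighted resource demand,
        # plus variable terms that grow with waiting time, preemption count,
        # and deadline urgency (larger as DL shrinks).
        resource = (lam[0] * d.M + lam[1] * d.C + lam[2] * d.G
                    + lam[3] * d.MIO + lam[4] * d.NIO + lam[5] * d.NCC)
        return L * resource + eps * d.TW + alpha * d.OC + beta / (d.DL + 1e-9)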
Step 5-3: a dynamic change rule of the priority factor P is set.
For Pods in the L1 and L2 gradients, the α and β weights increase linearly as TW and OC grow, and the β weight increases linearly as DL shrinks. Because preemption is performed according to the difference ΔP of the priority factors P, the ε, α, and β weights of L3-level Pods grow exponentially to guarantee the smooth execution of low-priority L3 tasks; an L3 Pod can thus break through its gradient limit and rise to the L2 level, whereas the priority factor of an L2-level Pod cannot break through its gradient limit.
Step 5-4: and setting preemptive scheduling rules.
When the resources of the working Node are sufficient, the Pod preemption operation is not needed, the Pod deployment operation is directly scheduled, and when the Node resources are insufficient, preemption is performed according to the maximum delta P, the low-priority Pod is always evicted and released until the resource deployment requirement of the high-priority Pod is met. For Pod in the same gradient, system jitter is generated for preventing frequent preemption, and preemption operation is not performed; for Pod with the same gradient and the same priority factor, the scheduling is performed by adopting a first come first serve principle. The preemptive dispatch flow diagram is shown in fig. 7.
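A sketch of the preemption rule of step 5-4, with hypothetical data structures (Pods are assumed to carry a numeric gradient where a smaller number means a higher gradient, a priority factor P, and a reserved resource amount):

    def select_victims(pending, running, need: float):
        # Candidates are Pods in a strictly lower gradient; same-gradient Pods
        # are never preempted, which avoids preemption jitter.
        candidates = [p for p in running if p.gradient > pending.gradient]
        candidates.sort(key=lambda p: pending.P - p.P, reverse=True)  # largest dP first
        victims, freed = [], 0.0
        for p in candidates:
            if freed >= need:
                break
            victims.append(p)
            freed += p.reserved        # resources released by evicting p
        return victims if freed >= need else None   # None: Pod keeps waiting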
Step 5-5: a preemption object is selected.
The Kubernetes scheduling policy is divided into a pre-selection stage and a preferred stage: the pre-selection stage filters out Nodes that do not meet the conditions according to a set of screening rules, and the preferred stage scores the qualifying Nodes and selects the highest-scoring Node to run the Pod. In the pre-selection stage of scheduling, the preempted Pod object is selected according to the difference ΔP of the priority factors P. If no Node in the cluster's current state satisfies the Pod's resource requirements, the Pod would keep waiting until some Node meets the condition, which severely violates the SLA for strong real-time/real-time tasks; ΔP is calculated as follows:
ΔP = P_i - P_j, (i, j = 1, 2, …, n)
The Pod with the largest ΔP value is selected as the preempted object; the selected Pod is evicted and its on-site execution information is stored in etcd to provide execution information for the Pod's subsequent re-creation.
Step 5-6: optimizing the schedule preference stage scoring function.
The evaluation function of the native scheduling scheme in Kubernetes scores and screens Nodes on only two key indexes, CPU and memory, which cannot meet the resource-scheduling demands of a cloud simulation scenario with high real-time requirements. Disk I/O and network I/O are therefore introduced and, together with network communication bandwidth and a GPU index, taken as the basis of the evaluation function; at the same time, considering the cluster resource conditions, task resource requests, and Node states, a new evaluation function is proposed from the Node priority weights and the Kubernetes default weights.
Step 5-7: and calculating the availability of the resource k in the Node i.
Let N Node nodes in kubemes cluster, denoted as n= {1, 2..the N }, k resources on each Node, denoted as r_t= {1, 2..the k }, each Pod may request m resources, denoted as r_r= {1, 2..the m }, and the set of pods to be scheduled is denoted as pod= {1, 2..the q }. According to the actual resource utilization rate of six resource evaluation indexes of CPU, GPU, memory, disk I/O and network I/O in the cluster and network communication bandwidth, the priority weight value of the current Node can be obtained.
Calculating the availability of the resource k in the Node i:
where a (i, k) represents the available resource k in Node i, and T (k) represents the total amount of resource k in Node i.
Step 5-8: and calculating the priority factor of the availability of the resource k in the Node i in the cluster.
Step 5-9: the new evaluation function of the Node in the preferred stage is as follows:
the new evaluation function represents the sum of the priority of the available resources of the Node, and according to the Pod and the actual state of the Node, the obtained idle condition of the Node resources represents that the higher the evaluation score is, the higher the adaptation degree of the idle resources of the Node and the Pod is.
Step 5-10: and selecting a Pod object to be scheduled according to the gradient priority of the Pod, selecting an object to be deployed according to the priority of the Node, and carrying out preemptive scheduling in combination with the step 5-4.
Further, finally, the HPA strategy is combined to complete the elastic expansion and contraction of the cloud simulation resources.
(1) Resource elastic scaling process.
In Kubernetes, the native scaling strategies comprise the HPA strategy, which horizontally scales the number of Pods based on threshold responses, and the VPA strategy, which vertically adjusts Pod resource parameters. Both respond to system load changes with a certain lag and cannot meet the high-concurrency, highly dynamic, real-time demands of cloud simulation resource scheduling. An HPA strategy based on load-change prediction is therefore designed according to the characteristics of the simulation background: using the predicted usage of each resource over the coming period output by the load-change prediction module, scheduling is planned in advance, and the number of Pods is automatically increased or decreased in the horizontal dimension to match the load of the cloud simulation scheduling system. This realizes seamless, high-quality, low-latency elastic resource scaling, provides users with higher fault tolerance and availability, and reduces system cost.
HPA has an auto-scaling cooling time, typically 5 minutes by default: the time Kubernetes waits after performing a scale-out or scale-in operation in order to observe the performance and stability of the new Pod replicas. During the cooling time, Kubernetes monitors the metrics of the new Pod replicas and decides according to the policy whether to continue scaling. The cooling time can be adjusted through the HPA's parameters. In the HPA configuration of this invention, tasks with different real-time requirements are assigned different cooling times according to their task deadlines, to meet the needs of specific applications. Using the priority gradient set above, with the Pod's deadline DL and the load change frequency LR, the scale-out and scale-in cooling times KST and SST of the Pod are:
and setting the expansion and contraction volume weight to ensure that enough time exists for observing the state of Pod and avoid frequently invalid expansion and contraction volume operation to a certain extent.
The HPA strategy uses the horizontal scaling controller Horizontal Pod Autoscaler built into Kubernetes, together with its default CPU-utilization index and measurement indexes such as GPU, memory, network-bandwidth utilization, network I/O, and disk I/O exposed through the Custom Metrics API, Prometheus Adapter, and the like, and realizes elastic scaling of the deployed workload by adjusting the number of deployed Pod replicas. Because the VPA strategy in Kubernetes must restart and reschedule a Pod after resetting its resource specification, incurring waits of seconds to minutes, it cannot meet the ms- and s-level deadline requirements of strong real-time and real-time cloud simulation tasks; therefore the HPA strategy, which completes scheduling and deployment merely by increasing or decreasing the Pod replica count, is adopted to cope with load changes. The HPA elastic scaling flow for real-time tasks is shown in fig. 8.
The method comprises the following specific steps:
and 6-1, setting a monitoring period T, and calculating horizontal expansion and contraction capacity cooling time HSST and HKST. The initial value defaults to 1, which means that the cooling can be performed and the expansion and contraction can be performed, the horizontal expansion and contraction cooling time HKST and HSST are two timers, after each expansion and contraction is performed, the HST_flag is set to 0, the timing counting is restarted, when the counting is 0, the automatic flag bit is set to 1, and when the HST_flag is 0, the expansion and contraction operation cannot be performed.
Step 6-2: in each monitoring period T, obtain the historical load data of the k resource indexes of all replicas through the monitoring module and input them into the combined prediction model to obtain the load prediction for the next period T.
Step 6-3: and calculating the expected copy number ENT_Pod of the next period according to the predicted resource load index value, and simultaneously calculating the expected copy number ECT_Pod of the system at the current moment. The calculation method of the expected Pod copy number is as follows:
wherein currentMetricValue is the current resource index value in the cluster, desireMetricValue is the expected resource index value of the cluster, and C_pod is the copy number of the current Pod.
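This is the standard Kubernetes HPA replica formula; as a small sketch:

    import math

    def expected_replicas(c_pod: int, current_metric: float, desired_metric: float) -> int:
        # desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)
        return math.ceil(c_pod * current_metric / desired_metric)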
Step 6-4: and acquiring the currently deployed Pod copy number C_Pod by using a monitoring module.
Step 6-5: and judging whether the cooling zone bit HST_flag of the automatic expansion and contraction is 1, namely whether the expansion and contraction operation is allowed, if so, entering a step 6-6, and if not, ending the flow.
Step 6-6: if ECT_Pod > C_Pod, then the larger values of ENT_Pod and ECT_Pod are used as inputs to the HPA policy, otherwise step 6-7 is entered.
Step 6-7: if ENT_Pod > C_Pod, then the larger values of ENT_Pod and ECT_Pod are taken as input to the HPA policy, otherwise ECT_Pod is taken as input to the HPA policy.
Step 6-8: triggering dynamic expansion according to the input application copy number, and setting HST_flag to 0.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (10)

1. An intelligent scheduling method of container cloud resources for LVC simulation, characterized by comprising the following steps:
clustering and combining tasks with frequent interaction into the same Pod after containerization through a deep clustering algorithm;
Predicting the dynamically-changed load by using a load prediction model based on the combined model;
determining the scheduling sequence of the Pod by utilizing a preemptive scheduling scheme based on dynamic priority, and ensuring the normal scheduling of the low-priority Pod while ensuring the leading deployment and operation of the high-priority Pod by combining with preemptive rules;
and finally, combining with an HPA strategy to finish the elastic expansion and contraction of the cloud simulation resources.
2. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 1, characterized in that the process of clustering tasks with frequent interaction so that they are combined into the same Pod after containerization through a deep clustering algorithm is as follows:
establishing a cloud simulation task dependency model;
learning the dependency degree of the cloud simulation tasks based on the twin neural network;
performing clustering combination based on the SpectralNet-learned mapping F_θ so that frequently interacting tasks, after containerization, form the same Pod.
3. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 2, characterized in that the process of establishing the cloud simulation task dependency model is as follows:
step 1-1: define the cloud simulation task relationship matrix;
assume that in a given simulation run of the LVC cloud simulation there are n simulation tasks T_i (i = 1, 2, 3, …, n); the relationships between simulation tasks are expressed by an n × n relationship matrix, any row or column of which represents a task; the matrix is denoted TCM:
TCM = [C_ij]_{n×n}
step 1-2: define the cloud simulation task dependency strength C_ij;
the main-diagonal elements of the TCM matrix represent the tasks themselves and have no practical significance, so C_ii = 0; C_ij is the dependency relationship between tasks T_i and T_j, defined as:
C_ij = r_ij + I_ij + O_ij (i, j = 1, 2, …, n)
where r_ij is the information-interaction frequency between tasks T_i and T_j, and I_ij, O_ij are respectively the input/output similarity degrees of tasks T_i and T_j;
step 1-3: compute the cloud simulation task information-interaction frequency r_ij;
a task acquires differing amounts of information, possibly zero, from the other tasks, and each subtask's information transfer to itself is set to zero; the n information receptions corresponding to task T_i, i.e., its information-demand data, are recorded as:
I′_i1, I′_i2, …, I′_ii, …, I′_in (i = 1, 2, …, n), I′_ii = 0
where I′ denotes an information amount and I′_in denotes the amount of information task T_i obtains from task T_n;
each task may send messages to several tasks, and the amount of information a task sends to itself is zero; the data corresponding to task T_k are recorded as:
I′_1k, I′_2k, …, I′_kk, …, I′_nk (k = 1, 2, …, n), I′_kk = 0
where I′_nk denotes the amount of information task T_k transfers to task T_n;
the formula for the task's average received information quantity I′_k is:
where N_k denotes the number of messages task T_k sends outward, and I_mk in the formula is expressed as:
the information-interaction frequency between subtasks T_i and T_j is then determined by the Euclidean distance method as follows:
step 1-4: calculate the input/output similarity of cloud simulation tasks;
the similarity between rows is calculated with the Jaccard similarity coefficient; I_ij denotes the input support of the TCM matrix's j-th-row task for the i-th-row task, i.e., the input similarity of the two tasks T_i and T_j:
the output similarity O_ij of the two tasks T_i and T_j is obtained in the same way;
step 1-5: calculate the dependency relationship of cloud simulation tasks;
the dependency relationship of the tasks is:
C_ij = r_ij + I_ij + O_ij, i, j ∈ [1, N]
the task dependency-relationship matrix is the n-order matrix TCM formed by the C_ij.
4. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 2, characterized in that the process of learning the dependency degree of the cloud simulation tasks based on the twin neural network is as follows:
step 2-1: data input;
input the historical task information and feature data, i.e., data sets such as the task information-interaction frequency and input/output similarity; the task data sets are input as training data in time order;
step 2-2: construct the twin neural network model;
a twin neural network structure is built from two weight-sharing, identically structured three-layer simple RNN recurrent neural networks; the input of each network is a task-sample feature-data sequence x_{1:T} = (x_1, x_2, …, x_t, …, x_T), where t denotes the time step; with input x_t at time t, the hidden-layer state h_t is updated as:
h_t = f(W_ih·x_t + b_ih + W_hh·h_{t-1} + b_hh)
where W_* denotes a learnable weight matrix and b_* a learnable bias vector, with * ∈ {ih, hh}; h_{t-1} denotes the hidden-layer state at the previous step (t > 1) or the initial hidden-layer state when t = 1, typically h_0 = 0; f(·) denotes the nonlinear activation function ReLU;
step 2-3: improve the loss function;
construct the loss function of the twin network with the task dependency degree replacing the Euclidean distance between data points: task pairs with larger dependency form positive pairs and pairs with smaller dependency form negative pairs, and the twin network is trained with the goal of learning an adaptive nearest-neighbor dependency metric; the twin neural network maps each data point x_i into an embedding in some space, and the loss function L is defined as:
where d = C_ij denotes the dependency degree of the two task samples, y is the label of whether the two samples match, y = 1 meaning the two samples are similar or matched and y = 0 meaning unmatched, and margin is a set threshold.
5. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 2, characterized in that the specific steps of the SpectralNet learning of the mapping are as follows:
step 3-1: set the parameters of the SpectralNet network; the input of SpectralNet is the task samples, the cluster number k, and the neural-network batch size m; the SpectralNet loss function L_SpectralNet is set as:
where D is the m × m diagonal matrix of the task samples with D_{i,i} = Σ_j W_{i,j}, W is the task dependency matrix TCM, Y = {y_1, y_2, …, y_n} ∈ R^{k×n}, y_i = F_θ(x_i) denotes the feature vector of sample x_i computed by the deep neural network model F, and θ is the parameter of the deep neural network model;
step 3-2: construct the neural network to learn the mapping F_θ, implemented as a four-layer BP neural network performing an orthogonalization operation; since the loss could be minimized trivially by mapping all points to the same output vector, y is required to satisfy the orthogonalization condition in order to prevent the objective function from reaching a trivial solution, namely
E[y·yᵀ] = I_{k×k}
in each iteration, a small batch of m samples is randomly sampled and organized in an m × d matrix X, so the orthogonality requirement within the small batch is:
(1/m)·YᵀY = I_{k×k}
i.e., Y is an orthogonal matrix;
step 3-3: sample a random small batch X of size m, propagate X forward, and compute the input of the orthogonalization layer;
step 3-4: compute the Cholesky decomposition L·Lᵀ = YᵀY and set the weights of the orthogonalization layer to √m·(L⁻¹)ᵀ; the final layer of the neural network enforces the orthogonality constraint, its weights being set according to the QR decomposition;
step 3-5: perform the gradient-computation step: sample a random small batch x_1, …, x_m and compute the m × m similarity matrix W with the twin network, W being taken as the task dependency matrix TCM; propagate x_1, …, x_m forward to obtain y_1, …, y_m;
step 3-6: compute the loss L_SpectralNet(θ) and use L_SpectralNet(θ) to adjust all weights of F_θ except those of the output layer;
step 3-7: judge whether L_SpectralNet(θ) has converged; if not, return to step 3-3;
step 3-8: propagate x_1, …, x_m forward to obtain the embedding-layer feature outputs y_1, …, y_m of F_θ, and run the k-means clustering algorithm on y_1, …, y_m to obtain the classification result c_1, …, c_n, c_i ∈ {1, …, k}.
6. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 2, characterized in that the process of predicting the dynamically changing load with the load prediction model based on the combined model is as follows:
step 4-1: let the original data be y_t and the linear component of the time series be L_t; the time-series data can then be expressed as:
y_t = L_t + E_t
L_t is regarded as the linear relationship in the original data sequence and E_t as the nonlinear relationship;
step 4-2: establish an ARIMA(p, d, q) autoregressive integrated moving average model;
the ARIMA(p, d, q) model combines d-th order differencing of a non-stationary sequence with an autoregressive moving average model ARMA(p, q), where the ARMA(p, q) model structure is:
x_t = φ_0 + φ_1·x_{t-1} + … + φ_p·x_{t-p} + ε_t - θ_1·ε_{t-1} - … - θ_q·ε_{t-q}
where x_t denotes the observed time series, ε_t and ε_{t-i} are random disturbance values, x_{t-i} are lagged sequence values, φ denotes the autoregressive coefficients and θ the moving-average coefficients, p is the autoregressive model order and q the moving-average model order; when φ_0 = 0 the model is a centralized ARMA(p, q) model;
step 4-3: make the input data of the ARIMA model stationary;
step 4-4: identify the ARIMA model and determine its order;
the orders p and q are determined from the ACF and PACF of the time series; the basic identification rules are: if the partial autocorrelation function cuts off and the autocorrelation function tails off, the model is an AR autoregressive model; if the partial autocorrelation function tails off and the autocorrelation function cuts off, the model is an MA moving-average model; if both tail off, the model is an ARMA autoregressive moving-average model; after the model type is determined, the model order is read from the autocorrelation and partial autocorrelation plots, and the AIC criterion is used to select the optimal model order;
step 4-5: check the ARIMA model;
goodness-of-fit tests and residual analysis are used to verify the validity of the model; if the residuals pass the white-noise test the model is valid, otherwise return to step 4-4 and continue selecting among the models that meet the accuracy requirement until one passes the model check; the checked model is used as the fitted model of the observed object's variation, the collected time series is predicted, the development trend of the observed object is analyzed from the predicted values, and the prediction error is computed; if the error is small, the prediction is complete; the optimal checked model obtained through these steps can take the integrated time series as ARIMA input for fitted prediction;
step 4-6: perform linear prediction with the ARIMA model;
the ARIMA model is used to model the linear relationship L_t in the original data sequence, yielding the predicted value sequence L_t′;
step 4-7: after the predicted value sequence L_t′ is obtained, the residual value sequence of ARIMA is extracted by comparing the prediction with the actual values and serves as the input of the nonlinear GRU part;
step 4-8: build and train a GRU load prediction model based on an attention mechanism;
step 4-9: predict with the attention-based GRU model;
first, given the residual-value time series {x_t} with total length n and window size w, the time-series data {x_t} are divided into blocks of window size w; suitable GRU model parameters are then set for training, and an output vector is obtained after each training round; the output vectors are fed into the attention function for similarity computation and normalized to obtain the weight parameter of each vector, and a weighted summation is performed between the obtained weight parameters and the GRU-trained output-vector set; finally the prediction result {E_t′} is obtained, and the final error-value prediction sequence is:
E_t′ = f(e_{t-1}, e_{t-2}, …, e_{t-n}) + ξ_t
where ξ_t denotes the random error of the GRU neural network and E_t′ is the GRU network's prediction of the nonlinear relationship E_t;
step 4-10: superpose the prediction results to obtain the final prediction; the ARIMA model prediction and the GRU neural network prediction are superposed to obtain the predicted resource usage, with prediction-result sequence expression:
y_t′ = L_t′ + E_t′
7. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 1, characterized in that the process of determining the scheduling order of the Pods with the dynamic-priority-based preemptive scheduling scheme and, in combination with the preemption rules, guaranteeing normal scheduling of low-priority Pods while guaranteeing that high-priority Pods are deployed and run first is as follows:
step 5-1: establish a dynamic priority gradient model;
step 5-2: calculate the priority factor of the Pod;
step 5-3: set the dynamic change rule of the priority factor P;
step 5-4: set the preemptive scheduling rules;
step 5-5: select the preemption object;
step 5-6: optimize the scoring function of the scheduling preferred stage;
step 5-7: calculate the availability of resource k on Node i;
step 5-8: calculate the priority factor of the availability of resource k on Node i within the cluster;
step 5-9: the new evaluation function of the Node in the preferred stage is:
the new evaluation function represents the sum of the Node's available-resource priorities; based on the Pod and the actual state of the Node, it captures how idle the Node's resources are, and the higher the evaluation score, the better the Node's idle resources match the Pod;
step 5-10: select the Pod object to be scheduled according to the Pod's gradient priority, select the deployment target according to the Node's priority, and perform preemptive scheduling in combination with step 5-4.
8. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 6, characterized in that the process of selecting the preemption object is as follows:
the Kubernetes scheduling policy is divided into a pre-selection stage and a preferred stage: the pre-selection stage filters out Nodes that do not meet the conditions according to a set of screening rules, and the preferred stage scores the qualifying Nodes and selects the highest-scoring Node to run the Pod; in the pre-selection stage of scheduling, the preempted Pod object is selected according to the difference ΔP of the priority factors P; if no Node in the cluster's current state satisfies the Pod's resource requirements, the Pod keeps waiting until some Node meets the condition, which severely violates the SLA for strong real-time/real-time tasks; ΔP is calculated as follows:
ΔP = P_i - P_j, (i, j = 1, 2, …, n)
the Pod with the largest ΔP value is selected as the preempted object; the selected Pod is evicted and its on-site execution information is stored in etcd to provide execution information for the Pod's subsequent re-creation.
9. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 6, characterized in that the process of calculating the priority factor of the Pod is as follows:
Pods awaiting scheduling at the same moment form a queue; a priority factor P is set according to the demands of the tasks in each Pod on different resources, the Pod queue to be scheduled is sorted in descending order of P, and after sorting the Scheduler schedules the Pods in that order; the priority factor P of a Pod to be scheduled is calculated as follows:
where M denotes the Pod's demand for memory resources, C the CPU demand, G the GPU demand, MIO the disk I/O demand, NIO the network I/O demand, NCC the communication-resource demand, TW the task waiting time, OC the number of times the Pod has been preempted and evicted by higher-priority Pods, and DL the remaining time before the task deadline, with the demand settings coming from the load prediction module; L denotes the priority gradient coefficient; λ_a, λ_b, λ_c, λ_d, λ_e, λ_f are the weight coefficients of the resource-demand indexes, which can be set according to task characteristics and demands and cannot be changed after the initial setting; ε, α, β are variable weight coefficients.
10. The intelligent scheduling method of container cloud resources for LVC simulation according to claim 6, characterized in that the process of setting the dynamic change rule of the priority factor P is as follows:
for Pods in the L1 and L2 gradients, the α and β weights increase linearly as TW and OC grow, and the β weight increases linearly as DL shrinks; because preemption is performed according to the difference ΔP of the priority factors P, the ε, α, and β weights of L3-level Pods grow exponentially to guarantee the smooth execution of low-priority L3 tasks, so an L3 Pod can break through its gradient limit and rise to the L2 level, whereas the priority factor of an L2-level Pod cannot break through its gradient limit;
the preemptive scheduling rules are set as follows:
when the resources of a worker Node are sufficient, no Pod preemption is needed and the Pod is scheduled for deployment directly; when Node resources are insufficient, preemption proceeds by maximum ΔP, and low-priority Pods are evicted and released until the resource requirements for deploying the high-priority Pod are met; Pods within the same gradient do not preempt one another, preventing the system jitter caused by frequent preemption; Pods with the same gradient and the same priority factor are scheduled on a first-come, first-served basis;
the process of calculating the availability of resource k on Node i is as follows:
suppose the Kubernetes cluster has N Nodes, denoted N = {1, 2, …, N}; each Node has k resources, denoted R_T = {1, 2, …, k}; each Pod may request m resources, denoted R_R = {1, 2, …, m}; and the set of Pods to be scheduled is denoted POD = {1, 2, …, q}; from the actual resource utilization of the six resource-evaluation indexes in the cluster (CPU, GPU, memory, disk I/O, network I/O, and network communication bandwidth), the priority weight value of the current Node is obtained;
the availability of resource k on Node i is calculated as:
A(i, k) = a(i, k) / T(k)
where a(i, k) denotes the amount of resource k available on Node i and T(k) denotes the total amount of resource k on Node i.
CN202311788359.5A 2023-12-22 2023-12-22 LVC simulation-oriented intelligent scheduling method for container cloud resources Pending CN117687760A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311788359.5A CN117687760A (en) 2023-12-22 2023-12-22 LVC simulation-oriented intelligent scheduling method for container cloud resources

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311788359.5A CN117687760A (en) 2023-12-22 2023-12-22 LVC simulation-oriented intelligent scheduling method for container cloud resources

Publications (1)

Publication Number Publication Date
CN117687760A true CN117687760A (en) 2024-03-12

Family

ID=90131801

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311788359.5A Pending CN117687760A (en) 2023-12-22 2023-12-22 LVC simulation-oriented intelligent scheduling method for container cloud resources

Country Status (1)

Country Link
CN (1) CN117687760A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118509339A (en) * 2024-07-17 2024-08-16 云南神经元信息技术有限公司 Platform operation and maintenance cluster central control management system


Similar Documents

Publication Publication Date Title
CN110737529B (en) Short-time multi-variable-size data job cluster scheduling adaptive configuration method
CN106776005B (en) Resource management system and method for containerized application
CN107404523A (en) Cloud platform adaptive resource dispatches system and method
CN111124689A (en) Dynamic allocation method for container resources in cluster
CN115037749A (en) Performance-aware intelligent multi-resource cooperative scheduling method and system for large-scale micro-service
CN109491761A (en) Cloud computing multiple target method for scheduling task based on EDA-GA hybrid algorithm
CN115543577B (en) Covariate-based Kubernetes resource scheduling optimization method, storage medium and device
Tong et al. DDQN-TS: A novel bi-objective intelligent scheduling algorithm in the cloud environment
CN105302858B (en) A kind of the cross-node enquiring and optimizing method and system of distributed data base system
CN113037800B (en) Job scheduling method and job scheduling device
CN113887748B (en) Online federal learning task allocation method and device, and federal learning method and system
CN113553160A (en) Task scheduling method and system for edge computing node of artificial intelligence Internet of things
Zhao et al. Adaptive swarm intelligent offloading based on digital twin-assisted prediction in VEC
CN111131447A (en) Load balancing method based on intermediate node task allocation
CN117687760A (en) LVC simulation-oriented intelligent scheduling method for container cloud resources
CN116996941A (en) Calculation force unloading method, device and system based on cooperation of cloud edge ends of distribution network
Manavi et al. Resource allocation in cloud computing using genetic algorithm and neural network
CN113608855B (en) Reinforced learning method for placing service function chains in edge calculation
CN113190342B (en) Method and system architecture for multi-application fine-grained offloading of cloud-edge collaborative networks
CN115113998A (en) Digital twin resource allocation method and architecture
CN114968563A (en) Micro-service resource allocation method based on combined neural network
CN109919219B (en) Xgboost multi-view portrait construction method based on kernel computing ML-kNN
Prado et al. On providing quality of service in grid computing through multi-objective swarm-based knowledge acquisition in fuzzy schedulers
Wang et al. On mapreduce scheduling in hadoop yarn on heterogeneous clusters
Yadav E-MOGWO Algorithm for Computation Offloading in Fog Computing.

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination