CN115033359A

CN115033359A - Internet of things agent multi-task scheduling method and system based on time delay control

Info

Publication number: CN115033359A
Application number: CN202210553893.7A
Authority: CN
Inventors: 张冀川; 王鹏; 郭屾; 林佳颖; 谭传玉; 白帅涛; 秦四军; 孙浩洋; 张明宇; 张治明; 姚志国; 吕琦; 张永芳
Original assignee: State Grid Corp of China SGCC; China Electric Power Research Institute Co Ltd CEPRI; Electric Power Research Institute of State Grid Zhejiang Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; China Electric Power Research Institute Co Ltd CEPRI; Electric Power Research Institute of State Grid Zhejiang Electric Power Co Ltd
Priority date: 2022-05-20
Filing date: 2022-05-20
Publication date: 2022-09-09

Abstract

The invention provides a delay-control-based multi-task scheduling method and system for Internet of Things (IoT) agents. The method comprises: acquiring information on a plurality of power services arriving at the resource pool of an IoT agent; inputting the information into a pre-constructed optimization problem model of delay-controlled multi-task deployment and resource allocation and solving it to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent; and scheduling the power services according to that deployment mode and resource allocation mode. The optimization problem model is constructed with the objective of minimizing the total delay of the plurality of power services. By solving this model, the invention achieves coordinated deployment of multiple power tasks and efficient, delay-oriented allocation of resources.

Description

A multi-task scheduling method and system for IoT agent based on delay control

Technical field

The invention belongs to the technical field of the power Internet of Things (IoT), and in particular relates to a delay-control-based multi-task scheduling method and system for IoT agents.

Background

As construction of the power Internet of Things advances, the scale of the power system keeps expanding, the number of intelligent terminals is growing rapidly, and power services are becoming increasingly diverse and real-time. With the increasingly widespread application of IoT technology, new sensor technology and artificial intelligence, power grid operation is shifting toward machine intelligence, perceptual intelligence and computational intelligence, which generates massive amounts of heterogeneous data. The traditional cloud computing architecture cannot efficiently meet all the service requirements of the smart grid. The edge computing model adds task execution, data caching and data analysis capabilities to network devices and migrates some or all of the computing tasks of the original cloud computing model to network edge devices. This reduces the computing load of the cloud computing center, relieves pressure on network bandwidth and improves data processing efficiency, making edge computing a new data processing solution for power services.

However, the power Internet of Things still has many problems at this stage: 1. Power terminal devices come in many types, their physical interfaces are complex and diverse, and the underlying connection protocols differ widely. This leads to long development cycles for service applications, difficult expansion and strong dependence on terminal manufacturers, and it also restricts interconnection and data sharing between different service systems. 2. The information collected by all service terminals of the power Internet of Things is sent to back-end service systems for processing. This centralized processing mode degrades the performance of some delay-sensitive applications and places heavy processing pressure on the backbone communication network and the back-end service systems, so the delay performance of real-time services cannot be guaranteed.

Invention CN112988285B, "Task offloading method and device, electronic device and storage medium", relates to the technical field of task offloading. The task offloading method is applied to an electronic device that is communicatively connected to a task offloading system; the task offloading system includes a second device and at least one first device. The method includes: first, acquiring a task to be processed from the at least one first device; second, inputting the task to be processed into a preset task offloading model to obtain a task offloading strategy; then, sending the task offloading strategy to the at least one first device, so that the at least one first device offloads the target task to the second device based on the strategy and the second device executes the target task. This method can improve the efficiency of task offloading.

Invention CN113553165A, "A game-theory-based mobile edge computing task offloading and resource scheduling method", discloses a mobile edge computing task offloading and resource scheduling method based on game theory. The method takes the joint minimization of mobile edge computing server energy consumption and user delay as its goal, models the user task offloading and resource scheduling problem as a specific optimization problem, constructs a multi-user task offloading computing system with different task offloading priorities, and establishes a migration model in which multiple users offload tasks to one edge base station. In the specific optimization problem, the transmission rate and cost coefficient solved by game theory serve as constraints, and the task offloading method is designed with minimizing server energy consumption as the final goal. The method can effectively balance the interests of users and the system and provides a guarantee for implementing task offloading in mobile edge computing systems.

Invention CN113590232A, "A digital-twin-based relay edge network task offloading method", discloses a method that includes: building a relay edge network task offloading strategy model; updating the state of each corresponding part in the digital twin environment; passing the parameters of the digital twin into a simulated task offloading system for iterative training to obtain an optimal task offloading strategy model; transmitting the optimal task offloading strategy model to a simulated manual control interface for backup; transmitting the current digital twin parameter training model and the optimal task offloading strategy model to the digital twin environment cache, from which the real-world edge server forwards them to each relay node and each relay node forwards them to the user terminals communicating with it; and having the user terminals and relay nodes offload tasks according to the optimal task offloading strategy model. That invention can reduce the trial-and-error cost of deploying real-world 5G edge computing technology and improve deployment efficiency.

Technical solution 1 first acquires a task to be processed from at least one first device; second, it inputs the task to be processed into a preset task offloading model to obtain a task offloading strategy; then, it sends the task offloading strategy to the at least one first device, so that the at least one first device offloads the target task to the second device based on the strategy and the second device executes the target task. Although this solution improves task offloading efficiency, it considers task offloading only in a single-task collaboration scenario.

Technical solution 2 takes the joint minimization of mobile edge computing server energy consumption and user delay as its goal and models the user task offloading and resource scheduling problem as a specific optimization problem, but it does not consider the optimization of data routing between multiple tasks.

Technical solution 3 transmits the current digital twin parameter training model and the optimal task offloading strategy model to the digital twin environment cache; the real-world edge server forwards them to each relay node, and each relay node forwards them to the user terminals communicating with it; the user terminals and relay nodes then offload tasks according to the optimal task offloading strategy model. Like solution 1, this solution considers only tasks in a single-task collaboration scenario and lacks general applicability to specific scenarios.

Summary of the invention

To overcome the above deficiencies of the prior art, the present invention proposes a delay-control-based multi-task scheduling method for IoT agents, including:

acquiring information on multiple power services arriving at the resource pool of an IoT agent;

inputting the information into a pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation and solving it to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent;

scheduling the power services according to the deployment mode of the power services and the resource allocation mode of the IoT agent;

wherein the optimization problem model of delay-controlled multi-task deployment and resource allocation is constructed with the goal of minimizing the total delay of the multiple power services.

Preferably, constructing the optimization problem model of delay-controlled multi-task deployment and resource allocation includes:

constructing an objective function with the goal of minimizing the total delay of the multiple power services under the deployment and resource allocation mode in the resource pools of the IoT agents;

constructing constraint conditions from the resource constraints of the IoT agents;

constructing the optimization problem model of delay-controlled multi-task deployment and resource allocation from the objective function and the constraint conditions;

wherein the resources include computing resources and bandwidth resources, and the information includes one or more of the following: the sequence of containers in the resource pool required by the power service request, the priority of the power service, the parameter of the Poisson process corresponding to the distribution of the time intervals at which data packets of the power service leave the resource pool, the processing speed of a unit of computing resource of the resource pool for the power service, and the transmission speed of a unit of bandwidth resource of the resource pool for the power service.

Preferably, the objective function is calculated as follows:

In the formula, F denotes the total delay of the multiple power services under the deployment and resource allocation mode in the resource pools of the IoT agents, K_S denotes the number of power services in the same resource pool in the same time period, ω_s denotes the priority of power service s, D_s denotes the total delay of power service s, and S denotes the set of all power services;
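The objective formula itself is not reproduced in this text, so the following is only a minimal sketch of one form consistent with the symbols defined above. It assumes F is the priority-weighted sum of the per-service delays D_s divided by K_S; that exact form, the `Service` class and the function name are illustrative assumptions, not the patent's stated formula.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Service:
    priority: float     # omega_s, priority of power service s
    total_delay: float  # D_s, total delay of power service s (seconds)


def weighted_total_delay(services: Sequence[Service]) -> float:
    """Assumed form of the objective: F = (1 / K_S) * sum(omega_s * D_s)."""
    k_s = len(services)  # K_S, number of services in the pool in this period
    if k_s == 0:
        return 0.0
    return sum(s.priority * s.total_delay for s in services) / k_s
```

Under this assumed form, a high-priority service contributes more to F per second of delay, so a solver that minimizes F tends to favor it when resources are scarce.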

The total delay D_s of power service s is calculated as follows:

In the formula, |N_s| denotes the number of containers in the resource pool that process power service s, D_{s,z} denotes the average delay of power service s at resource pool z, and the remaining term denotes the link delay of power service s after leaving resource pool z;

The average delay D_{s,z} of power service s at resource pool z is calculated as follows:

In the formula, c_{s,1} denotes the processing speed for power service s after resource allocation, c_{s,2} denotes the transmission speed for power service s after resource allocation, P_{s,z} denotes the probability that the processing queue of power service s at resource pool z is non-empty, and λ_s denotes the parameter of the Poisson process corresponding to the distribution of the time intervals at which data packets of power service s leave the resource pool;

The link delay of power service s after leaving resource pool z is calculated as follows:

In the formula, n_z denotes the size of the data to be transmitted at resource pool z.
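The delay formulas are missing from this text, so the sketch below only illustrates how the defined quantities could fit together. It assumes that the link delay after pool z is the data size n_z divided by the allocated transmission speed c_{s,2}, and that D_s is the sum, over the pools visited by service s, of the average pool delay D_{s,z} plus that link delay. Both formulas and all function names are assumptions made for illustration.

```python
from typing import Sequence


def link_delay(data_size: float, transmission_speed: float) -> float:
    """Assumed link delay: n_z / c_{s,2}."""
    if transmission_speed <= 0:
        raise ValueError("transmission speed must be positive")
    return data_size / transmission_speed


def service_total_delay(pool_delays: Sequence[float],
                        data_sizes: Sequence[float],
                        transmission_speed: float) -> float:
    """Assumed D_s: sum over visited pools of D_{s,z} plus the link delay."""
    if len(pool_delays) != len(data_sizes):
        raise ValueError("one data size is needed per visited pool")
    return sum(d + link_delay(n, transmission_speed)
               for d, n in zip(pool_delays, data_sizes))
```

For example, two pools with average delays of 0.2 s and 0.3 s, data sizes of 100 and 50 units, and a transmission speed of 1000 units per second give 0.2 + 0.1 + 0.3 + 0.05 seconds under these assumptions.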

Preferably, the resource constraint is calculated as follows:

In the formula, K_B denotes the total number of power services deployed on the resource pool, κ_{z,i} denotes the proportion of computing resources that resource pool z allocates to power service i, ν_{z,i} denotes the proportion of bandwidth resources that resource pool z allocates to power service i, and σ denotes the maximum amount of resources that a power service can obtain in the resource pool. The remaining symbols denote, respectively, the set of resource amounts available to a power service in the resource pool, the processing speed when the container of the power service in resource pool z uses all computing resources, and the transmission speed when the container of the power service in resource pool z uses all bandwidth resources.
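Because the constraint formula is not reproduced here, the following is a hedged sketch of a feasibility check built only from the symbol definitions above. It assumes that each share κ_{z,i} and ν_{z,i} must come from the discrete level set, must not exceed σ, and that the shares on one pool must sum to at most 1. It also assumes the allocated speed equals the share multiplied by the full-resource speed. These rules and the function names are illustrative assumptions.

```python
from typing import Sequence, Tuple

EPS = 1e-9  # tolerance for floating-point comparisons


def allocation_is_feasible(kappa: Sequence[float], nu: Sequence[float],
                           sigma: float, levels: Sequence[float]) -> bool:
    """Check assumed constraints for the services deployed on one pool."""
    if len(kappa) != len(nu):
        return False
    for share in list(kappa) + list(nu):
        if not any(abs(share - level) < EPS for level in levels):
            return False  # share is not one of the discrete levels
        if share > sigma + EPS:
            return False  # share exceeds the per-service cap sigma
    return sum(kappa) <= 1 + EPS and sum(nu) <= 1 + EPS


def effective_speeds(kappa_i: float, nu_i: float,
                     full_processing_speed: float,
                     full_transmission_speed: float) -> Tuple[float, float]:
    """Assumed scaling: allocated speed = share * full-resource speed."""
    return kappa_i * full_processing_speed, nu_i * full_transmission_speed
```

A usage example with ten levels from 0.1 to 1.0: `allocation_is_feasible([0.3, 0.5], [0.2, 0.4], 0.5, [k / 10 for k in range(1, 11)])` passes, while raising any single share above σ = 0.5 fails the check.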

Preferably, inputting the information into the pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation and solving it to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent includes:

inputting the information into the pre-built optimization problem model, and solving the model with an improved ant colony algorithm based on a dynamic pheromone evaporation coefficient to obtain the deployment mode of the multiple power tasks across the containers in the resource pool of the IoT agent;

based on that deployment mode, solving the optimization problem model with the proximal policy optimization (PPO) algorithm to obtain the mode in which the resource pool of the IoT agent allocates resources to each container.

Preferably, solving the optimization problem model with the improved ant colony algorithm based on the dynamic pheromone evaporation coefficient to obtain the deployment mode of the multiple power tasks across the containers in the resource pool of the IoT agent includes:

initializing the pheromone of the ant colony algorithm from the computing resources in the resource pool of the IoT agent;

calculating the pheromone trail evaporation of each resource pool based on the dynamic pheromone evaporation coefficient;

calculating, from the pheromone trail evaporation, the probability of selecting each resource pool for task deployment, and selecting the next resource pool in which to deploy a power task according to that probability;

iterating with the ant colony based on the pheromone trail evaporation and the selected next resource pool, and taking the most frequent plan among the paths planned by the ant colony as the deployment mode of the multiple power tasks across the containers in the resource pool of the IoT agent.
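The selection and voting steps above can be sketched as follows. This is a simplified illustration, not the patent's algorithm: it assumes the selection probability of a pool is proportional to its current pheromone value (roulette-wheel selection), holds the pheromone fixed across ants, and omits any heuristic term. The function names and the dictionary layout are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Sequence


def choose_pool(pheromone: Dict[str, float], rng: random.Random) -> str:
    """Pick a pool with probability proportional to its pheromone value."""
    pools = list(pheromone)
    weights = [pheromone[p] for p in pools]
    return rng.choices(pools, weights=weights, k=1)[0]


def most_frequent_plan(tasks: Sequence[str], pheromone: Dict[str, float],
                       n_ants: int, rng: random.Random) -> Dict[str, str]:
    """Let each ant build a full task-to-pool plan, then return the plan
    that was built most often across all ants."""
    plans = Counter()
    for _ in range(n_ants):
        plan = tuple(choose_pool(pheromone, rng) for _ in tasks)
        plans[plan] += 1
    best_plan = plans.most_common(1)[0][0]
    return dict(zip(tasks, best_plan))
```

Passing a seeded `random.Random` instance keeps runs reproducible, which helps when comparing different pheromone settings.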

Preferably, the pheromone of the ant colony algorithm is initialized as follows:

In the formula, τ_{0,j} denotes the initialized pheromone of resource pool j, r'_{m,j} denotes the available memory of resource pool j (the subscript m stands for memory), r_{m,j} denotes the total memory of resource pool j, ψ_m denotes the total memory of all resource pools, r'_{p,j} denotes the available CPU of resource pool j (the subscript p stands for CPU), r_{p,j} denotes the total CPU of resource pool j, and ψ_p denotes the total CPU of all resource pools.

Preferably, the pheromone trail evaporation is calculated as follows:

In the formula, τ(t,j) denotes the pheromone trail of resource pool j at time t after evaporation, τ_{0,j} denotes the initialized pheromone of resource pool j, Δ denotes the change in pheromone, P(k) denotes the resource pools that can be selected at the k-th iteration, and ρ denotes the dynamic pheromone evaporation coefficient;
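The update formula and the formula for ρ are not reproduced in this text. As a rough sketch, the code below uses the evaporation rule commonly found in ant colony optimization, τ ← (1 − ρ)·τ + Δ, together with a hypothetical linear schedule for ρ that evaporates strongly in early iterations and weakly in later ones. Both the rule as applied here and the schedule (including `rho_min` and `rho_max`) are assumptions, not the patent's formulas.

```python
from typing import Dict


def dynamic_rho(iteration: int, max_iterations: int,
                rho_min: float = 0.1, rho_max: float = 0.9) -> float:
    """Hypothetical schedule: rho falls linearly from rho_max to rho_min."""
    frac = iteration / max(1, max_iterations - 1)
    return rho_max - (rho_max - rho_min) * min(1.0, frac)


def update_pheromone(tau: Dict[str, float], deposits: Dict[str, float],
                     rho: float) -> Dict[str, float]:
    """Common ACO rule: evaporate every trail, then add the new deposit
    (pools without a deposit in this iteration only evaporate)."""
    return {j: (1.0 - rho) * tau_j + deposits.get(j, 0.0)
            for j, tau_j in tau.items()}
```

A large ρ early on weakens the influence of the initial trails and encourages exploration; a small ρ later preserves the trails of plans that have proven good.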

The dynamic pheromone evaporation coefficient ρ is calculated as follows:

Based on the same inventive concept, the present application also provides a delay-control-based multi-task scheduling system for IoT agents, including a data acquisition module, a solving module and a scheduling module;

the data acquisition module is configured to acquire information on multiple power services arriving at the resource pool of an IoT agent;

the solving module is configured to input the information into a pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation and solve it to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent;

the scheduling module is configured to schedule the power services according to the deployment mode of the power services and the resource allocation mode of the IoT agent;

Preferably, the solving module is further configured to construct the objective function with the goal of minimizing the total delay of the multiple power services under the deployment and resource allocation mode in the resource pools of the IoT agents;

Preferably, the objective function constructed by the solving module is calculated as follows:

The link delay of power service s after leaving resource pool z is calculated as follows:

Preferably, the resource constraint constructed by the solving module is calculated as follows:

In the formula, the resource-set symbol denotes the set of resource amounts available to a power service in the resource pool;

Preferably, the solving module is specifically configured to:

Preferably, the solving module solves the optimization problem model of delay-controlled multi-task deployment and resource allocation with the improved ant colony algorithm based on the dynamic pheromone evaporation coefficient to obtain the deployment mode of the multiple power tasks across the containers in the resource pool of the IoT agent, including:

Preferably, the solving module initializes the pheromone of the ant colony algorithm as follows:

Preferably, the solving module calculates the pheromone trail evaporation as follows:

The present invention also provides a computer device, comprising: one or more processors;

a memory for storing one or more programs;

when the one or more programs are executed by the one or more processors, the delay-control-based multi-task scheduling method for IoT agents described above is implemented.

The present invention also provides a computer-readable storage medium storing a computer program which, when executed, implements the delay-control-based multi-task scheduling method for IoT agents described above.

Compared with the closest prior art, the present invention has the following beneficial effects:

The present invention provides a delay-control-based multi-task scheduling method and system for IoT agents, including: acquiring information on multiple power services arriving at the resource pool of an IoT agent; inputting the information into a pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation and solving it to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent; and scheduling the power services according to that deployment mode and resource allocation mode. The optimization problem model is constructed with the goal of minimizing the total delay of the multiple power services. By solving this model, the invention achieves coordinated deployment of multiple power tasks and efficient, delay-oriented allocation of resources.

Description of drawings

FIG. 1 is a schematic flowchart of the delay-control-based multi-task scheduling method for IoT agents provided by the present invention;

FIG. 2 is a schematic diagram of the design flow of the multi-task deployment and resource allocation algorithm provided by the present invention;

FIG. 3 is a schematic structural diagram of the delay-control-based multi-task scheduling system for IoT agents provided by the present invention;

FIG. 4 is a schematic structural diagram of an example power IoT agent scheduling system provided by the present invention.

Detailed description

Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Example 1:

The flow of the delay-control-based multi-task scheduling method for IoT agents provided by the present invention is shown in FIG. 1 and includes:

Step 1: acquire information on multiple power services arriving at the resource pool of an IoT agent;

Step 2: input the information into the pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation and solve it to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent;

Step 3: schedule the power services according to the deployment mode of the power services and the resource allocation mode of the IoT agent;

wherein the optimization problem model of delay-controlled multi-task deployment and resource allocation is constructed with the goal of minimizing the total delay of the multiple power services.

The invention exploits the high level of abstraction of lightweight containers to achieve a high degree of hardware and software reuse. As a limited resource, an IoT agent can play multiple roles in a time-shared manner and implement different functions such as communication and computing. A virtual switch program used for communication, or a virtual network function, can be stored in a remote repository as a subtask; each edge IoT agent can then download and run the subtask to support edge power services. To describe the container resource allocation process conveniently, the invention first constructs an edge node resource allocation model and, based on it, establishes an optimization problem model of delay-controlled multi-task deployment and resource allocation. For this model, the invention proposes the MTDRA (Multi-task Deployment and Resource Allocation) algorithm, whose design flow is shown in FIG. 2. Specifically, it includes: constructing the optimization problem model of delay-controlled multi-task deployment and resource allocation; decomposing the solution of the optimization problem model into two sub-problems that are solved in sequence; designing an improved ant colony algorithm based on a dynamic pheromone evaporation coefficient to solve for the multi-task deployment result; and, given the task deployment result, designing a resource allocation algorithm based on PPO (Proximal Policy Optimization) to allocate the computing resources and communication resources (that is, bandwidth resources) of the IoT agents.
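The sequential decomposition described above can be pictured with a small skeleton in which the placement stage runs first and its result feeds the allocation stage. The two stage implementations below (round-robin placement and equal-share allocation) are deliberately trivial stand-ins for the ant colony and PPO solvers, and every name in the sketch is hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Placement = Dict[str, str]                   # task name -> resource pool name
Allocation = Dict[str, Tuple[float, float]]  # task -> (compute share, bandwidth share)


def solve_two_stage(tasks: Sequence[str],
                    place: Callable[[Sequence[str]], Placement],
                    allocate: Callable[[Placement], Allocation]
                    ) -> Tuple[Placement, Allocation]:
    """Stage 1 decides where tasks run; stage 2 sizes resources for that placement."""
    placement = place(tasks)
    allocation = allocate(placement)
    return placement, allocation


def round_robin_place(tasks: Sequence[str], pools: List[str]) -> Placement:
    """Stand-in for the ant colony stage: cycle through the pools."""
    return {task: pools[i % len(pools)] for i, task in enumerate(tasks)}


def equal_share_allocate(placement: Placement) -> Allocation:
    """Stand-in for the PPO stage: split each pool evenly among its tasks."""
    load = Counter(placement.values())
    return {task: (1.0 / load[pool], 1.0 / load[pool])
            for task, pool in placement.items()}
```

Keeping the stages as interchangeable callables reflects the point of the decomposition: the allocation stage only needs the finished placement, not the internals of how it was found.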

First, the models used in the present invention are described.

I. Power edge network model.

The invention divides the edge IoT agent resources into several public resource pools, using z to denote a resource pool and K_Z to denote the number of resource pools. Because the resource pools are connected through different data links, they have different uplink and downlink bandwidths; C_z and B_z denote the amounts of computing and bandwidth resources owned by resource pool z, respectively. In actual production, the combination of service requests changes across time periods. When a power service (hereinafter, service) request s arrives at the scheduling system, K_S denotes the number of service requests in the same resource pool in the same time period.

The invention assumes that requests arrive at the resource pool randomly in different time intervals, and uses Ω_s to denote the information on service s arriving at the resource pool, where Ω_s = (N_s, λ_s, C_{s,1}, C_{s,2}, ω_s). At this stage, the arrival process of the data packets of the service request flow is treated as a Poisson process with parameter λ_s. N_s denotes the sequence of containers required by the service request, |N_s| denotes the number of containers, ω_s denotes the priority of power service s, C_{s,1} denotes the speed at which a unit of computing resource in the resource pool processes data packets of service s, and C_{s,2} denotes the speed at which a unit of bandwidth resource in the resource pool transmits data of service s.
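The tuple Ω_s maps naturally onto a small record type, and the Poisson assumption can be exercised by sampling exponentially distributed gaps between arrivals, which is the standard way to simulate a Poisson process with rate λ_s. The class name, field names and sampling helper below are illustrative choices, not part of the patent.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ServiceInfo:
    containers: Tuple[str, ...]     # N_s, ordered containers the request needs
    rate: float                     # lambda_s, Poisson rate (packets per second)
    unit_processing_speed: float    # C_{s,1}, speed per unit of computing resource
    unit_transmission_speed: float  # C_{s,2}, speed per unit of bandwidth resource
    priority: float                 # omega_s


def arrival_times(info: ServiceInfo, horizon: float,
                  rng: random.Random) -> List[float]:
    """Sample packet arrival times in [0, horizon] for a Poisson process:
    gaps between consecutive arrivals are exponential with rate lambda_s."""
    t, times = 0.0, []
    while True:
        t += rng.expovariate(info.rate)
        if t > horizon:
            return times
        times.append(t)
```

Here `len(info.containers)` plays the role of |N_s|, and on average the simulation yields about `rate * horizon` arrivals.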

The invention uses one symbol to denote the processing speed obtained when the container of service s in resource pool z uses all of the pool's resources, and another symbol to denote the data packet transmission speed obtained when s uses the bandwidth resources of resource pool z. To find an approximately optimal solution for resource allocation in the resource pool, the computing and bandwidth resources of the resource pool are allocated through a discrete resource allocation strategy π(s,a) = {κ_z, ν_z}, where:

In the above formula, K_S is the number of service requests in the same resource pool during the same time period, K_B is the total number of power services deployed on the resource pool, and κ_z and ν_z describe how resource pool z allocates resources to the service request chain: κ_z,i is the proportion of computing resources that resource pool z allocates to power service i, and ν_z,i is the proportion of bandwidth resources that resource pool z allocates to power service i. Here

is a finite discrete set representing the amounts of resources that a power service can obtain in the resource pool. Specifically, the resource allocation is divided into 10 discrete resource blocks, so that

This greatly reduces the size of the action space when resource allocation actions are defined and makes the algorithm easier to converge. σ is the maximum amount of resources that service request s can obtain in the resource pool; this cap prevents any single request from occupying so many resources that it affects the operation of other services. The processing and transmission rates of the current service request follow from the resource allocation strategy.
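As an illustration of how a discrete allocation set shrinks the action space, the following Python sketch enumerates per-service shares under a cap σ. The exact set used by the invention is not reproduced in this text, so the evenly spaced levels 0, σ/10, ..., σ, the function names, and the rule that shares within one pool must not sum to more than 1 are all assumptions made for the example.

```python
from itertools import product


def allowed_levels(sigma, blocks=10):
    """Evenly spaced shares 0, sigma/blocks, ..., sigma (assumed spacing)."""
    return [round(sigma * k / blocks, 10) for k in range(blocks + 1)]


def feasible_allocations(num_services, sigma, blocks=10):
    """Yield share vectors (one share per service) that fit in one pool.

    A vector is kept only if its shares sum to at most 1, i.e. the pool
    is not over-committed (assumed feasibility rule).
    """
    levels = allowed_levels(sigma, blocks)
    for shares in product(levels, repeat=num_services):
        if sum(shares) <= 1.0 + 1e-9:  # tolerance for float rounding
            yield shares


if __name__ == "__main__":
    # With two services and sigma = 0.5, every combination is feasible.
    print(sum(1 for _ in feasible_allocations(2, 0.5)))
```

With 10 blocks, each service has only 11 possible shares per resource type, so the number of candidate actions stays small enough to enumerate or to encode as a discrete action head.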

In the above formula,

denotes the processing speed of the containers in resource pool z for service s after resource allocation, and

denotes the transmission speed of the containers in resource pool z for service s after resource allocation.

Within one service request, the packet processing speed and transmission speed in the same resource pool should satisfy c_s,1 = c_s,2 as closely as possible. Only then can a packet, once processed by the resource pool, be sent out over the data link without queuing. Resource allocation must therefore consider both maximizing the allocated resources and keeping the allocation balanced, so that resources are not wasted. Accordingly, the following condition must be satisfied when resource pool resources are allocated:

II-Delay model, that is, the optimization problem model for delay-controlled multi-task deployment and resource allocation.

To describe the end-to-end delay accurately in the multi-task collaboration scenario, the present invention first establishes a delay model based on a tandem queuing model, which improves the accuracy of network delay evaluation.

The present invention discusses separately how the packets of a service request pass through container 1 and through the subsequent containers. It first analyzes the departure process at the first container, focusing on the time interval between the departures of two adjacent packets of the same service. When the arrival rate of the packets of the service request flow is much smaller than the processing rate of the resource pool, the inter-departure times follow a Poisson process with parameter λ_s; when the arrival rate equals the processing rate of the resource pool, the inter-departure times approximate a deterministic process with parameter c_s,1. By analyzing the processing delay and transmission delay of the service request flow in container 1, together with the queuing backlog delay that may arise when the arrival rate and the processing rate differ, the present invention obtains a complete tandem queuing model of container packet processing and packet transmission. The average packet delay of service request flow s at the first container is as follows:

In the formula, c_s,1 is the processing speed for power service s after resource allocation, c_s,2 is the transmission speed for power service s after resource allocation, and P_s,1 is the probability that the processing queue of power service s in container 1 is non-empty.

Building on the above analysis, the present invention next considers the link transmission delay after data leave container 1. If container 1 and container 2 are in the same resource pool, the data link transmission delay need not be considered. If container 2 is in another resource pool, the transmission delay is determined mainly by the number of switches and links that the path traverses. Let K_n be the number of switches traversed, with the number of links also equal to K_n; the link transmission delay is then:

In the formula, n₁ is the size of the data to be transmitted at container 1.

Next, the present invention studies the sum of the average packet delays in the containers after container 1. Because the arrival process at container 2 is affected by the processing rate and transmission rate of container 1, it is difficult to analyze the interval Z_i between successive arrivals of the packets of service flow s at container 2. To remove the influence of the processing and transmission rates at container 1 on the arrival process at container 2, the present invention approximates the arrival of the data flow s of a service at container 2 as a Poisson process with arrival rate λ_s and builds the same M/D/1 queuing model. Based on this arrival process, the sum of the average packet delays at container 2 is given by:
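The invention's own delay expressions are not reproduced in this text. As a reference point only, the Python sketch below evaluates the textbook mean sojourn time of an M/D/1 queue (Pollaczek-Khinchine result: service time 1/μ plus mean waiting time ρ/(2μ(1−ρ)) with ρ = λ/μ) and chains a processing stage and a transmission stage, each fed by Poisson arrivals of rate λ_s as in the approximation described above. The function names and the simple addition of the two stages are assumptions of this example, not the patent's formula.

```python
def md1_mean_delay(arrival_rate, service_rate):
    """Mean time in system for an M/D/1 queue (waiting plus service)."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative / positive")
    rho = arrival_rate / service_rate  # server utilisation
    if rho >= 1.0:
        raise ValueError("queue is unstable when arrival_rate >= service_rate")
    waiting = rho / (2.0 * service_rate * (1.0 - rho))
    return 1.0 / service_rate + waiting


def two_stage_delay(arrival_rate, processing_rate, transmission_rate):
    """Processing stage followed by transmission stage.

    Both stages are treated as M/D/1 queues with the same Poisson
    arrival rate (assumed, mirroring the approximation in the text).
    """
    return (md1_mean_delay(arrival_rate, processing_rate)
            + md1_mean_delay(arrival_rate, transmission_rate))


if __name__ == "__main__":
    print(md1_mean_delay(1.0, 2.0))        # lambda = 1, c_s,1 = 2
    print(two_stage_delay(1.0, 2.0, 4.0))  # add a transmission stage
```

The waiting term grows without bound as ρ approaches 1, which is why an allocation that leaves either the processing rate or the transmission rate close to the arrival rate dominates the end-to-end delay.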

In the formula, c_s,1 is the processing speed for power service s after resource allocation and c_s,2 is the transmission speed for power service s after resource allocation. When c_s,1 < c_s,2, packets are processed slowly at container 2, which causes a queuing backlog delay there. When c_s,1 ≥ c_s,2, packets pass through container 2 at processing rate c_s,1 and leave the edge computing node at transmission rate c_s,2. The total delay can therefore be obtained from the resource pools and the data links traversed by the service request data flow:

In the above formula, D_s,z is the delay determined by the processing rate and sending rate of resource pool z (that is, the average delay of power service s at resource pool z),

is the data link transmission delay after resource pool z. D_s,z and

are calculated as follows:

In summary, the purpose of the present invention is to optimize the end-to-end delay of the packets of the container cluster's service request flows. The optimization problem model established by the present invention is as follows:

III-Delay-controlled multi-task deployment and dynamic CPU computing resource allocation algorithms.

For the complex environment and diverse services of the power Internet of Things, the present invention first proposes a delay-controlled multi-task deployment algorithm based on an improved ant colony algorithm.

A service consists of one or more tasks, and a user starts a service by submitting tasks to the scheduling module. The present invention designs an Improved Ant Colony Optimization (IACO) algorithm, with which the scheduling module selects a group of nodes that satisfy the specified constraints and deploys the tasks to those nodes (that is, containers in the resource pool).

The goal of the scheduling module is to place tasks on available resources, which are consumed at each scheduling round. An artificial ant searches for resources at random, guided by the pheromone trail of each resource. The computing resource of a node is calculated as:

In the above formula, R(j) is the resource of node j, r'_m,j is the available memory of node j, r_m,j is the total memory of the node, r'_p,j is the available CPU of the node, r_p,j is the total CPU of the node, ψ_m is the memory size of all nodes, and ψ_p is the CPU size of all nodes,

where n is the number of tasks awaiting memory allocation and ψ_i is the memory allocation ratio.

To initialize the pheromone trail of each node, R(j) is used in a round-robin greedy algorithm, which simply places each task on one node in round-robin order; τ_0,j = R(j) is the initial pheromone of each node. The probability of selecting the current node for task deployment is calculated as:

In the above formula, η_e is the heuristic value of node e, α is the information heuristic factor, β is the expectation heuristic factor, and m is the total number of nodes.

The deployment probability p(t,j) of each node is calculated, and the next node j is selected.

where j ∈ P(k), q₀ is the preset exploration rate threshold, and q is the exploration rate of the ant colony algorithm. P(k) is the set of resource nodes available for task deployment at the k-th iteration, and N is the set of all nodes.
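The selection formulas themselves are missing from this text. The Python sketch below therefore assumes the selection rule commonly used in ant colony system variants: each candidate node is weighted by τ^α · η^β, and with a random draw q the ant takes the best node when q ≤ q₀ and otherwise samples a node in proportion to the weights. The function names, default parameter values and the direction of the q ≤ q₀ test are assumptions of this example.

```python
import random


def selection_probabilities(tau, eta, alpha=1.0, beta=2.0):
    """Normalised weights tau_j^alpha * eta_j^beta over candidate nodes."""
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau, eta)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one node needs a positive weight")
    return [w / total for w in weights]


def select_node(tau, eta, q0=0.9, alpha=1.0, beta=2.0, rng=random):
    """Pick the index of the next node for a task.

    q <= q0: exploit, take the node with the highest probability.
    q >  q0: explore, sample a node according to the probabilities.
    """
    probs = selection_probabilities(tau, eta, alpha, beta)
    if rng.random() <= q0:
        return max(range(len(probs)), key=probs.__getitem__)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    pheromone = [0.6, 0.3, 0.8]   # tau, e.g. initialised from R(j)
    heuristic = [1.0, 2.0, 0.5]   # eta, hypothetical heuristic values
    print(selection_probabilities(pheromone, heuristic))
    print(select_node(pheromone, heuristic))
```

A larger q₀ makes the colony greedier, while a smaller q₀ spreads tasks over more nodes early in the search.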

The next step is to calculate pheromone evaporation. The pheromone trail evaporation of each node is calculated as:

In the formula, τ(t,j) is the evaporated pheromone trail of resource pool j at time t, τ_0,j is the initial pheromone of resource pool j, Δ is the change in pheromone, and ρ is the dynamic pheromone volatility coefficient. Whenever the scheduler places a task on a particular resource, Δ is less than 0. Computing the pheromone in this way markedly lowers the pheromone level of the selected node, making it less attractive for the next task, so that the remaining tasks are spread across all resources.

The pheromone volatility coefficient strongly influences both the convergence and the search performance of the ant colony algorithm. In the traditional ant colony algorithm, ρ is usually a fixed value between 0 and 1. The pheromone concentration on a path is small at the start of the iterations and keeps growing as the number of iterations increases, even though the pheromone on the path evaporates every time. A fixed volatility coefficient therefore evaporates too much pheromone in the early stage and too little in the later stage, which severely limits the performance of the ant colony optimization algorithm. To solve this problem, the present invention introduces the Sigmoid function, which is commonly used as a neural network activation function. The function is monotonically increasing and maps ρ into the interval (0,1), so that evaporation of the path pheromone concentration is weakened in the early stage and strengthened in the later stage. ρ is then given by:

In the above formula, k is the iteration number of the algorithm.
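The exact expression for ρ is not reproduced in this text. A minimal Python sketch of a Sigmoid-shaped volatility coefficient is given below, assuming the form ρ(k) = 1 / (1 + e^(−(k − k_mid)/scale)); the centring iteration `midpoint` and the slope parameter `scale` are hypothetical tuning knobs introduced only for this example.

```python
import math


def dynamic_rho(k, midpoint=50.0, scale=10.0):
    """Sigmoid volatility coefficient for iteration k.

    Small in early iterations (little evaporation), approaching 1 in
    late iterations (strong evaporation). midpoint and scale are
    assumed parameters, not taken from the source.
    """
    return 1.0 / (1.0 + math.exp(-(k - midpoint) / scale))


if __name__ == "__main__":
    for k in (0, 25, 50, 75, 100):
        print(k, round(dynamic_rho(k), 4))
```

Because the Sigmoid is monotonically increasing and bounded by 0 and 1, any such parameterisation preserves the behaviour described above: weak evaporation at the start and stronger evaporation as the iterations proceed.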

The best plan is the most frequent plan, selected by the following formula.

In the formula, x is the value at which p(k) attains its maximum, and this value is assigned to p_w.

By improving the traditional ant colony algorithm, the present invention overcomes the poor performance caused by a fixed pheromone volatility coefficient and completes the coordinated deployment of multiple tasks.
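Selecting the most frequent plan can be sketched in a few lines of Python. The representation of a plan as a sequence of node indices (one per task) and the function name are assumptions of this example.

```python
from collections import Counter


def most_frequent_plan(plans):
    """Return the deployment plan that the ants produced most often.

    Each plan is a sequence of node indices, one per task. Ties are
    resolved in favour of the plan that was seen first.
    """
    counts = Counter(tuple(plan) for plan in plans)
    best_plan, _ = counts.most_common(1)[0]
    return best_plan


if __name__ == "__main__":
    colony_plans = [[0, 1, 1], [1, 0, 1], [0, 1, 1], [2, 1, 0]]
    print(most_frequent_plan(colony_plans))
```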

Second, after the multi-task services have been deployed on the containers, the present invention designs a delay-control-oriented resource allocation algorithm to guide the dynamic allocation of CPU computing resources and bandwidth resources in the IoT agent, thereby meeting the delay requirements of the services.

Because the environment of an edge computing network is complex and changeable, learning in it calls for a reliable and highly scalable learning algorithm. The PPO algorithm ensures stability by confining parameter updates to a trust region, so the present invention uses it to allocate resources. To solve the problem obtained in the previous section with a deep reinforcement learning algorithm, the present invention first converts it into a Markov decision process (MDP) with four elements: agent, input, action and reward.

Agent: in the present invention, the agent is the edge IoT agent.

Input: after the environment information changes, the system state is expressed as x_t = [N_t,z, A_t] and is input into the model, where N_t,z is the container flow group of resource pool z

In the formula, taking

as an example,

is the resource allocation of resource pool z for task 1 at time t, and

is the resource allocation strategy of all resource pools.

Action: at time t, resource pool z can use only one resource allocation strategy a_t,z = [κ_t,z, ν_t,z], where

Reward function: when the agent selects an action, the system state changes and the agent receives the reward of the strategy. In the present invention, the reward is the difference between the end-to-end packet delays of the container service request flows in the current state and the next state, as shown in the following formula:

In the above formula, S and S′ are the power services handled by the container groups of the resource pools in the current state and the next state, respectively. A positive reward is obtained when the next state has a lower end-to-end delay, and a negative reward otherwise.

The PPO algorithm is a deep reinforcement learning algorithm built on the actor-critic framework. The PPO architecture designed in the present invention contains two actor networks, Actor1 and Actor2. Actor1 represents the latest policy π and interacts with the edge network environment, selecting a task deployment action based on the current environment state. The critic evaluates the current policy from the reward obtained after the deployment action is executed, and the parameters of the critic network are updated by back-propagating the loss function. Actor2 represents the old policy π_old. After every training period, the agent updates Actor2 with the parameters of Actor1. This process repeats until the PPO algorithm converges, which yields a trained edge multi-task deployment model based on the AC framework.

The PPO algorithm must limit the similarity value r_t(θ) between the new and old policies.

$$L^{CLIP}(\theta) = \mathbb{E}_t\left[\min\left(r_t(\theta)\,B_t,\ \operatorname{clip}\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)B_t\right)\right]$$

In the above formula, ε ∈ [0,1] is a hyperparameter, and clip() constrains the value of r_t(θ) to the interval [1-ε, 1+ε].

r_t(θ) denotes the similarity between the new and old policies. In its defining formula, π_θ(a|s) is the probability that policy π takes action a in the state corresponding to power service s; r_t(θ) is substituted into the formula for L^CLIP(θ) during computation. π_θ(a|s) has no closed-form expression; its value is output by the neural network. B_t is the advantage function at time t, and E_t is the expectation at time t.
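The clipped objective can be illustrated with a small, framework-free Python sketch. The definition of r_t(θ) is not reproduced in this text, so the example assumes the standard PPO probability ratio π_θ(a|s) / π_θ_old(a|s), computed from log-probabilities; the function name and the batch-mean estimate of the expectation are also assumptions.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, eps=0.2):
    """Batch mean of min(r * B, clip(r, 1 - eps, 1 + eps) * B).

    logp_new / logp_old: log-probabilities of the taken actions under
    the current and the old policy (assumed inputs).
    advantages: the advantage estimates B_t for the same samples.
    """
    terms = []
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)                  # r_t(theta)
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)    # clip(r_t, ...)
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Ratio 1.5 with a positive advantage is capped at 1 + eps.
    print(clipped_surrogate([math.log(1.5)], [0.0], [1.0]))
    # With a negative advantage the unclipped (more pessimistic) term wins.
    print(clipped_surrogate([math.log(1.5)], [0.0], [-1.0]))
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the ratio outside [1-ε, 1+ε] in the direction that would increase the objective, which is what keeps each policy update small.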

The schedulable resources in the edge network comprise the available computing resources and available bandwidth resources of the edge IoT agent. Through the reward function, the algorithm gives an immediate reward to the agent that takes an action, and it keeps learning from the rewards obtained until it arrives at the optimal resource allocation strategy.

The actor network of the PPO algorithm consists of two neural networks, Actor1 and Actor2. Actor1 guides the agent's interaction with the environment, obtains transition samples and caches them. The policy parameters in Actor2 represent the old policy; after every several iterations, the parameters of Actor2 are updated with those of Actor1. The critic network consists of one neural network. The task deployment model is trained as follows:

a) Input the current state into the Actor1 network; the agent selects an action based on the policy π_old, that is, a_l = π(s_l). Repeating this process, the agent interacts with the edge network for T time steps and collects and caches the interaction history.

b) Compute the advantage function at each time step with the following formula, where γ is the discount factor, V is the state value function, φ denotes the critic network parameters, and l is the time step.

$$B_t = \sum_{l>t} \gamma^{\,l-t}\, rw_l - V_\phi(s_t)$$

c) Compute the loss function of the critic network with the following formula and update the critic network parameters φ by back-propagating it.

d) Update the parameters of the actor network using L^CLIP(θ) and the advantage function.

e) Repeat step d; after a certain number of steps, update the parameters of Actor2 with the network parameters of Actor1.

f) Repeat steps a-e.
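Step b of the training procedure can be sketched as follows. The Python function computes B_t = Σ_{l>t} γ^(l−t) · rw_l − V_φ(s_t) for every step of one collected trajectory, following the sum over l > t exactly as written in the text (many PPO implementations also include the reward at step t itself). The function name and the use of plain lists for the rewards and the critic's value estimates are assumptions of this example.

```python
def advantages(rewards, values, gamma=0.99):
    """Advantage B_t for each time step of one trajectory.

    rewards[l] is rw_l and values[t] is the critic estimate V_phi(s_t).
    The discounted sum runs over l > t, so the last step has no future
    reward and its advantage is simply -values[-1].
    """
    horizon = len(rewards)
    result = [0.0] * horizon
    future = 0.0  # running value of sum_{l > t} gamma^(l - t) * rw_l
    for t in range(horizon - 1, -1, -1):
        result[t] = future - values[t]
        future = gamma * (rewards[t] + future)
    return result


if __name__ == "__main__":
    # Rewards are delay differences between consecutive states.
    print(advantages([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], gamma=0.5))
```

Iterating backwards over the trajectory keeps the computation linear in T instead of recomputing the discounted sum for every time step.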

This yields the PPO algorithm based on the actor-critic framework. The agent then outputs the next action from the actor network according to the given task queue, the critic network gives the corresponding evaluation, and the process iterates until the resources for delay-sensitive tasks have been allocated.

The present invention adopts a PPO-based resource allocation algorithm to allocate computing and communication resources efficiently for delay control.

Embodiment 2:

Based on the same inventive concept, the present invention also provides a delay-control-based multi-task scheduling system for IoT agents. The structure of the system is shown in FIG. 3 and comprises a data collection module, a solving module and a scheduling module.

The data collection module is configured to acquire information on multiple power services arriving at the resource pool of the IoT agent.

The solving module is configured to input the information into the pre-built optimization problem model for delay-controlled multi-task deployment and resource allocation and to solve it, obtaining the deployment mode of the power services and the resource allocation mode of the IoT agent.

The scheduling module is configured to schedule the power services according to the deployment mode of the power services and the resource allocation mode of the IoT agent.

The solving module is further configured to construct an objective function whose goal is to minimize the total delay of the multiple power services under their deployment and resource allocation in the resource pools of the IoT agents, and

to build the optimization problem model for delay-controlled multi-task deployment and resource allocation from the objective function and the constraints.

The resources include computing resources and bandwidth resources. The information includes one or more of the following: the sequence of containers in the resource pool required by the power service request, the priority of the power service, the parameter of the Poisson process corresponding to the distribution of the time intervals at which packets of the power service leave the resource pool, the processing speed of a unit of computing resource of the resource pool for the power service, and the transmission speed of a unit of bandwidth resource of the resource pool for the power service.

The objective function constructed by the solving module is calculated as follows:

The average delay D_s,z of power service s at resource pool z is calculated as follows:

The link delay of power service s after leaving resource pool z,

is calculated as follows:

The resource constraints constructed by the solving module are calculated as follows:

denotes the set of resource amounts that a power service can obtain in the resource pool.

The solving module is specifically configured to:

input the information into the pre-built optimization problem model for delay-controlled multi-task deployment and resource allocation, and solve the model with the improved ant colony algorithm based on the dynamic pheromone volatility coefficient to obtain how the multiple power tasks are deployed to the containers in the resource pool of the IoT agent; and

based on that deployment, solve the model with the proximal policy optimization algorithm to obtain how the resource pool of the IoT agent allocates resources to the containers.

Solving the model with the improved ant colony algorithm based on the dynamic pheromone volatility coefficient, to obtain how the multiple power tasks are deployed to the containers in the resource pool of the IoT agent, comprises:

calculating, based on the pheromone trail evaporation, the probability of selecting each resource pool for task deployment, and selecting the next resource pool for deploying a power task based on that probability; and

performing iterative calculation with the ant colony based on the pheromone trail evaporation and the selected next resource pool, and taking the most frequent plan in the ant colony's path planning as the deployment mode of the multiple power tasks to the containers in the resource pool of the IoT agent.

The solving module initializes the pheromone of the ant colony algorithm as follows:

The solving module calculates the pheromone trail evaporation as follows:

The dynamic pheromone volatility coefficient ρ is calculated as follows:

Embodiment 3:

The power IoT agent scheduling system constructed by the present invention is shown in FIG. 4. The service terminals include power service equipment such as smart meters, photovoltaic panels and charging piles. The power communication network includes industrial Ethernet, EPON (Ethernet Passive Optical Network), 2G/3G/4G, wireless local area networks, wireless wide area networks and the like. The IoT agent layer consists of a software layer and a hardware layer; it connects the service terminals to the platform layer and provides functions such as terminal data collection, message adaptation, edge computing and transmission. Built on lightweight container technology, the IoT agent uses a built-in resource scheduling engine to schedule edge-side virtualized computing and network resources flexibly, achieving unified access, multi-dimensional sensing, unified modeling and resource orchestration. The IoT management and control platform manages the IoT agent through RESTful/OpenFlow/SNMP.

Embodiment 4:

Based on the same inventive concept, the present invention also provides a computer device comprising a processor and a memory. The memory stores a computer program that includes program instructions, and the processor executes the program instructions stored in the computer storage medium. The processor may be a central processing unit (CPU) or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. It is the computing core and control core of the terminal and is adapted to implement one or more instructions, in particular to load and execute one or more instructions in the computer storage medium so as to carry out the corresponding method flow or function, thereby implementing the steps of the delay-control-based multi-task scheduling method for IoT agents in the above embodiments.

Embodiment 5:

Based on the same inventive concept, the present invention also provides a storage medium, specifically a computer-readable storage medium (memory), which is a memory device in a computer device used to store programs and data. The computer-readable storage medium here may include a built-in storage medium of the computer device and may also include an extended storage medium supported by the computer device. The computer-readable storage medium provides storage space that holds the operating system of the terminal. The storage space also holds one or more instructions adapted to be loaded and executed by the processor; these instructions may be one or more computer programs (including program code). The computer-readable storage medium may be a high-speed RAM or a non-volatile memory, such as at least one disk memory. The processor can load and execute the one or more instructions stored in the computer-readable storage medium to implement the steps of the delay-control-based multi-task scheduling method for IoT agents in the above embodiments.

Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit its scope of protection. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that, after reading the present disclosure, they may still make various changes, modifications, or equivalent replacements to the specific embodiments of the application, and that all such changes, modifications, or equivalent replacements fall within the scope of protection of the pending claims.

Claims

1. A multi-task scheduling method for IoT agents based on delay control, characterized by comprising:

acquiring information on multiple power services arriving at the resource pool of the IoT agent;

inputting the information into a pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation and solving the model, to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent;

scheduling the power services according to the deployment mode of the power services and the resource allocation mode of the IoT agent;

wherein the optimization problem model of delay-controlled multi-task deployment and resource allocation is constructed with the goal of minimizing the total delay of the multiple power services.

2. The method according to claim 1, wherein constructing the optimization problem model of delay-controlled multi-task deployment and resource allocation comprises:

constructing an objective function with the goal of minimizing the total delay of the deployment of multiple power services in the resource pool of each IoT agent and the resource allocation mode;

constructing constraints based on the resource limits of the IoT agents;

building the optimization problem model of delay-controlled multi-task deployment and resource allocation from the objective function and the constraints;

wherein the resources include computing resources and bandwidth resources, and the information includes one or more of the following: the sequence of containers in the resource pool requested by the power service, the priority of the power service, the parameter of the Poisson process corresponding to the distribution of time intervals between data packets of the power service leaving the resource pool, the processing speed of a unit of computing resource of the resource pool for the power service, and the transmission speed of a unit of bandwidth resource of the resource pool for the power service.

3. The method according to claim 2, wherein the objective function is calculated as follows:

In the formula, F represents the total delay of the deployment and resource allocation of multiple power services in the resource pool of each IoT agent, K_S represents the number of power services in the same resource pool in the same time period, ω_s represents the priority of the power service s, D_s represents the total delay of the power service s, and S represents the set of all power services;

The total delay D_s of the power service s is calculated as follows:

In the formula, |N_s| represents the number of containers in the resource pool that process the power service s, and D_{s,z} represents the average delay of the power service s at the resource pool z;

The average delay D_{s,z} of the power service s at the resource pool z is calculated as follows:

In the formula, c_{s,1} represents the processing speed of the power service s after resource allocation, c_{s,2} represents the transmission speed of the power service s after resource allocation, P_{s,z} represents the probability that the processing queue of the power service s at the resource pool z is non-empty, and λ_s represents the parameter of the Poisson process corresponding to the distribution of time intervals between data packets of the power service s leaving the resource pool;

The link delay of the power service s after leaving the resource pool z is calculated as follows:

In the formula, n_z represents the size of the data to be transmitted in the resource pool z.
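The formula images of claim 3 are not reproduced in this text, so the following Python sketch is only an illustration under a stated assumption: that the objective F is the priority-weighted sum of the per-service delays D_s, divided by the number of services K_S. The class `PowerService` and the function `weighted_total_delay` are hypothetical names, and the sample priorities and delays are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PowerService:
    """One power service s (hypothetical container for the symbols of claim 3)."""
    name: str
    priority: float      # omega_s, priority of the service
    total_delay: float   # D_s, total delay of the service in seconds


def weighted_total_delay(services):
    """Assumed form of F: sum of omega_s * D_s over all services, divided by K_S.

    This is an assumption; the exact formula is not reproduced in the source text.
    """
    if not services:
        raise ValueError("at least one power service is required")
    k_s = len(services)  # K_S, number of services in the pool in this period
    return sum(s.priority * s.total_delay for s in services) / k_s


# Illustrative values only.
services = [
    PowerService("protection", priority=3.0, total_delay=0.010),
    PowerService("metering", priority=1.0, total_delay=0.050),
]
print(weighted_total_delay(services))
```

Under this assumed form, a high-priority service contributes more to F per second of delay, so a solver minimizing F would favor allocations that shorten its delay first.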

4. The method according to claim 2, wherein the resource constraint is calculated as follows:

In the formula, K_B represents the total number of power services deployed on the resource pool, κ_{z,i} represents the proportion of computing resources allocated by the resource pool z to the power service i, ν_{z,i} represents the bandwidth resources allocated by the resource pool z to the power service i, and σ represents the maximum amount of resources available to the power services in the resource pool.
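The constraint formula itself is not reproduced in this text. As a minimal sketch, assuming the constraint requires that the computing shares κ_{z,i} and the bandwidth shares ν_{z,i} of the K_B services deployed on one resource pool each sum to at most σ, a feasibility check could look like this. The function name `satisfies_resource_constraint` is hypothetical.

```python
def satisfies_resource_constraint(compute_shares, bandwidth_shares, sigma):
    """Check an assumed per-pool constraint for one resource pool z.

    compute_shares[i]   : kappa_{z,i}, computing share given to service i
    bandwidth_shares[i] : nu_{z,i}, bandwidth share given to service i
    sigma               : maximum amount of resources available to services

    Assumption (not confirmed by the source): both sums must not exceed sigma
    and no individual share may be negative.
    """
    if len(compute_shares) != len(bandwidth_shares):
        raise ValueError("each service needs one computing and one bandwidth share")
    if any(x < 0 for x in compute_shares) or any(x < 0 for x in bandwidth_shares):
        return False
    return sum(compute_shares) <= sigma and sum(bandwidth_shares) <= sigma


# Two services on one pool, with 80% of the pool usable by power services.
print(satisfies_resource_constraint([0.3, 0.4], [0.5, 0.2], sigma=0.8))
print(satisfies_resource_constraint([0.5, 0.4], [0.1, 0.1], sigma=0.8))
```

A check of this kind can be used to reject candidate allocations before their delay is evaluated.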

5. The method according to claim 2, wherein inputting the information into the pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation and solving the model, to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent, comprises:

inputting the information into the pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation, and solving the model with an improved ant colony algorithm based on a dynamic pheromone volatilization coefficient, to obtain the deployment mode of the multiple power tasks across the containers in the resource pool of the IoT agent;

based on the deployment mode of the multiple power tasks in the resource pool of the IoT agent, solving the optimization problem model of delay-controlled multi-task deployment and resource allocation with the proximal policy optimization (PPO) algorithm, to obtain the resource allocation mode of the resource pool of the IoT agent for each container.

6. The method according to claim 5, wherein solving the optimization problem model of delay-controlled multi-task deployment and resource allocation with the improved ant colony algorithm based on a dynamic pheromone volatilization coefficient, to obtain the deployment mode of the multiple power tasks across the containers in the resource pool of the IoT agent, comprises:

initializing the pheromone of the ant colony algorithm using the computing resources in the resource pool of the IoT agent;

calculating the disappearance of the pheromone trajectory of each resource pool based on the dynamic pheromone volatilization coefficient;

calculating, based on the disappearance of the pheromone trajectory, the probability of selecting each resource pool for task deployment, and selecting the next resource pool on which to deploy a power task according to that probability;

performing iterative calculation with the ant colony based on the disappearance of the pheromone trajectory and the resource pool selected for the next power task, and taking the plan that occurs most frequently in the ant colony's path planning as the deployment mode of the multiple power tasks across the containers in the resource pool of the IoT agent.
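The four steps of claim 6 can be illustrated with a generic ant colony loop. This is a sketch, not the claimed algorithm: the initialization and volatilization formulas of claims 7 and 8 are not reproduced in this text, so the equal initial pheromone values, the linearly decreasing coefficient `rho`, the deposit rule, and the per-pool delay costs below are all placeholder assumptions, and the names `selection_probabilities` and `deploy_tasks` are hypothetical. Resource capacity limits are ignored for brevity.

```python
import random
from collections import Counter


def selection_probabilities(pheromone):
    """Probability of choosing each pool, proportional to its pheromone."""
    total = sum(pheromone.values())
    return {pool: tau / total for pool, tau in pheromone.items()}


def deploy_tasks(tasks, initial_pheromone, plan_cost, n_ants=20, n_iters=30, seed=0):
    """Return the deployment plan (one pool per task) built most often by the ants."""
    rng = random.Random(seed)
    pheromone = dict(initial_pheromone)
    plan_counts = Counter()
    for it in range(n_iters):
        # Assumed dynamic coefficient: strong evaporation early, weaker later.
        rho = 0.1 + 0.5 * (1 - it / n_iters)
        probs = selection_probabilities(pheromone)
        pools = list(probs)
        weights = [probs[p] for p in pools]
        # Each ant picks a pool for every task by roulette-wheel selection.
        plans = [tuple(rng.choices(pools, weights=weights, k=len(tasks)))
                 for _ in range(n_ants)]
        plan_counts.update(plans)
        best = min(plans, key=plan_cost)
        # Evaporate every trail, then reinforce the pools used by the best plan.
        for pool in pheromone:
            pheromone[pool] = max(pheromone[pool] * (1 - rho), 1e-6)
        for pool in best:
            pheromone[pool] += 1.0 / (1.0 + plan_cost(best))
    return plan_counts.most_common(1)[0][0]


# Hypothetical per-pool delays in seconds; equal initial pheromone as a stand-in
# for the resource-based initialization of claim 7.
pool_delay = {"z1": 0.02, "z2": 0.05, "z3": 0.03}
cost = lambda plan: sum(pool_delay[p] for p in plan)
plan = deploy_tasks(["s1", "s2", "s3"], {"z1": 1.0, "z2": 1.0, "z3": 1.0}, cost)
print(plan)
```

Returning the most frequently constructed plan, rather than the single cheapest one seen, mirrors the last step of the claim.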

7. The method according to claim 6, wherein the pheromone of the ant colony algorithm is initialized by the following formula:

In the formula, τ_{0,j} represents the initialized pheromone corresponding to resource pool j; the subscript m denotes memory and the subscript p denotes CPU; r′_{m,j} represents the available memory of resource pool j, r_{m,j} represents the total memory of resource pool j, and ψ_m represents the total memory of all resource pools; r′_{p,j} represents the available CPU of resource pool j, r_{p,j} represents the total CPU of resource pool j, and ψ_p represents the total CPU of all resource pools.

8. The method according to claim 6, wherein the disappearance of the pheromone trajectory is calculated as follows:

In the formula, τ(t,j) represents the disappearance of the pheromone trajectory corresponding to resource pool j at time t, τ_{0,j} represents the initialized pheromone corresponding to resource pool j, Δ represents the change in pheromone, P(k) represents the resource pools that can be selected in the k-th iteration, and ρ represents the dynamic pheromone volatilization coefficient;

The dynamic pheromone volatilization coefficient ρ is calculated as follows:

9. An IoT agent multi-task scheduling system based on delay control, characterized by comprising a data acquisition module, a solving module, and a scheduling module;

the data acquisition module is used to acquire information on multiple power services arriving at the resource pool of the IoT agent;

the solving module is used to input the information into a pre-built optimization problem model of delay-controlled multi-task deployment and resource allocation and solve the model, to obtain the deployment mode of the power services and the resource allocation mode of the IoT agent;

the scheduling module is used to schedule the power services according to the deployment mode of the power services and the resource allocation mode of the IoT agent;

10. The system according to claim 9, wherein the solving module is further configured to: construct the objective function with the goal of minimizing the total delay of the deployment of multiple power services in the resource pool of each IoT agent and the resource allocation mode;

construct constraints based on the resource limits of the IoT agents;

11. The system according to claim 10, wherein the objective function constructed by the solving module is calculated as follows:

The link delay of the power service s after leaving the resource pool z is calculated as follows:

12. The system according to claim 10, wherein the calculation formula of the resource constraint constructed by the solving module is as follows:

13. The system of claim 10, wherein the solving module is specifically used for:

14. The system according to claim 13, wherein the solving module solving the optimization problem model of delay-controlled multi-task deployment and resource allocation with an improved ant colony algorithm based on a dynamic pheromone volatilization coefficient, to obtain the deployment mode of the multiple power tasks across the containers in the resource pool of the IoT agent, comprises:

15. The system according to claim 14, wherein the solving module initializes the pheromone of the ant colony algorithm by the following formula:

16. The system according to claim 14, wherein the solving module calculates the disappearance of the pheromone trajectory as follows:

17. A computer device, comprising: one or more processors; and

a memory for storing one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the delay-control-based IoT agent multi-task scheduling method according to any one of claims 1 to 8 is implemented.

18. A computer-readable storage medium, characterized in that a computer program is stored thereon, and when the computer program is executed, the delay-control-based IoT agent multi-task scheduling method according to any one of claims 1 to 8 is implemented.