
WO2024174426A1 - A task offloading and resource allocation method based on mobile edge computing - Google Patents

A task offloading and resource allocation method based on mobile edge computing

Info

Publication number
WO2024174426A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
offloading
resource allocation
base station
processing
Prior art date
Application number
PCT/CN2023/100968
Other languages
English (en)
French (fr)
Inventor
李云 (Li Yun)
高倩 (Gao Qian)
姚枝秀 (Yao Zhixiu)
夏士超 (Xia Shichao)
梁吉申 (Liang Jishen)
Original Assignee
重庆邮电大学 (Chongqing University of Posts and Telecommunications)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Posts and Telecommunications (重庆邮电大学)
Publication of WO2024174426A1 publication Critical patent/WO2024174426A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W28/00Network traffic management; Network resource management
    • H04W28/02Traffic management, e.g. flow control or congestion control
    • H04W28/08Load balancing or load distribution
    • H04W28/09Management thereof
    • H04W28/0925Management thereof using policies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W28/00Network traffic management; Network resource management
    • H04W28/02Traffic management, e.g. flow control or congestion control
    • H04W28/08Load balancing or load distribution
    • H04W28/09Management thereof
    • H04W28/0958Management thereof based on metrics or performance parameters
    • H04W28/0967Quality of Service [QoS] parameters
    • H04W28/0975Quality of Service [QoS] parameters for reducing delays
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W28/00Network traffic management; Network resource management
    • H04W28/02Traffic management, e.g. flow control or congestion control
    • H04W28/10Flow control between communication endpoints
    • H04W28/14Flow control between communication endpoints using intermediate storage
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Definitions

  • the present invention belongs to the technical field of wireless communications, and in particular relates to a task offloading and resource allocation method based on mobile edge computing.
  • new applications such as online gaming, virtual reality (VR), augmented reality (AR), and telemedicine are typically computation-intensive and delay-sensitive, while mobile devices (MDs) are limited in computing capability, storage, and battery life
  • MEC (Mobile Edge Computing) moves the computing, storage, and other resources of the cloud center down to the network edge and drives users to offload computing tasks to the network edge so as to enjoy a high-performance computing service experience
  • Deep reinforcement learning combines the perception ability of deep learning and the decision-making ability of reinforcement learning, and can effectively handle various decision-making problems in MEC systems.
  • a deep-reinforcement-learning-based resource management method for vehicular multi-access edge computing studies the joint allocation of spectrum, computing, and storage resources in MEC vehicular networks, and uses DDPG and hierarchical learning to achieve rapid resource allocation that meets the quality-of-service requirements of vehicular applications
  • a dynamic computing offloading and resource allocation method based on deep reinforcement learning in a cache-assisted mobile edge computing system studies the dynamic caching, computing offloading and resource allocation problems in cache-assisted MEC systems, and proposes an intelligent dynamic scheduling strategy based on DRL.
  • the above methods all use single-agent deep reinforcement learning algorithms, which require a stationary environment, whereas the actual network environment is often dynamic; a non-stationary environment hinders convergence and also prevents techniques such as experience replay from being used directly
  • the present invention proposes a task offloading and resource allocation method based on mobile edge computing, which includes:
  • the computation-intensive task generated at time slot t (t ∈ T) is defined by a tuple comprising the data size of the task, the maximum tolerable delay of the task, the number of CPU cycles required to process one bit of the task, and the service type required to process the task; the tasks generated by all users under base station BS m are represented as the set of these tuples
  • constructing the service assignment model in step S2 specifically includes: any user has four task processing modes, and different task processing modes have different processing delays; the four task processing modes are: local computing, offloading to the associated BS m for processing, forwarding the offloaded task through the associated base station to another BS for processing, and offloading to the cloud center for processing
  • the task processing delay of a user comprises: the delay when the user performs local computing; the transmission delay of offloading the task to the associated base station plus the delay of the associated base station processing the task; T tr,m (t), the delay of the task being forwarded by the associated base station, plus the delay of another base station processing the task; or T m,c (t), the transmission delay of the task forwarded to the cloud center through the associated base station
  • the task offloading and resource allocation joint optimization problem is expressed as:
  • T represents the system operation time
  • M represents the number of base stations
  • a(t) represents the base station service cache strategy
  • b(t) represents the task offloading strategy
  • ⁇ (t) represents the spectrum resource allocation strategy
  • ⁇ (t) represents the base station computing resource allocation strategy
  • N m represents the number of user devices under the m-th base station
  • the process of using the DSRA algorithm to solve the joint optimization problem of task offloading and resource allocation includes: abstracting the joint optimization problem of task offloading and resource allocation into a partially observable Markov decision process, with the base station acting as an intelligent agent, and constructing the corresponding observation space, action space and reward function; each intelligent agent has an actor network and a critic network embedded in an LSTM network; the actor network generates corresponding actions according to the current local observation state of a single intelligent agent and updates the reward function according to the action, and enters the next state; the critic network estimates the strategies of other intelligent agents based on the global observation state and action; generates experience information based on the current state, next state, action and reward value; samples multiple pieces of experience information to train the actor network and the critic network, updates the network parameters, and obtains the trained actor network and the critic network; and obtains the task offloading and resource allocation strategy based on the actor network training results.
  • r m (t) represents the reward value of base station BS m at time slot t
  • T represents the system running time
  • M represents the number of base stations
  • N m represents the number of user equipment under the mth base station
  • Y m (t) represents the reward when the task processing delay satisfies the delay constraint
  • U m (t) represents the reward when the cache does not exceed the storage capacity limit of the edge server.
  • the present invention aims at the service orchestration and computing network resource allocation problems in the decentralized MEC scenario, and proposes a task offloading and resource allocation method based on mobile edge computing with the goal of minimizing task processing delay; considering the time dependency of user service requests and the coupling relationship between service requests and service cache, an LSTM network is introduced to extract historical status information about service requests, so that users can make better decisions by learning this historical information. Through simulation experiments, this method can achieve lower latency and higher cache hit rate, and realize on-demand resource allocation.
  • FIG1 is a flow chart of a method for task offloading and resource allocation based on mobile edge computing in the present invention
  • FIG2 is a schematic diagram of a mobile edge computing system model in the present invention.
  • FIG3 is a block diagram of the DSRA algorithm in the present invention.
  • FIG4 is a diagram showing the variation of the average delay of the DSRA algorithm and the comparison algorithm in the present invention with the number of training iterations;
  • FIG5 is a diagram showing how the average cache hit rate of the DSRA algorithm of the present invention and the comparison algorithm changes with the number of training iterations.
  • the present invention proposes a task offloading and resource allocation method based on mobile edge computing, as shown in FIG1 , the method includes the following contents:
  • the present invention considers a typical MEC system, which includes M base stations (BSs), each equipped with a MEC server with certain computing and storage resources; there are N m user devices (MDs) under the m-th base station; the system operates in discrete time slots; the computation-intensive task generated by the i-th user under BS m in time slot t is defined by a tuple comprising the data size of the task (in bits), the maximum tolerable delay of the task, the number of CPU cycles required to process one bit of the task, and the service type required to process the task
  • the tasks generated by all users under BS m are expressed as the set of these task tuples
  • S2 Construct service cache model and service assignment model based on the mobile edge computing system model.
  • Building a service cache model specifically includes:
  • a service refers to a specific program or data required to run various types of tasks (such as games or virtual/augmented reality); in any time slot, only a MEC server that has cached the corresponding service can provide computing services for the tasks offloaded by MDs
  • Building a service assignment model specifically includes:
  • if BS m caches the type of service required to process the task, the task can be processed by BS m ; otherwise, the task can only be processed locally on the device or offloaded to another server
  • the task offloading strategy of a user at time slot t consists of four binary indicators: the local task processing strategy, indicating that the task is processed locally; the strategy of offloading the task to the associated base station for processing; the strategy of offloading the task to a neighboring base station for processing; and the strategy of offloading the task to the cloud center for processing; the task offloading strategies of all users under base station BS m in time slot t form a set
  • the local processing time of the task can be expressed as the product of the data size of the task (in bits) and the number of CPU cycles required to process one bit, divided by the local CPU frequency
  • according to the Shannon formula, the transmission rate of the uplink to BS m is determined by B m , the bandwidth of BS m ; the spectrum resource allocation coefficient assigned by BS m to the user in time slot t, which satisfies a normalization constraint; the transmit power of the user; the channel gain between the user and BS m ; and σ 2 (t), the additive white Gaussian noise power in time slot t; the spectrum resource allocation strategy of BS m is the set of these coefficients
  • the transmission delay of the task is the task data size divided by the uplink transmission rate
  • the time for BS m to process the task is determined by f m , the CPU frequency of BS m , and the CPU frequency allocation coefficient assigned by BS m to the user in time slot t, which likewise satisfies a normalization constraint; the computing resource allocation strategy of BS m is the set of these coefficients
  • the processing result of the task is usually much smaller than the uploaded data, and the present invention ignores the delay of returning the result.
  • if the associated base station BS m does not cache service k, but a nearby base station BS n (n ∈ {1,2,...,M} and n ≠ m) caches service k, the task can be forwarded by the associated base station BS m and migrated to BS n for processing; at time slot t, the transmission rate of tasks forwarded from the associated base station to the nearby base station is determined by ω m , the bandwidth of base station m when forwarding the task, P m , the forwarding power of base station m, and G m,n , the channel gain between base stations m and n; the time for the task to be forwarded by the associated base station is the task data size divided by this rate
  • the task can also be forwarded by the associated base station BS m to the cloud center for processing; the cloud center has abundant computing resources and storage resources, and the present invention ignores the task processing time and result transmission time of the cloud center
  • the computation offloading time of the task forwarded to the cloud center through the associated base station BS m is the task data size divided by r m,c (t), the transmission rate at which BS m forwards tasks to the cloud center; this is the delay of offloading the task to the cloud center for processing
  • the task processing delay of a user under base station BS m at time slot t therefore comprises the local-computing delay, the transmission delay of offloading the task to the associated base station plus the delay of the associated base station processing the task, the forwarding delay T tr,m (t) plus the delay of the other base station processing the task, or the transmission delay T m,c (t) of forwarding the task to the cloud center through the associated base station
  • S3 Based on the service cache model and service assignment model, establish task offloading and resource allocation constraints.
  • the storage space of the MEC server is limited, and the storage space occupied by the cached services cannot exceed the storage capacity of the MEC server.
  • denote the storage space of the m-th MEC server MEC m as R m ; then the total storage occupied by cached services must not exceed R m , where l k represents the size of the storage space occupied by the service that processes the task
  • the processing delay of the task cannot exceed the maximum tolerable delay:
  • the total amount of allocated spectrum resources should not be greater than the base station bandwidth:
  • the total amount of allocated computing resources should not be greater than the base station computing resources:
  • constrained by the server's resources such as computing, spectrum, and storage space, and with task offloading and resource allocation coupled with each other, the present invention aims to minimize the long-term processing delay of tasks
  • the joint optimization problem of service cache and computing network resource allocation is established and expressed as:
  • T represents the system operation time
  • M represents the number of base stations
  • β(t) = {β 1 (t), ..., β M (t)} represents the base station computing resource allocation strategy
  • N m represents the number of user devices under the m-th base station
  • the remaining symbols represent, for a user under base station BS m at time slot t: the maximum tolerable delay of the task; the local task processing strategy; the strategy of offloading the task to the associated base station for processing; the strategy of offloading the task to other base stations for processing; and the strategy of offloading the task to the cloud center
  • the present invention designs a distributed intelligent service arrangement and computing network resource allocation algorithm (Distributed Service Arrangement and Resource Allocation Algorithm, DSRA) based on multi-agent deep reinforcement learning, in which the base station is used as an agent to learn task offloading strategies, service caching strategies, and computing network resource allocation strategies.
  • the LSTM network is used to extract historical state information about service requests; by learning this historical information, the agent can better understand the future environment state and make better decisions; as shown in Figure 3, this specifically includes the following contents:
  • the joint optimization problem of task offloading and resource allocation is abstracted into a partially observable Markov decision process (POMDP), with the base station acting as the intelligent agent, and the corresponding observation space, action space, and reward function are constructed; a tuple consisting of the global state space, the set of agent observation spaces, the global action space set, and the reward set is defined to describe this Markov game, where the environment at time slot t is the global state
  • at time slot t, agent m observes its local state, takes strategy π m to select the corresponding action, and obtains the corresponding reward
  • the agent can receive detailed task information from mobile devices within its coverage, including the data size of the task, the maximum tolerable delay, the number of CPU cycles required to process the task per bit, and the required service type.
  • the environment state observed by agent m consists of the task information, transmit powers, and channel gains of the users within its coverage; agent m selects the corresponding action from the action space according to the observed environment state o m (t) and the current strategy π m ; the action of agent m comprises the task offloading decisions, resource allocation coefficients, and service caching indicators
  • the reward function measures the effect of an action taken by an agent in a given state.
  • the agent takes an action in the t-1 time slot, and the corresponding reward will be returned to the agent in the t time slot.
  • the agent will update its strategy to obtain the optimal result. Since the reward causes each agent to reach its optimal strategy, and the strategy directly determines the computing network resource allocation strategy, computing offloading strategy and service caching strategy of the corresponding MEC server, the reward function should be designed according to the original optimization problem.
  • the reward function constructed by the present invention includes three parts: the first part is the reward for the task processing time; the second part, Y m (t), is the reward for the task processing delay satisfying the delay constraint; and the third part, U m (t), is the reward for the cache not exceeding the storage capacity limit of the edge server
  • the optimization goal is to minimize the long-term processing delay of the task and maximize the long-term reward, so the cumulative reward of agent m should be:
  • H(·) is the Heaviside step function
  • λ 1 and λ 2 represent the first and second weight coefficients respectively
  • Y m (t) represents the reward for the task processing delay satisfying the delay constraint
  • U m (t) represents the reward for the cache not exceeding the storage capacity limit of the edge server.
  • Each base station has an actor network and a critic network embedded in an LSTM network. Both the actor network and the critic network include the current network and the target network.
  • the framework of the DSRA algorithm consists of an environment and M agents, namely the base stations; each agent has a centralized training phase and a decentralized execution phase; during training, centralized learning is used to train the critic and actor networks, and training the critic network requires the state information of the other agents; during distributed execution, the actor network only needs local information; that is, during training each agent uses the global states and actions to estimate the strategies of the other agents and adjusts its local strategy according to these estimates to achieve the global optimum
  • the Multi-agent Deep Deterministic Policy Gradient (MADDPG) algorithm can handle the situation where the environment is fully observable, while the real environment state is often partially observable.
  • the present invention adds the long short-term memory network LSTM to the actor network and the critic network.
  • LSTM is a recurrent neural network that can extract historical state information about service requests; by learning this historical information, the agent can better understand future states and make better decisions
  • the actor network generates corresponding actions based on the current local observation state of a single agent; specifically: the actor network obtains the current task offloading and resource allocation strategy based on the local observation state, and can generate corresponding actions from the action space based on the task offloading and resource allocation strategy; the agent enters the next state.
  • the reward function is updated according to the action; experience information is generated from the current state, next state, action, and reward value; multiple pieces of experience information are sampled to train the actor and critic networks, the network parameters are updated, and the trained actor network is obtained
  • the experience replay memory D of agent m contains a set of experience tuples, where o m (t) represents the observed state of agent m in time slot t, a m (t) represents the action taken by agent m in time slot t based on the current observation o m (t), r m (t) represents the reward obtained after agent m takes action a m (t), and o' m (t+1) represents the state of agent m in time slot t+1
  • each agent’s actor network uses the local observed state o m (t), the current historical state information, and its own strategy to select an action
  • each critic network can obtain the observations o m (t) and actions a m (t) of the other agents, so the Q function of agent m can be expressed over the global observations and actions
  • the Q function evaluates the actions of the actor network from a global perspective and guides the actor network to choose a better action.
  • the critic network updates the network parameters by minimizing the loss function, which is defined as follows:
  • the actor network updates the network parameters ⁇ based on the centralized Q function calculated by the critic network and its own observation information, and outputs action a.
  • the actor network parameters ⁇ are updated by maximizing the policy gradient, that is:
  • the parameters of the target network are updated by soft updating, namely:
  • the actions taken by the actor network can be used to obtain the task offloading, service caching and resource allocation strategies within the time period T.
  • Task offloading based on the task offloading and resource allocation strategies can minimize the total processing delay of the task while satisfying various constraints.
  • the present invention is compared with the multi-agent deep deterministic policy gradient algorithm MADDPG, the single-agent deep deterministic policy gradient algorithm SADDPG, and TADPG, an LSTM-based single-agent deep deterministic policy gradient algorithm

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

A task offloading and resource allocation method based on mobile edge computing, relating to the technical field of wireless communications. The method comprises: constructing a mobile edge computing system model; constructing a service caching model and a service assignment model based on the mobile edge computing system model; establishing task offloading and resource allocation constraints based on the service caching model and the service assignment model; constructing a joint task offloading and resource allocation optimization problem with the goal of minimizing task processing delay, subject to the constraints; and solving the joint optimization problem using the DSRA algorithm to obtain the task offloading and resource allocation strategy. The present invention achieves low delay and a high cache hit rate, and realizes on-demand allocation of resources.

Description

A task offloading and resource allocation method based on mobile edge computing
This application claims priority to Chinese patent application No. 202310138344.8, filed with the China National Intellectual Property Administration on February 20, 2023 and entitled "A task offloading and resource allocation method based on mobile edge computing", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention belongs to the technical field of wireless communications, and in particular relates to a task offloading and resource allocation method based on mobile edge computing.
Background Art
With the rapid development of the Internet of Things and the explosive growth of smart mobile devices (MDs), new applications characterized by big data and intelligence keep emerging (e.g., online gaming, virtual reality (VR), augmented reality (AR), and telemedicine), and these applications are typically computation-intensive and delay-sensitive. However, limited by device size, computing capability, storage capacity, and battery power, MDs processing high-energy, high-complexity computing tasks often suffer from insufficient computing power, large delays, and short battery life. Mobile Edge Computing (MEC) has been proposed as an advanced computing paradigm to realize the vision of ultra-high capacity, ultra-low delay, ultra-high bandwidth, and low-energy data processing at the network edge. MEC moves the computing, storage, and other resources of the cloud center down to the network edge and drives users to offload computing tasks to the edge so as to enjoy a high-performance computing service experience.
Deep Reinforcement Learning (DRL) combines the perception capability of deep learning with the decision-making capability of reinforcement learning and can effectively handle various decision problems in MEC systems. For example, in the prior art, a DRL-based resource management method for vehicular multi-access edge computing studies the joint allocation of spectrum, computing, and storage resources in MEC vehicular networks and uses DDPG and hierarchical learning to achieve rapid resource allocation that meets the quality-of-service requirements of vehicular applications. A DRL-based dynamic computation offloading and resource allocation method in a cache-assisted mobile edge computing system studies the dynamic caching, computation offloading, and resource allocation problems in cache-assisted MEC systems and proposes an intelligent DRL-based dynamic scheduling strategy. However, the above methods all adopt single-agent deep reinforcement learning algorithms, which require a stationary environment, whereas real network environments are often dynamic; a non-stationary environment hinders convergence and also prevents techniques such as experience replay from being used directly.
Therefore, in future edge networks with increasingly dense, heterogeneous structures and decentralized resource deployment, designing more dynamic and flexible distributed computation offloading and resource allocation strategies is of great significance. Meanwhile, considering the impact of characteristics such as the partial observability of the network environment and the time dependency of service requests on network service orchestration and computing-network resource allocation, the task offloading and multi-dimensional resource allocation problem in decentralized MEC scenarios is of great research value.
Summary of the Invention
In view of the deficiencies of the prior art, the present invention proposes a task offloading and resource allocation method based on mobile edge computing, the method comprising:
S1: constructing a mobile edge computing system model;
S2: constructing a service caching model and a service assignment model based on the mobile edge computing system model;
S3: establishing task offloading and resource allocation constraints based on the service caching model and the service assignment model;
S4: constructing a joint task offloading and resource allocation optimization problem with the goal of minimizing task processing delay, subject to the task offloading and resource allocation constraints;
S5: solving the joint task offloading and resource allocation optimization problem using the DSRA algorithm to obtain the task offloading and resource allocation strategy.
Preferably, step S1 specifically includes: constructing a mobile edge computing system model containing M base stations (BSs), each equipped with one MEC server; base station BS_m serves N_m user devices (MDs); the system operates in discrete time slots, with the time set defined as T = {0, 1, 2, ...}; the computation-intensive task generated by a user under base station BS_m in time slot t (t ∈ T) is defined by a tuple comprising the data size of the task, the maximum tolerable delay of the task, the number of CPU cycles required to process one bit of the task, and the service type required to process the task; the tasks generated by all users under base station BS_m are represented as the set of these tuples.
Preferably, constructing the service caching model in step S2 specifically includes: defining the set of service types; letting a_k,m(t) ∈ {0,1} denote the caching indicator of service k at BS_m in time slot t, where a_k,m(t) = 1 indicates that service k is cached at BS_m and otherwise BS_m does not cache service k; the service caching strategy set of base station BS_m in time slot t is expressed as a_m(t) = {a_1,m(t), ..., a_k,m(t), ..., a_K,m(t)}.
Preferably, constructing the service assignment model in step S2 specifically includes: any user has four task processing modes, and different task processing modes have different processing delays; the four modes are: local computing, offloading to the associated BS_m for processing, forwarding the offloaded task through the associated base station to another BS for processing, and offloading to the cloud center for processing.
Further, the task processing delay of a user comprises, depending on the chosen mode: the task processing delay when the user performs local computing; the transmission delay of offloading the task to the associated base station plus the delay of the associated base station processing the task; T_tr,m(t), the delay of the task being forwarded by the associated base station, plus the delay of other base stations processing the task; or T_m,c(t), the transmission delay of forwarding the task to the cloud center through the associated base station; the mode is selected by the four binary strategies: local processing, offloading to the associated base station, offloading to other base stations, and offloading to the cloud center.
Preferably, the joint task offloading and resource allocation optimization problem minimizes the long-term task processing delay over the base station service caching strategy a(t), the task offloading strategy b(t), the spectrum resource allocation strategy α(t), and the base station computing resource allocation strategy β(t), where T represents the system operation time, M the number of base stations, N_m the number of user devices under the m-th base station, K the number of service types, l_k the storage space occupied by the service k that processes the task, R_m the storage space of the m-th MEC server, and a_k,m(t) the caching indicator of service k at the m-th base station BS_m in time slot t; the remaining symbols denote, for each user in time slot t, the task processing delay, the maximum tolerable delay of the task, the four task offloading strategies, the spectrum resource allocation coefficient assigned by BS_m, and the CPU frequency allocation coefficient assigned by BS_m.
Preferably, solving the joint task offloading and resource allocation optimization problem with the DSRA algorithm includes: abstracting the joint optimization problem as a partially observable Markov decision process, with base stations acting as agents, and constructing the corresponding observation space, action space, and reward function; each agent has an actor network and a critic network embedded with an LSTM network; the actor network generates the corresponding action according to the agent's current local observation state, updates the reward function according to the action, and enters the next state; the critic network estimates the strategies of the other agents based on the global observation states and actions; experience information is generated from the current state, the next state, the action, and the reward value; multiple pieces of experience are sampled to train the actor and critic networks and update the network parameters, yielding trained actor and critic networks; and the task offloading and resource allocation strategy is obtained from the actor network training results.
Further, the reward function is expressed in terms of r_m(t), the reward value of base station BS_m in time slot t, where T represents the system operation time, M the number of base stations, N_m the number of user devices under the m-th base station, Y_m(t) the reward when the task processing delay satisfies the delay constraint, and U_m(t) the reward when the cache does not exceed the storage capacity limit of the edge server.
The beneficial effects of the present invention are as follows: aiming at the service orchestration and computing-network resource allocation problems in decentralized MEC scenarios, the present invention proposes a task offloading and resource allocation method based on mobile edge computing with the goal of minimizing task processing delay; considering the time dependency of user service requests and the coupling between service requests and service caching, an LSTM network is introduced to extract historical state information about service requests, so that better decisions can be made by learning this historical information. Simulation experiments show that the method achieves lower delay and a higher cache hit rate and realizes on-demand resource allocation.
Brief Description of the Drawings
FIG. 1 is a flowchart of the task offloading and resource allocation method based on mobile edge computing of the present invention;
FIG. 2 is a schematic diagram of the mobile edge computing system model of the present invention;
FIG. 3 is a block diagram of the DSRA algorithm of the present invention;
FIG. 4 shows how the average delay of the DSRA algorithm and the comparison algorithms varies with the number of training iterations;
FIG. 5 shows how the average cache hit rate of the DSRA algorithm and the comparison algorithms varies with the number of training iterations.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The present invention proposes a task offloading and resource allocation method based on mobile edge computing. As shown in FIG. 1, the method includes the following.
S1: Construct a mobile edge computing system model.
As shown in FIG. 2, the present invention considers a typical MEC system containing M base stations (BSs), each configured with a MEC server having certain computing and storage resources. The m-th base station BS_m serves N_m user devices (MDs). The system operates in discrete time slots. For the i-th user under BS_m, the computation-intensive task generated in time slot t is defined by a tuple comprising: the data size of the task, in bits; the maximum tolerable delay of the task; the number of CPU cycles required to process one bit of the task; and the service type required to process the task. The tasks generated by all users under BS_m are then represented as the set of these tuples.
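As an illustration only (the class and field names below are assumptions, since the patent defines these quantities in prose and the original formula images are not preserved), the task tuple and per-base-station state can be sketched as:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """Computation-intensive task generated by a user in one time slot."""
    data_bits: float       # task data size, in bits
    max_delay: float       # maximum tolerable delay, in seconds
    cycles_per_bit: float  # CPU cycles needed to process one bit
    service_type: int      # index k of the service required to process it

@dataclass
class BaseStation:
    """Base station BS_m equipped with a MEC server."""
    cpu_hz: float                 # f_m: CPU frequency of the MEC server
    bandwidth_hz: float           # B_m: uplink bandwidth
    storage: float                # R_m: storage capacity for cached services
    cached_services: set = field(default_factory=set)  # {k : a_{k,m}(t) = 1}
```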
S2: Construct a service caching model and a service assignment model based on the mobile edge computing system model.
Constructing the service caching model specifically includes:
In the present invention, a service refers to the specific program or data required to run a given type of task (e.g., gaming, virtual/augmented reality). In any time slot, only a MEC server that has cached the corresponding service can provide computing services for the tasks offloaded by MDs. Assume there are K different types of services in the network in total. Let a_k,m(t) ∈ {0,1} denote the caching indicator of service k at BS_m in time slot t, where a_k,m(t) = 1 means service k is cached at BS_m, and otherwise BS_m does not cache service k. The service caching strategy set of base station BS_m in time slot t is expressed as a_m(t) = {a_1,m(t), ..., a_k,m(t), ..., a_K,m(t)}.
Constructing the service assignment model specifically includes:
If BS_m has cached the service type required to process a task, the task can be processed by BS_m; otherwise, the task can only be processed locally on the device or offloaded to another server. Any user has four task processing modes, and different modes incur different processing delays. The four modes are: 1) local computing; 2) offloading to the associated BS_m for processing; 3) forwarding the offloaded task through the associated base station to another BS for processing; and 4) offloading to the cloud center for processing. The task offloading strategy of a user in time slot t consists of four binary indicators denoting, respectively, local processing, offloading to the associated base station, offloading to a neighboring base station, and offloading to the cloud center; the task offloading strategies of all users under base station BS_m in time slot t form the set b_m(t).
1) Local computing of the task
When the task is processed locally, the local processing time can be expressed as the product of the task data size (in bits) and the number of CPU cycles required to process one bit, divided by the device's local CPU frequency.
2) Offloading the task to the associated base station for processing
If the associated base station BS_m of the user has cached service k, the user's task can be offloaded over the wireless link to BS_m for processing. According to the Shannon formula, the uplink transmission rate from the user to BS_m is determined by the bandwidth B_m of BS_m, the spectrum resource allocation coefficient that BS_m assigns to the user in time slot t (the coefficients assigned by a base station satisfy a normalization constraint), the user's transmit power, the channel gain between the user and BS_m, and the additive white Gaussian noise power σ²(t) in time slot t; the spectrum resource allocation strategy of BS_m is the set of these coefficients, and the transmission delay of the task is the task data size divided by the uplink rate.
The time for BS_m to process the task is determined by the CPU frequency f_m of BS_m and the CPU frequency allocation coefficient that BS_m assigns to the user in time slot t (these coefficients likewise satisfy a normalization constraint), so the computing resource allocation strategy of BS_m is the set of these coefficients. Since the processing result of a task is usually much smaller than the uploaded data, the present invention ignores the delay of returning the result.
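A sketch of the mode-1 and mode-2 delay expressions just described, assuming the usual Shannon-rate form (the function and variable names are assumptions; the original formula images are not reproduced in this text):

```python
import math

def local_delay(task, local_cpu_hz):
    # local processing time: data size (bits) * cycles per bit / local CPU frequency
    return task.data_bits * task.cycles_per_bit / local_cpu_hz

def uplink_rate(alpha, bandwidth_hz, tx_power, channel_gain, noise_power):
    # Shannon formula: r = alpha * B_m * log2(1 + p * g / sigma^2),
    # where alpha is the spectrum allocation coefficient assigned to this user
    return alpha * bandwidth_hz * math.log2(1.0 + tx_power * channel_gain / noise_power)

def offload_to_bs_delay(task, alpha, beta, bs, tx_power, channel_gain, noise_power):
    # transmission delay plus processing delay at the associated base station;
    # beta is the CPU-frequency allocation coefficient; result return is ignored
    t_tx = task.data_bits / uplink_rate(alpha, bs.bandwidth_hz, tx_power,
                                        channel_gain, noise_power)
    t_proc = task.data_bits * task.cycles_per_bit / (beta * bs.cpu_hz)
    return t_tx + t_proc
```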
From the above analysis, the delay of offloading the user's task to the associated base station BS_m for processing is the sum of the transmission delay and the processing delay at BS_m.
3) Migrating the task to a nearby base station for processing
If the associated base station BS_m of the user has not cached service k but a nearby base station BS_n (n ∈ {1,2,...,M} and n ≠ m) has cached service k, the user's task can be forwarded by the associated base station BS_m and migrated to BS_n for processing. In time slot t, the rate at which the task is forwarded from the associated base station to the nearby base station is determined by ω_m, the bandwidth of base station m when forwarding tasks, P_m, the forwarding power of base station m, and G_m,n, the channel gain between base stations m and n; the time for the task to be forwarded by the associated base station is the task data size divided by this rate.
From the above analysis, the computation offloading delay of forwarding the task to BS_n for processing is the sum of the forwarding time and the time for BS_n to process the task.
4) Offloading the task to the cloud center for processing
If the associated base station BS_m has not cached the service required to process the task, the task can also be forwarded by BS_m to the cloud center for processing. The cloud center has abundant computing and storage resources, so the present invention ignores the cloud center's task processing time and the time to return results.
The computation offloading time of forwarding the user's task to the cloud center through the associated base station BS_m is the task data size divided by r_m,c(t), the rate at which BS_m forwards tasks to the cloud center; this is the delay of offloading the task to the cloud center for processing.
In summary, in time slot t, the task processing delay of a user is determined by the chosen processing mode:
it equals the local-computing delay if the task is processed locally; the transmission delay of offloading the task to the associated base station plus the associated base station's processing delay if the task is offloaded to BS_m; the forwarding delay T_tr,m(t) plus the other base station's processing delay if the task is migrated to a nearby base station; or the transmission delay T_m,c(t) of forwarding the task to the cloud center through the associated base station if the task is offloaded to the cloud.
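Putting the four modes together, a sketch of the per-task delay under a one-hot offloading decision x = (x_loc, x_bs, x_nb, x_cloud) — this notation is an assumption, since the original expression is elided:

```python
def task_delay(x, t_loc, t_bs, t_fwd, t_nb_proc, t_cloud_tx):
    """Total processing delay for one task given a one-hot offloading decision.

    x = (x_loc, x_bs, x_nb, x_cloud), exactly one entry equal to 1.
    t_fwd is T_tr,m(t); t_cloud_tx is T_m,c(t); cloud processing time is ignored.
    """
    x_loc, x_bs, x_nb, x_cloud = x
    assert x_loc + x_bs + x_nb + x_cloud == 1
    return (x_loc * t_loc
            + x_bs * t_bs
            + x_nb * (t_fwd + t_nb_proc)
            + x_cloud * t_cloud_tx)
```

For example, x = (0, 1, 0, 0) selects offloading to the associated base station, so only the transmission-plus-processing term contributes.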
S3: Establish task offloading and resource allocation constraints based on the service caching model and the service assignment model.
The storage space of a MEC server is limited, and the storage space occupied by cached services cannot exceed the server's storage capacity. Denote the storage capacity of the m-th MEC server MEC_m as R_m; then the total storage occupied by cached services, summed as l_k · a_k,m(t) over all services k, must not exceed R_m, where l_k is the storage space occupied by the service that processes the task.
In time slot t, each task must be processed in exactly one of the four modes, so a user's four binary offloading indicators sum to one.
The processing delay of a task cannot exceed its maximum tolerable delay.
The total allocated spectrum resources must not exceed the base station bandwidth.
The total allocated computing resources must not exceed the base station's computing resources.
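A sketch of the corresponding feasibility checks for one base station in one time slot (names and the normalized-share representation of spectrum/CPU allocations are assumptions):

```python
def feasible(bs, alphas, betas, service_sizes, cache_flags, delays, max_delays):
    """Check the S3 constraints for one base station in one time slot.

    alphas / betas: spectrum and CPU allocation coefficients per user.
    service_sizes[k] = l_k; cache_flags[k] = a_{k,m}(t).
    delays / max_delays: achieved and maximum tolerable delay per task.
    """
    storage_ok  = sum(l for l, a in zip(service_sizes, cache_flags) if a) <= bs.storage
    delay_ok    = all(d <= d_max for d, d_max in zip(delays, max_delays))
    spectrum_ok = sum(alphas) <= 1.0  # allocated bandwidth shares sum to at most 1
    compute_ok  = sum(betas) <= 1.0   # allocated CPU shares sum to at most 1
    return storage_ok and delay_ok and spectrum_ok and compute_ok
```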
S4: Construct the joint task offloading and resource allocation optimization problem with the goal of minimizing task processing delay, subject to the task offloading and resource allocation constraints.
Constrained by server resources (e.g., computing, spectrum, and storage space), and with task offloading and resource allocation coupled with each other, the present invention aims to minimize the long-term processing delay of tasks and establishes the joint optimization problem of service caching and computing-network resource allocation, which minimizes the long-term task processing delay subject to the constraints of S3.
In this problem, T represents the system operation time; M the number of base stations; a(t) = {a_1(t), ..., a_M(t)} the base station service caching strategy; b(t) = {b_1(t), ..., b_M(t)} the task offloading strategy; α(t) = {α_1(t), ..., α_M(t)} the spectrum resource allocation strategy; β(t) = {β_1(t), ..., β_M(t)} the base station computing resource allocation strategy; N_m the number of user devices under the m-th base station; K the number of service types; l_k the storage space occupied by service k; R_m the storage space of the m-th MEC server; and a_k,m(t) the caching indicator of service k at the m-th base station BS_m in time slot t. The remaining symbols denote, for each user in time slot t, the task processing delay, the maximum tolerable delay, the four binary offloading strategies, the spectrum resource allocation coefficient assigned by BS_m, and the CPU frequency allocation coefficient assigned by BS_m.
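Since the formula images are not preserved in this text, the following LaTeX rendering is a reconstruction from the prose above; the symbols x^{loc}, x^{bs}, x^{nb}, x^{cl} for the four offloading indicators and T_{m,i}(t), τ^max_{m,i}(t) for the per-user delay and delay bound are assumptions:

```latex
\begin{aligned}
\min_{a(t),\,b(t),\,\alpha(t),\,\beta(t)}\quad
  & \sum_{t=1}^{T}\sum_{m=1}^{M}\sum_{i=1}^{N_m} T_{m,i}(t) \\
\text{s.t.}\quad
  & x^{loc}_{m,i}(t)+x^{bs}_{m,i}(t)+x^{nb}_{m,i}(t)+x^{cl}_{m,i}(t)=1, \\
  & T_{m,i}(t) \le \tau^{\max}_{m,i}(t), \qquad
    \sum_{k=1}^{K} l_k\, a_{k,m}(t) \le R_m, \\
  & \sum_{i=1}^{N_m} \alpha_{m,i}(t) \le 1, \qquad
    \sum_{i=1}^{N_m} \beta_{m,i}(t) \le 1, \qquad
    a_{k,m}(t)\in\{0,1\}.
\end{aligned}
```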
S5: Solve the joint task offloading and resource allocation optimization problem using the DSRA algorithm to obtain the task offloading and resource allocation strategy.
In edge network environments, decentralized deployment of computing-network resources, highly dynamic network conditions, and increasingly dense network structures mean that centralized management cannot cope well with highly dynamic, decentralized MEC environments; more dynamic and flexible distributed computation offloading and resource allocation strategies are needed. Multi-agent deep reinforcement learning, as a distributed DRL approach, is well suited to problem solving in decentralized MEC environments. In view of this, the present invention designs a Distributed Service Arrangement and Resource Allocation (DSRA) algorithm based on multi-agent deep reinforcement learning, in which base stations act as agents that learn the task offloading strategy, the service caching strategy, and the computing-network resource allocation strategy. Meanwhile, considering the time dependency of user service requests and the coupling between service requests and service caching, an LSTM network is used to extract historical state information about service requests; by learning this historical information, an agent can better understand future environment states and make better decisions. As shown in FIG. 3, the algorithm specifically includes the following.
The joint task offloading and resource allocation optimization problem is abstracted as a Partially Observable Markov Decision Process (POMDP), with base stations acting as agents, and the corresponding observation space, action space, and reward function are constructed. A tuple consisting of the global state space, the set of agent observation spaces, the global action space set, and the reward set is defined to describe this Markov game, where the environment in time slot t is the global state. In time slot t, agent m takes its policy π_m based on its local observation, selects the corresponding action, and thereby obtains the corresponding reward.
1) Environment state
In time slot t, an agent can receive detailed task information from the mobile devices within its coverage, including the task data size, the maximum tolerable delay, the number of CPU cycles required to process one bit, and the required service type. The environment state can be defined as s(t) = {d_1, d_2, ..., d_M, P_1, P_2, ..., P_M, f_1, f_2, ..., f_M, B_1, B_2, ..., B_M, G_1, G_2, ..., G_M}, where d_m denotes the tasks generated by all users under BS_m, f_m is the CPU frequency of BS_m, P_m is the set of transmit powers of all users under BS_m, and G_m is the set of channel gains between all users under BS_m and BS_m. In time slot t, the environment state o_m(t) observed by agent m consists of the corresponding local components of s(t).
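A sketch of how agent m's local observation o_m(t) could be assembled from the quantities named above (the vectorization details are assumptions):

```python
import numpy as np

def build_observation(tasks, tx_powers, channel_gains, bs):
    """Flatten agent m's local view into a fixed-length vector o_m(t)."""
    task_feats = [(t.data_bits, t.max_delay, t.cycles_per_bit, t.service_type)
                  for t in tasks]
    return np.concatenate([
        np.asarray(task_feats, dtype=np.float32).ravel(),  # d_m: per-user task info
        np.asarray(tx_powers, dtype=np.float32),           # P_m: user transmit powers
        np.asarray(channel_gains, dtype=np.float32),       # G_m: user-to-BS gains
        np.asarray([bs.cpu_hz, bs.bandwidth_hz], dtype=np.float32),  # f_m, B_m
    ])
```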
2) Action space
Agent m selects the corresponding action from the action space according to the observed environment state o_m(t) and the current policy π_m. In time slot t, the action of agent m comprises the task offloading decisions of its users, the spectrum and computing resource allocation coefficients, and the service caching indicators a_1,m(t), a_2,m(t), ..., a_K,m(t).
The binary variables a_k,m(t) are relaxed to real-valued variables, with a'_k,m(t) > 0.5 indicating that service k is cached at BS_m and otherwise not cached. For the offloading decision, the task selects the offloading mode corresponding to the largest of the relaxed decision values for computation offloading. From the definition of the action space and the value ranges of the elements in a_m(t), the action space is a continuous set.
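A sketch of this decoding step — thresholding the relaxed caching variables at 0.5 and picking the offloading mode with the largest relaxed value; the flat array layout and the renormalization of the allocation shares are assumptions:

```python
import numpy as np

def decode_action(action, n_users, n_services):
    """Map a continuous actor output to discrete caching/offloading decisions
    plus resource-allocation coefficients (array layout is an assumption)."""
    i = 0
    offload = action[i:i + 4 * n_users].reshape(n_users, 4); i += 4 * n_users
    alphas = action[i:i + n_users]; i += n_users          # spectrum shares
    betas = action[i:i + n_users]; i += n_users           # CPU shares
    cache_relaxed = action[i:i + n_services]
    offload_mode = offload.argmax(axis=1)                 # largest relaxed value wins
    cache = cache_relaxed > 0.5                           # a'_{k,m}(t) > 0.5 => cache k
    # renormalize shares so each base station's allocations sum to at most 1
    alphas = np.clip(alphas, 1e-6, None)
    alphas = alphas / max(alphas.sum(), 1.0)
    betas = np.clip(betas, 1e-6, None)
    betas = betas / max(betas.sum(), 1.0)
    return offload_mode, alphas, betas, cache
```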
3) Reward function
The reward function measures the effect of an action taken by the agent in a given state. During training, the agent takes an action in time slot t−1, and the corresponding reward is returned to the agent in time slot t. Based on the reward obtained, the agent updates its policy to obtain the optimal result. Since the rewards drive each agent toward its optimal policy, and the policy directly determines the computing-network resource allocation, computation offloading, and service caching strategies of the corresponding MEC server, the reward function should be designed according to the original optimization problem. The reward function constructed by the present invention consists of three parts: the first part is the reward for task processing time; the second part, Y_m(t), is the reward for the task processing delay satisfying the delay constraint; and the third part, U_m(t), is the reward for the cache not exceeding the storage capacity limit of the edge server. The optimization goal is to minimize the long-term task processing delay and maximize the long-term return, so the cumulative reward of agent m follows accordingly,
where H(·) is the Heaviside step function, λ_1 and λ_2 are the first and second weight coefficients, Y_m(t) is the reward for the task processing delay satisfying the delay constraint, and U_m(t) is the reward for the cache not exceeding the storage capacity limit of the edge server.
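A sketch of the three-part reward with the Heaviside terms written out; the signs and the exact combination are assumptions, since the elided formula is not preserved:

```python
def heaviside(x):
    return 1.0 if x >= 0 else 0.0

def reward(delays, max_delays, cache_used, storage_cap, lam1, lam2):
    """r_m(t): delay term plus the two constraint-satisfaction bonuses."""
    delay_part = -sum(delays)                              # minimize processing delay
    y = lam1 * sum(heaviside(d_max - d)                    # Y_m(t): delay constraint met
                   for d, d_max in zip(delays, max_delays))
    u = lam2 * heaviside(storage_cap - cache_used)         # U_m(t): cache fits in storage
    return delay_part + y + u
```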
Each base station has an actor network and a critic network, each embedded with an LSTM network; both the actor and critic networks include a current network and a target network. The DSRA framework consists of the environment and M agents, namely the base stations; each agent has a centralized training phase and a decentralized execution phase. During training, centralized learning is used to train the critic and actor networks, and training the critic network requires the state information of the other agents. During distributed execution, the actor network only needs local information. That is, during training each agent uses the global states and actions to estimate the policies of the other agents and adjusts its local policy according to these estimates so as to achieve the global optimum. The Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm handles fully observable environments well, whereas real environment states are often only partially observable. To cope with the partial observability of the environment and the time dependency of service requests, the present invention adds a long short-term memory (LSTM) network to the actor and critic networks. LSTM is a recurrent neural network that can extract historical state information about service requests; by learning this historical information, an agent can better understand future states and make better decisions.
The actor network generates the corresponding action according to the current local observation state of a single agent. Specifically, the actor network obtains the current task offloading and resource allocation strategy from the local observation state and generates the corresponding action from the action space accordingly; the agent then enters the next state.
The reward function is updated according to the action; experience information is generated from the current state, the next state, the action, and the reward value; multiple pieces of experience are sampled to train the actor and critic networks and update the network parameters, yielding the trained actor network. Specifically, during training, the historical service-request information of the actor and critic networks before and after taking an action is recorded, and experiences from the experience replay memory D are used to iteratively update the DSRA algorithm. The experience replay memory D of agent m contains a set of experience tuples, where o_m(t) is the observed state of agent m in time slot t, a_m(t) is the action taken by agent m in time slot t based on the current observation o_m(t), r_m(t) is the reward obtained after agent m takes action a_m(t), o'_m(t+1) is the state of agent m in time slot t+1, and the remaining elements are the historical service-request information of the actor and critic networks in time slots t and t+1.
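A minimal replay memory holding the tuple just described, including the actor/critic LSTM history entries before and after the transition (the container structure is an assumption):

```python
import random
from collections import deque

class ReplayMemory:
    """Experience replay memory D for one agent."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, o, a, r, o_next, h_actor, h_critic, h_actor_next, h_critic_next):
        # (o_m(t), a_m(t), r_m(t), o'_m(t+1)) plus LSTM histories at t and t+1
        self.buffer.append((o, a, r, o_next, h_actor, h_critic,
                            h_actor_next, h_critic_next))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)
```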
In the decentralized execution phase, in time slot t, each agent's actor network selects an action according to the local observation o_m(t), the current historical state information, and its own policy.
In the centralized training phase, each critic network can obtain the observations o_m(t) and actions a_m(t) of the other agents, so the Q function of agent m is expressed over the global observations and actions.
The Q function evaluates the actor network's actions from a global perspective and guides the actor network to choose better actions. During training, the critic network updates its parameters by minimizing a loss function defined over the temporal-difference error, where γ is the discount factor. Meanwhile, the actor network updates its parameters θ based on the centralized Q function computed by the critic network and its own observation information, and outputs the action a; the actor network parameters θ are updated by maximizing the policy gradient. The parameters of the target networks are updated by soft updates.
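The following PyTorch-style sketch of these update rules is illustrative only; the network classes, their call signatures, and the hyperparameters (tau, gamma) are assumptions rather than the patent's specification:

```python
import torch

def soft_update(target_net, net, tau=0.01):
    # theta_target <- tau * theta + (1 - tau) * theta_target
    for tp, p in zip(target_net.parameters(), net.parameters()):
        tp.data.mul_(1.0 - tau).add_(tau * p.data)

def critic_step(critic, critic_target, actor_targets, batch, gamma, optim):
    # centralized critic: minimize the TD error against the target networks
    obs, acts, rew, obs_next = batch  # lists of per-agent tensors
    with torch.no_grad():
        acts_next = torch.cat([a(o) for a, o in zip(actor_targets, obs_next)], dim=-1)
        y = rew + gamma * critic_target(torch.cat(obs_next, -1), acts_next)
    loss = torch.nn.functional.mse_loss(
        critic(torch.cat(obs, -1), torch.cat(acts, -1)), y)
    optim.zero_grad(); loss.backward(); optim.step()

def actor_step(actor, critic, obs, own_idx, acts, optim):
    # deterministic policy gradient: ascend the centralized Q w.r.t. own action
    acts = list(acts)
    acts[own_idx] = actor(obs[own_idx])  # replace own action, others fixed
    loss = -critic(torch.cat(obs, -1), torch.cat(acts, -1)).mean()
    optim.zero_grad(); loss.backward(); optim.step()
```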
After the actor network is trained, the actions output by the actor network yield the task offloading, service caching, and resource allocation strategies over the time period T. Performing task offloading according to the task offloading and resource allocation strategies minimizes the total task processing delay while satisfying the various constraints.
Evaluation of the present invention:
The present invention is compared with the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, the Single-Agent Deep Deterministic Policy Gradient (SADDPG) algorithm, and TADPG, an LSTM-based single-agent deep deterministic policy gradient algorithm. As shown in FIG. 4, as the number of training episodes increases, the average task processing delay keeps decreasing, gradually stabilizes, and finally converges; the DSRA algorithm achieves the lowest delay, indicating that DSRA makes better offloading and computing-network resource allocation decisions, obtains lower delay, and realizes on-demand resource allocation, which demonstrates the effectiveness of the algorithm. As shown in FIG. 5, as the number of episodes increases, the cache hit rate curve rises and finally converges, and DSRA achieves the highest cache hit rate, again demonstrating the effectiveness of the algorithm.
The above embodiments further describe in detail the objectives, technical solutions, and advantages of the present invention. It should be understood that they are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims (8)

  1. A task offloading and resource allocation method based on mobile edge computing, characterized in that it comprises:
    S1: constructing a mobile edge computing system model;
    S2: constructing a service caching model and a service assignment model based on the mobile edge computing system model;
    S3: establishing task offloading and resource allocation constraints based on the service caching model and the service assignment model;
    S4: constructing a joint task offloading and resource allocation optimization problem with the goal of minimizing task processing delay, subject to the task offloading and resource allocation constraints;
    S5: solving the joint task offloading and resource allocation optimization problem using the DSRA algorithm to obtain the task offloading and resource allocation strategy.
  2. The task offloading and resource allocation method based on mobile edge computing according to claim 1, characterized in that step S1 specifically comprises: constructing a mobile edge computing system model containing M base stations (BSs), each equipped with one MEC server; base station BS_m serves N_m user devices (MDs); the system operates in discrete time slots, with the time set defined as T = {0, 1, 2, ...}; the computation-intensive task generated by a user under base station BS_m in time slot t (t ∈ T) is defined by a tuple comprising the data size of the task, the maximum tolerable delay of the task, the number of CPU cycles required to process one bit of the task, and the service type required to process the task; the tasks generated by all users under base station BS_m are represented as the set of these tuples.
  3. The task offloading and resource allocation method based on mobile edge computing according to claim 1, characterized in that constructing the service caching model in step S2 specifically comprises: defining the set of service types; letting a_k,m(t) ∈ {0,1} denote the caching indicator of service k at BS_m in time slot t, where a_k,m(t) = 1 indicates that service k is cached at BS_m and otherwise BS_m does not cache service k; the service caching strategy set of base station BS_m in time slot t is expressed as a_m(t) = {a_1,m(t), ..., a_k,m(t), ..., a_K,m(t)}.
  4. The task offloading and resource allocation method based on mobile edge computing according to claim 1, characterized in that constructing the service assignment model in step S2 specifically comprises: any user has four task processing modes, and different task processing modes have different processing delays; the four modes are: local computing, offloading to the associated BS_m for processing, forwarding the offloaded task through the associated base station to another BS for processing, and offloading to the cloud center for processing.
  5. The task offloading and resource allocation method based on mobile edge computing according to claim 4, characterized in that the task processing delay of a user comprises, depending on the chosen mode: the task processing delay when the user performs local computing; the transmission delay of offloading the task to the associated base station plus the delay of the associated base station processing the task; T_tr,m(t), the delay of the task being forwarded by the associated base station, plus the delay of other base stations processing the task; or T_m,c(t), the transmission delay of forwarding the task to the cloud center through the associated base station; the mode is selected by the four binary strategies: local processing, offloading to the associated base station, offloading to other base stations, and offloading to the cloud center.
  6. The task offloading and resource allocation method based on mobile edge computing according to claim 1, characterized in that the joint task offloading and resource allocation optimization problem minimizes the long-term task processing delay over the base station service caching strategy a(t), the task offloading strategy b(t), the spectrum resource allocation strategy α(t), and the base station computing resource allocation strategy β(t), wherein T represents the system operation time, M the number of base stations, N_m the number of user devices under the m-th base station, K the number of service types, l_k the storage space occupied by the service k that processes the task, R_m the storage space of the m-th MEC server, and a_k,m(t) the caching indicator of service k at the m-th base station BS_m in time slot t; the remaining symbols denote, for each user in time slot t, the task processing delay, the maximum tolerable delay of the task, the four task offloading strategies, the spectrum resource allocation coefficient assigned by BS_m, and the CPU frequency allocation coefficient assigned by BS_m.
  7. The task offloading and resource allocation method based on mobile edge computing according to claim 1, characterized in that solving the joint task offloading and resource allocation optimization problem with the DSRA algorithm comprises: abstracting the joint optimization problem as a partially observable Markov decision process, with base stations acting as agents, and constructing the corresponding observation space, action space, and reward function; each agent has an actor network and a critic network embedded with an LSTM network; the actor network generates the corresponding action according to the agent's current local observation state, updates the reward function according to the action, and enters the next state; the critic network estimates the strategies of the other agents based on the global observation states and actions; experience information is generated from the current state, the next state, the action, and the reward value; multiple pieces of experience are sampled to train the actor and critic networks and update the network parameters, yielding trained actor and critic networks; and the task offloading and resource allocation strategy is obtained from the actor network training results.
  8. The task offloading and resource allocation method based on mobile edge computing according to claim 7, characterized in that the reward function is expressed in terms of r_m(t), the reward value of base station BS_m in time slot t, wherein T represents the system operation time, M the number of base stations, N_m the number of user devices under the m-th base station, Y_m(t) the reward when the task processing delay satisfies the delay constraint, and U_m(t) the reward when the cache does not exceed the storage capacity limit of the edge server; the remaining symbol denotes the task processing delay of a user under base station BS_m in time slot t.
PCT/CN2023/100968 2023-02-20 2023-06-19 A task offloading and resource allocation method based on mobile edge computing WO2024174426A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202310138344.8 2023-02-20
CN202310138344.8A CN116137724A (zh) 2023-02-20 2023-02-20 A task offloading and resource allocation method based on mobile edge computing

Publications (1)

Publication Number Publication Date
WO2024174426A1 true WO2024174426A1 (zh) 2024-08-29

Family

ID=86333467

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/100968 WO2024174426A1 (zh) 2023-02-20 2023-06-19 A task offloading and resource allocation method based on mobile edge computing

Country Status (2)

Country Link
CN (1) CN116137724A (zh)
WO (1) WO2024174426A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116137724A (zh) * 2023-02-20 2023-05-19 重庆邮电大学 A task offloading and resource allocation method based on mobile edge computing
CN116743584B (zh) * 2023-08-09 2023-10-27 山东科技大学 A dynamic RAN slicing method based on information awareness and joint computation caching
CN118574161A (zh) * 2024-06-19 2024-08-30 中国传媒大学 A GAT-DDPG-based task offloading strategy for UAV-assisted Internet of Vehicles


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111132191A (zh) * 2019-12-12 2020-05-08 重庆邮电大学 Joint task offloading, caching, and resource allocation method for mobile edge computing servers
US20220032933A1 (en) * 2020-07-31 2022-02-03 Toyota Motor Engineering & Manufacturing North America, Inc. Systems and methods for generating a task offloading strategy for a vehicular edge-computing environment
CN114760311A (zh) * 2022-04-22 2022-07-15 南京邮电大学 An optimized service caching and computation offloading method for mobile edge network systems
CN115297013A (zh) * 2022-08-04 2022-11-04 重庆大学 A joint task offloading and service caching optimization method based on edge collaboration
CN116137724A (zh) * 2023-02-20 2023-05-19 重庆邮电大学 A task offloading and resource allocation method based on mobile edge computing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAO ZHIXIU; LI YUN; XIA SHICHAO; WU GUANGFU: "Attention Cooperative Task Offloading and Service Caching in Edge Computing", GLOBECOM 2022 - 2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE, IEEE, 4 December 2022 (2022-12-04), pages 5189 - 5194, XP034268202, DOI: 10.1109/GLOBECOM48099.2022.10001202 *

Also Published As

Publication number Publication date
CN116137724A (zh) 2023-05-19

Similar Documents

Publication Publication Date Title
WO2024174426A1 (zh) A task offloading and resource allocation method based on mobile edge computing
Lin et al. Resource management for pervasive-edge-computing-assisted wireless VR streaming in industrial Internet of Things
CN111031102A A cacheable task migration method in a multi-user, multi-task mobile edge computing system
CN114340016B A power grid edge computing offloading allocation method and system
CN112689296B An edge computing and caching method and system in heterogeneous IoT networks
Qin et al. Collaborative edge computing and caching in vehicular networks
CN113115368A Base station cache replacement method, system, and storage medium based on deep reinforcement learning
CN116260871A An independent task offloading method based on local and edge collaborative caching
CN115344395B Edge cache scheduling and task offloading method and system for heterogeneous task generalization
Ai et al. Dynamic offloading strategy for delay-sensitive task in mobile-edge computing networks
CN114626298A A status update method for efficient caching and task offloading in UAV-assisted Internet of Vehicles
CN116367231A A joint optimization method for edge-computing Internet-of-Vehicles resource management based on the DDPG algorithm
CN116233926A A joint task offloading and service caching optimization method based on mobile edge computing
CN116233927A A load-aware, energy-saving computation offloading optimization method in mobile edge computing
CN116489712B A mobile edge computing task offloading method based on deep reinforcement learning
CN116566838A An Internet-of-Vehicles task offloading and content caching method with blockchain-edge computing collaboration
CN114980039A A random task scheduling and resource allocation method in MEC systems with D2D cooperative computing
Zhang et al. A deep reinforcement learning approach for online computation offloading in mobile edge computing
Ansere et al. Quantum deep reinforcement learning for dynamic resource allocation in mobile edge computing-based IoT systems
Lakew et al. Adaptive partial offloading and resource harmonization in wireless edge computing-assisted IoE networks
Zhang et al. Computation offloading and resource allocation in F-RANs: A federated deep reinforcement learning approach
CN116321293A An edge computing offloading and resource allocation method based on multi-agent reinforcement learning
Qin et al. Joint Optimization of Base Station Clustering and Service Caching in User-Centric MEC
Li et al. DQN-based collaborative computation offloading for edge load balancing
CN117858109A A user association, task offloading, and resource allocation optimization method based on digital twins

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23923594

Country of ref document: EP

Kind code of ref document: A1