CN115499849B - Wireless access point and reconfigurable intelligent surface cooperation method - Google Patents
- Publication number
- CN115499849B (application number CN202211429707.5A)
- Authority
- CN
- China
- Prior art keywords
- network
- things
- access point
- intelligent
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/18—Network planning tools
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W84/00—Network topologies
- H04W84/18—Self-organising networks, e.g. ad-hoc networks or sensor networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The application relates to a method for cooperation between a wireless access point and a reconfigurable intelligent surface, comprising the following steps: building a device communication architecture based on the power Internet of Things; designing, for that architecture, a corresponding access point and reconfigurable intelligent surface cooperation method that aims to maximize system energy efficiency while meeting the quality-of-service requirements of the massive number of devices in the power Internet of Things with respect to data transmission rate and reliability; and having each access point cooperate with the reconfigurable intelligent surfaces according to the trained model so as to meet the access requirements of those devices. The method models the very large wireless communication network as a graph and reduces its dimensionality with a graph embedding method to obtain an efficient graph representation, which effectively reduces model training complexity and enables highly customized communication.
Description
Technical Field
The application belongs to the technical field of the power Internet of Things, and in particular relates to a method for cooperation between a wireless access point and a reconfigurable intelligent surface.
Background
In recent years, with the rapid development of the power Internet of Things, a massive number of devices has been deployed at its network edge. Because the power network is complex and very large, managing and controlling it by manpower alone is difficult and costly, so new information and communication technologies need to be introduced to improve the operating performance and the management and control efficiency of the power system. Intelligent management and control of the power Internet of Things requires that the configuration state and performance of the power network be sensed and measured in real time. The power Internet of Things must therefore support the access of edge Internet of Things devices and the transmission of massive data in order to guarantee efficient and reliable operation. With the continuous development of information and communication technology, the new generation of mobile communication technology can provide high-speed and stable service when a large number of power devices access the power network. However, because network edge devices are heterogeneous, highly customized and intelligent communication, that is, dynamic configuration of network resources to support ultra-dense connections, cannot currently be achieved.
The reconfigurable intelligent surface is a new technology that can intelligently reconfigure the wireless propagation environment by integrating a large number of low-cost passive reflective elements on a planar surface, thereby significantly improving the performance of wireless communication networks. Through highly controllable and intelligent signal reflection, it makes a high degree of customization possible, provides a new degree of freedom for further improving wireless link performance, and paves the way for an intelligent, programmable wireless environment. With reconfigurable intelligent surface technology, cooperation between the wireless access point and the reconfigurable intelligent surface allows hybrid spatial beams to be configured flexibly, data transmission to be enhanced on demand, interference to be suppressed flexibly, and the spatial and power domains to be multiplexed efficiently, so that customized and intelligent communication can be carried out effectively. Therefore, in a power Internet of Things scenario with a heterogeneous power grid and a massive number of devices, an effective technique for cooperation between wireless access points and reconfigurable intelligent surfaces is urgently needed to achieve highly customized and intelligent communication.
Disclosure of Invention
The embodiments of the application aim to provide a method for cooperation between a wireless access point and a reconfigurable intelligent surface in which the wireless communication network is modeled as a graph and an embedded representation of the network is obtained with a graph embedding method. The graph embedding method yields an effective low-dimensional representation of the graph, which reduces model training complexity and enables highly customized communication.
In order to achieve the above purpose, the present application provides the following technical solutions:
An embodiment of the application provides a method for cooperation between a wireless access point and a reconfigurable intelligent surface, characterized by comprising the following steps:
Step 1: build a device communication architecture based on the power Internet of Things. The network architecture comprises M pre-installed access points and J reconfigurable intelligent surfaces. The cooperative relationship of each access point with its adjacent access points and reconfigurable intelligent surfaces is modeled as interaction between agents, that is, as the edges of the graph neural network input. The input topology of a message passing graph neural network is constructed in this way, and an embedded representation of the topology is obtained with the message passing graph neural network, so as to provide service to power Internet of Things terminals;
Step 2: according to the established device communication architecture based on the power Internet of Things, design a corresponding access point and reconfigurable intelligent surface cooperation method that aims to maximize system energy efficiency while meeting the quality-of-service requirements of the massive number of devices in the power Internet of Things with respect to data transmission rate and reliability;
Step 3: based on the access point and reconfigurable intelligent surface cooperation method provided in step 2, each access point cooperates with the reconfigurable intelligent surfaces according to the trained model so as to meet the access requirements of the massive number of devices in the power Internet of Things.
Step 1 is specifically as follows:
Step 1: in the device communication architecture of the power Internet of Things, the pre-installed access points and the reconfigurable intelligent surfaces in the network are each denoted by a set (the set symbols are not preserved in the source text). The M wireless access points and the J reconfigurable intelligent surfaces are represented as distinct agent nodes, that is, as the nodes of the graph neural network input. The access information of the power Internet of Things devices and the hybrid spatial beam configuration between the wireless access points and the reconfigurable intelligent surfaces are taken as features of the graph topology and input into a message passing graph neural network, whose message passing mechanism yields a stable graph embedding of the node features.
Step 2 is specifically as follows:
Step 2.1: to dynamically maximize the system energy efficiency of the cooperation between the wireless access points and the reconfigurable intelligent surfaces, the objective function of the system can be expressed as: [equation not preserved in the source text]
where the terms denote the network energy efficiency of time slot t and the user parameters. Combining reconfigurable intelligent surface unit selection, coordinated discrete phase shift control, and the power allocation strategy, the long-term energy efficiency optimization problem is modeled as a decentralized partially observable Markov decision process. After this conversion, the optimization function is as follows: [equation not preserved in the source text]
where the symbols denote, in order: a positive factor that controls the trade-off between energy efficiency and transmission reliability; a non-negative parameter that imposes a penalty for violating the data rate requirement; the data rate limit (a fixed value in each time slot); the data rate in each time slot; the number of antennas; and the users served cooperatively by the access points and the reconfigurable intelligent surfaces.
Its global reward function can be expressed as: [equation not preserved in the source text]
Step 2.2: more efficient cooperative learning is achieved by integrating two techniques, graph embedding and differentiated rewards. The agents represent the wireless access points and the reconfigurable intelligent surfaces, and the interactions between agents represent the wireless communication environment and its communication mode. The agents and their interactions are modeled as a directed communication graph, in which each agent is modeled as a node i, each interaction between agents is modeled as a directed edge, and node features and edge features are attached to the nodes and edges respectively (the symbols for the graph, edges, and features are not preserved in the source text).
The node features of wireless access point i include the spatial channel information from the access point to its associated devices, the queue information of its associated users, and the local action-observation history of the access point: [equation not preserved in the source text]
The edge features characterize the interaction from one agent to another, which can be expressed mathematically as: [equation not preserved in the source text]
Step 2.3: because graph nodes and edges have high-dimensional features in a large-scale network, an action generation module based on graph embedding is provided, and a message passing graph neural network is maintained at each distributed node. Like a multi-layer perceptron, the message passing graph neural network has a layered structure. In each layer, each agent first transmits its embedded information to its neighboring agents, then aggregates the embedded information received from them and updates its local hidden state. The message passing process is as follows: [equation not preserved in the source text]
where the two operators are a message function and an update operation. After the graph embedding module, each agent uses a gated recurrent unit to predict its local action from the output local embedded state. The gated recurrent unit is a simplified variant of the long short-term memory network. The local embedded state is as follows: [equation not preserved in the source text]
The local action taken by each agent is sampled from its action sub-policy.
Step 2.4: the combined parameters of the graph embedding module and the action generation module in the distributed policy are denoted by a single parameter symbol. Our goal is to maximize the performance function: [equation not preserved in the source text]
where the defined quantity is the joint state transition that follows the joint policy. Based on the advantage function, a policy gradient is calculated, which is given by: [equation not preserved in the source text]
where the two value terms are the global state value and the global state-action value. To solve the credit assignment problem during training, the distributed network is trained using value decomposition, in which the global state value is decomposed into a combination of local state values through a mixing function, as shown in the following equation: [equation not preserved in the source text]
where the decomposed terms are the local state values of the individual agents. In the centralized training process, each agent receives a differentiated reward obtained by evaluating, from its local graph embedding features, its contribution to the improvement of the global reward, which further promotes coordination between agents. One symbol denotes the weight parameters of the distributed network, which are shared among agents, and another denotes the weights of the mixing network. The distributed network and the mixing network are optimized by mini-batch gradient descent so that the following loss is minimized: [equation not preserved in the source text]
where the target is the n-step return from the last state, and the upper limit of n is T. The parameters of the mixing network can be updated by the following formula: [equation not preserved in the source text]
where the step size is the learning rate of the mixing network update. The weight parameters of the non-output layers in the distributed network are further shared, and the combined weight parameters of the distributed network are denoted by a single symbol. The gradient with respect to these parameters can be calculated as: [equation not preserved in the source text]
The update rule for the distributed network can be derived as: [equation not preserved in the source text]
where the two step sizes are the policy improvement learning rate and the critic learning rate, respectively.
Step 3 is specifically as follows:
Step 3.1: the power Internet of Things data obtained by actual observation are input, as the agents' observation states and environmental information, into the graph-embedding-based network update algorithm, and the network parameters and the network learning rates are initialized.
Step 3.2: a batch of data is drawn from the experience pool, the policy gradient and the network loss are calculated according to the formulas derived in step 2.4, and the mixing network parameters are updated with the mixing network parameter update formula in step 2.4.
Step 3.3: the network parameters in the power Internet of Things are further updated according to the distributed network parameter update rule in step 2.4 until the network converges.
Step 3.4: the trained network parameters are updated periodically, or the network is retrained and its parameters updated when the power Internet of Things changes significantly, so that the access requirements of the devices in the power Internet of Things are met and customized communication is achieved.
Compared with the prior art, the beneficial effects of the application are as follows. The application provides a cooperation framework for wireless access points and reconfigurable intelligent surfaces that addresses the requirements of the power Internet of Things, so that the access requirements of a massive number of devices are met. Through cooperation between the wireless access points and the reconfigurable intelligent surfaces, the system energy efficiency is dynamically maximized and efficient communication is achieved. In addition, the application provides a graph-embedding-based wireless network representation method that models the very large wireless communication network as a graph and reduces its dimensionality with a graph embedding method to obtain an efficient graph representation. The method effectively reduces model training complexity and enables highly customized communication.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the application and should therefore not be regarded as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
FIG. 1 is a flow chart of a method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
Referring to fig. 1, the present application provides a method for a wireless access point to cooperate with a reconfigurable intelligent surface, which includes the following steps.
Step 1: build a device communication architecture based on the power Internet of Things. The network architecture comprises M pre-installed access points and J reconfigurable intelligent surfaces. The cooperative relationship of each access point with its adjacent access points and reconfigurable intelligent surfaces is modeled as interaction between agents, that is, as the edges of the graph neural network input. The input topology of a message passing graph neural network is constructed in this way, and an embedded representation of the topology is obtained with the message passing graph neural network, so as to provide service to power Internet of Things terminals;
Step 2: according to the established device communication architecture based on the power Internet of Things, design a corresponding access point and reconfigurable intelligent surface cooperation method that aims to maximize system energy efficiency while meeting the quality-of-service requirements of the massive number of devices in the power Internet of Things with respect to data transmission rate and reliability;
Step 3: based on the access point and reconfigurable intelligent surface cooperation method provided in step 2, each access point cooperates with the reconfigurable intelligent surfaces according to the trained model so as to meet the access requirements of the massive number of devices in the power Internet of Things.
Preferably, step 1 is specifically as follows:
Step 1: in the device communication architecture of the power Internet of Things, the pre-installed access points and the reconfigurable intelligent surfaces in the network are each denoted by a set (the set symbols are not preserved in the source text). The M wireless access points and the J reconfigurable intelligent surfaces are represented as distinct agent nodes, that is, as the nodes of the graph neural network input. The access information of the power Internet of Things devices and the hybrid spatial beam configuration between the wireless access points and the reconfigurable intelligent surfaces are taken as features of the graph topology and input into a message passing graph neural network, whose message passing mechanism yields a stable graph embedding of the node features.
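As an illustration of Step 1, the following minimal sketch builds the input topology for a small network of access points and reconfigurable intelligent surfaces. The 2-D coordinates, the cooperation radius, the choice to connect only pairs that involve at least one access point, and the names `Node`, `build_topology`, and `coop_radius` are all assumptions made for the example; the application does not state how adjacency is determined.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                      # e.g. "AP0" or "RIS1" (hypothetical labels)
    kind: str                      # "AP" or "RIS"
    pos: tuple                     # hypothetical 2-D position in metres
    features: dict = field(default_factory=dict)


def build_topology(ap_positions, ris_positions, coop_radius):
    """Return (nodes, edges); edges are directed (src, dst) pairs of node indices."""
    nodes = [Node(f"AP{i}", "AP", p) for i, p in enumerate(ap_positions)]
    nodes += [Node(f"RIS{j}", "RIS", p) for j, p in enumerate(ris_positions)]
    edges = []
    for a, na in enumerate(nodes):
        for b, nb in enumerate(nodes):
            if a == b:
                continue
            # Assumption: cooperation always involves at least one access point.
            if na.kind == "RIS" and nb.kind == "RIS":
                continue
            # Assumption: two nodes are adjacent when closer than coop_radius.
            if math.dist(na.pos, nb.pos) <= coop_radius:
                edges.append((a, b))
    return nodes, edges


# M = 2 access points and J = 1 reconfigurable intelligent surface.
nodes, edges = build_topology(
    ap_positions=[(0, 0), (50, 0)],
    ris_positions=[(25, 10)],
    coop_radius=60.0,
)
print([n.name for n in nodes], edges)
```

The node list and directed edge list produced here are the kind of input a message passing graph neural network would consume, with access information and beam configuration stored in each node's `features`.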
Preferably, step 2 is specifically as follows:
Step 2.1: a massive number of devices is deployed at the network edge of the power Internet of Things, so a high-performance access framework for massive devices needs to be carefully designed. By designing the cooperation between access points and reconfigurable intelligent surfaces, hybrid beams can be reconfigured flexibly and in a coordinated manner, so that devices access the communication network in a coordinated manner and customizable intelligent communication is achieved. Therefore, to dynamically maximize the system energy efficiency of the cooperation between the wireless access points and the reconfigurable intelligent surfaces, the objective function of the system can be expressed as: [equation not preserved in the source text]
where the term denotes the network energy efficiency of time slot t. This objective can be modeled as a constrained Markov decision process. However, solving the problem in a centralized manner is computationally inefficient because of the large joint state-action space and the overhead of exchanging high-dimensional information between the many wireless access points and reconfigurable intelligent surfaces and a centralized controller. To address the problem efficiently and with low complexity, and to maximize network energy efficiency while guaranteeing diversified user performance, the long-term energy efficiency optimization problem is modeled as a decentralized partially observable Markov decision process that combines reconfigurable intelligent surface unit selection, coordinated discrete phase shift control, and the power allocation strategy. Specifically, the partially observable Markov decision process provides a general framework for describing Markov decision processes with incomplete information, and the decentralized partially observable Markov decision process extends it to decentralized settings.
Based on Lyapunov optimization theory, the above optimization problem is converted into a decentralized partially observable Markov decision process, and the converted optimization function is as follows: [equation not preserved in the source text]
where the symbols denote, in order: a positive factor that controls the trade-off between energy efficiency and transmission reliability; a non-negative parameter that imposes a penalty for violating the data rate requirement; the data rate limit (a fixed value in each time slot); the data rate in each time slot; the number of antennas; and the users served cooperatively by the access points and the reconfigurable intelligent surfaces.
Its global reward function can be expressed as: [equation not preserved in the source text]
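The text describes the reward only through its ingredients: a trade-off factor, a penalty parameter, a data rate limit, and per-slot data rates. The sketch below shows one reward built from those ingredients. The functional form (weighted energy efficiency minus a penalty proportional to the rate shortfall), the names `tradeoff` and `penalty`, and all numbers are assumptions for illustration and should not be read as the formula of the application.

```python
def energy_efficiency(rates_bps, total_power_w):
    """Energy efficiency as sum rate divided by consumed power (bit per joule)."""
    return sum(rates_bps) / total_power_w


def global_reward(rates_bps, total_power_w, rate_limit_bps, tradeoff, penalty):
    """Assumed reward: weighted energy efficiency minus a data-rate violation penalty."""
    # Shortfall is zero for every user whose rate meets the limit.
    shortfall = sum(max(0.0, rate_limit_bps - r) for r in rates_bps)
    return tradeoff * energy_efficiency(rates_bps, total_power_w) - penalty * shortfall


# Both users meet the limit: no penalty is applied.
r_ok = global_reward([2.0, 3.0], 1.0, rate_limit_bps=1.5, tradeoff=1.0, penalty=10.0)
# One user falls 0.5 below the limit: the penalty outweighs the efficiency term.
r_bad = global_reward([2.0, 1.0], 1.0, rate_limit_bps=1.5, tradeoff=1.0, penalty=10.0)
print(r_ok, r_bad)
```

A larger `penalty` pushes the learned policy toward reliability, while a larger `tradeoff` favours energy efficiency.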
Step 2.2: the optimization problem described in step 2.1 could be solved with a conventional multi-agent reinforcement learning method. However, because adjacent agents must exchange information to cooperate, such a method incurs high communication overhead and delay when processing high-dimensional information, and is therefore inefficient for the highly coupled decentralized partially observable Markov decision process. The centralized training and decentralized execution paradigm common in existing multi-agent reinforcement learning algorithms is therefore extended, and more efficient cooperative learning is achieved by integrating two techniques, graph embedding and differentiated rewards. The agents represent the wireless access points and the reconfigurable intelligent surfaces. The interactions between agents represent the wireless communication environment and its communication mode. The agents and their interactions are modeled as a directed communication graph, in which each agent is modeled as a node i, each interaction between agents is modeled as a directed edge, and node features and edge features are attached to the nodes and edges respectively (the symbols for the graph, edges, and features are not preserved in the source text).
The node features of wireless access point i include the spatial channel information from the access point to its associated devices, the queue information of its associated users, and the local action-observation history of the access point: [equation not preserved in the source text]
The edge features characterize the interaction from one agent to another, which can be expressed mathematically as: [equation not preserved in the source text]
Step 2.3: because graph nodes and edges have high-dimensional features in a large-scale network, an action generation module based on graph embedding is provided. The module uses a message passing graph neural network to learn low-dimensional embedded features of the directed graph, which can effectively improve the generalization ability of the network and strengthen the cooperation between the wireless access points and the reconfigurable intelligent surfaces while requiring only low information exchange overhead.
A message passing graph neural network is maintained at each distributed node. Like a multi-layer perceptron, the message passing graph neural network has a layered structure. Within each layer, each agent first transmits its embedded information to its neighboring agents, then aggregates the embedded information received from them and updates its local hidden state. The message passing process is as follows: [equation not preserved in the source text]
where the two operators are a message function and an update operation. After the graph embedding module, each agent uses a gated recurrent unit to predict its local action from the output local embedded state. The gated recurrent unit is a simplified variant of the long short-term memory network. The local embedded state is as follows: [equation not preserved in the source text]
The local action taken by each agent is sampled from its action sub-policy.
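The send, aggregate, and update pattern described in Step 2.3 can be sketched in plain Python. The concrete message function (neighbour embedding scaled by a scalar edge feature), the sum aggregation, the `tanh` update, and the use of the embedding directly as action logits are assumptions chosen for brevity. The gated recurrent unit stage is omitted, and none of the names below come from the application.

```python
import math
import random


def message(h_src, e_feat):
    # Assumed message function: neighbour embedding scaled by a scalar edge feature.
    return [e_feat * x for x in h_src]


def update(h_old, agg):
    # Assumed update operation: squash the sum of old state and aggregated messages.
    return [math.tanh(a + b) for a, b in zip(h_old, agg)]


def message_passing_layer(hidden, edges, edge_feat):
    """hidden: {node: vector}; edges: directed (src, dst); edge_feat: {(src, dst): float}."""
    dim = len(next(iter(hidden.values())))
    agg = {n: [0.0] * dim for n in hidden}
    for src, dst in edges:
        m = message(hidden[src], edge_feat[(src, dst)])
        agg[dst] = [a + b for a, b in zip(agg[dst], m)]   # sum over incoming messages
    # All nodes update synchronously from the previous layer's hidden states.
    return {n: update(hidden[n], agg[n]) for n in hidden}


def sample_action(logits, rng):
    # Softmax over the logits, then draw one action index from that distribution.
    mx = max(logits)
    weights = [math.exp(v - mx) for v in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


hidden = {"AP0": [1.0, 0.0], "AP1": [0.0, 1.0], "RIS0": [0.0, 0.0]}
edges = [("AP0", "RIS0"), ("AP1", "RIS0"), ("RIS0", "AP0")]
edge_feat = {e: 0.5 for e in edges}

new_hidden = message_passing_layer(hidden, edges, edge_feat)
action = sample_action(new_hidden["AP0"], random.Random(0))
print(new_hidden, action)
```

Stacking several such layers lets information from agents more than one hop away reach each node while every exchange stays local to neighbouring agents.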
Step 2.4: The joint parameters of the graph embedding module and the action generation module in the distributed policy are denoted by a single parameter set. The goal is to maximize the performance function:
where the joint state transition follows the joint policy. The policy gradient is therefore computed from the advantage function, and is given by:
where the input term is the actual input of the graph embedding, and the temporal-difference advantage is given by:
where the two value terms are the global state value and the global state-action value. To solve the credit assignment problem during training, the distributed network is trained with value decomposition, which decomposes the global state value into a combination form with a mixing function, as shown in the following equation:
where each local term is the local state value of an agent. In the centralized training process, each agent receives different rewards by evaluating its contribution to the global reward improvement based on its local graph-embedding features, which further facilitates coordination between agents. The weight parameters of the distributed network are shared among agents, and the mixing network has its own weights. The distributed network and the mixing network are optimized by mini-batch gradient descent, minimizing the following loss:
where the return term is the n-step return from the last state, with n upper-bounded by T. The parameters of the mixing network can then be updated by:
where the step size is the learning rate of the mixing network update. To reduce complexity, the weight parameters of the non-output layers in the distributed network are further shared, giving a combined weight parameter set for the distributed network. Accordingly, the gradient with respect to this combined parameter set can be calculated as:
Thus, the update rule for the distributed network can be derived as:
where the two step sizes are the policy improvement learning rate and the critic learning rate, respectively.
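The quantities named in step 2.4 can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration and is not taken from the formulas of this application: the discount factor `gamma`, the advantage computed as the global state-action value minus the global state value, the mixing function written as a non-negative weighted sum of local state values, and the mean squared error between n-step returns and predicted values.

```python
def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted n-step return: the rewards of n steps plus the
    discounted value estimate of the last state (n = len(rewards))."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def td_advantage(q_value, v_value):
    """Assumed advantage form: global state-action value minus global state value."""
    return q_value - v_value


def mix_values(local_values, weights, bias=0.0):
    """Hypothetical mixing function: a weighted sum of local state values.
    Taking abs() keeps every weight non-negative, so the global value
    never decreases when a local value increases."""
    return sum(abs(w) * v for w, v in zip(weights, local_values)) + bias


def critic_loss(returns, values):
    """Mean squared error between n-step returns and predicted global values."""
    return sum((g - v) ** 2 for g, v in zip(returns, values)) / len(returns)


def sgd_step(params, grads, learning_rate):
    """Plain gradient-descent update, applied per parameter."""
    return [p - learning_rate * g for p, g in zip(params, grads)]
```

With these helpers, a mixing-network update amounts to calling `sgd_step` with the mixing learning rate, and the distributed-network update to calling it with the policy improvement and critic learning rates on the corresponding gradients.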
Preferably, step 3 is specifically as follows:
Step 3.1: The power Internet of Things data obtained by actual observation are input, as the observation states of the agents together with environmental information, into the graph-embedding-based network update algorithm; the network parameters and the network learning rates are initialized.
Step 3.2: A batch of data is drawn from the experience pool, the policy gradient and the network loss are calculated according to the formulas derived in step 2.4, and the mixing network parameters are updated using the mixing network parameter update formula in step 2.4.
Step 3.3: The network parameters in the power Internet of Things are further updated according to the distributed network parameter update rule in step 2.4 until the network converges.
Step 3.4: The trained network parameters are updated periodically, or are retrained and updated when the power Internet of Things changes significantly, so that the access requirements of devices in the power Internet of Things are met and customized communication is realized.
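Steps 3.1 to 3.4 can be read as a conventional batch-training loop. The sketch below shows that control flow on a toy problem and is not the claimed procedure: a single parameter list stands in for both the mixing network and the distributed network, the quadratic loss replaces the loss of step 2.4, and the names `train_until_converged`, `needs_retraining`, `tol` and `threshold` are hypothetical.

```python
import random


def quadratic_loss_and_grad(params, batch):
    """Toy stand-in for the network loss: fit one scalar to the batch targets."""
    w = params[0]
    loss = sum((w - y) ** 2 for y in batch) / len(batch)
    grad = sum(2.0 * (w - y) for y in batch) / len(batch)
    return loss, [grad]


def train_until_converged(pool, params, loss_and_grad, learning_rate,
                          batch_size, tol, max_iters, rng):
    """Sample a batch from the experience pool, compute the loss and its
    gradient, update the parameters, and stop once the loss stops changing."""
    prev_loss = float("inf")
    loss = float("inf")
    for _ in range(max_iters):
        batch = rng.sample(pool, min(batch_size, len(pool)))
        loss, grads = loss_and_grad(params, batch)
        params = [p - learning_rate * g for p, g in zip(params, grads)]
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return params, loss


def needs_retraining(recent_loss, baseline_loss, threshold=2.0):
    """Hypothetical trigger for step 3.4: retrain when the loss observed in
    operation has grown well beyond the loss reached at training time."""
    return recent_loss > threshold * baseline_loss
```

The convergence test on the change in loss is one possible reading of "until the network converges"; a fixed iteration budget or a validation metric would fit the description equally well.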
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Claims (1)
1. A method for cooperation between a wireless access point and a reconfigurable intelligent surface is characterized by comprising the following steps:
step 1: building a device communication architecture based on the power Internet of Things, wherein the device communication architecture comprises M pre-installed access points and J reconfigurable intelligent surfaces; the cooperative relationship of each access point with its adjacent access points and reconfigurable intelligent surfaces is modeled as interaction between agents, namely as edges in the graph neural network input; the input topology of a message passing graph neural network is constructed, and the message passing graph neural network is used to obtain an embedded representation of the topology so as to serve power Internet of Things terminals;
step 2: according to the established device communication architecture based on the power Internet of Things, designing a corresponding access point and reconfigurable intelligent surface cooperation method that aims to maximize system energy efficiency while meeting the quality-of-service requirements of massive devices in the power Internet of Things in terms of data transmission rate and reliability;
step 3: based on the access point and reconfigurable intelligent surface cooperation method provided in step 2, each access point cooperates with the reconfigurable intelligent surfaces according to the trained model so as to meet the access requirements of massive devices in the power Internet of Things;
the step 1 is specifically as follows:
in the device communication architecture of the power Internet of Things, the pre-installed access points in the network and the reconfigurable intelligent surfaces in the network are each given their own notation; the M wireless access points and the J reconfigurable intelligent surfaces are represented as distinct agent nodes, namely as nodes in the graph neural network input; the access information of the power Internet of Things devices and the hybrid spatial beam configuration between the multiple wireless access points and the multiple reconfigurable intelligent surfaces are taken as features in the graph topology and input to the message passing graph neural network, and a stable graph-embedded representation of the node features is obtained through the message passing mechanism of the message passing graph neural network;
the step 2 is specifically as follows:
step 2.1: modeling the system energy efficiency optimization problem as a decentralized partially observable Markov decision process;
in order to dynamically maximize the system energy efficiency of the cooperation between the wireless access points and the reconfigurable intelligent surfaces, the objective function of the system can be expressed as:
where the terms are the network energy efficiency in time slot t and the user parameters; combining reconfigurable intelligent surface element selection, discrete phase shift control coordination and the power allocation strategy, the system energy efficiency optimization problem is modeled as a decentralized partially observable Markov decision process, and after this conversion the converted optimization function is as follows:
where the terms are: a positive coefficient that controls the trade-off between energy efficiency and transmission reliability; a non-negative parameter that imposes a penalty on violating the data rate constraint; the data rate limit, which takes a fixed value in each time slot; the data rate in each time slot; the number of antennas; and the users jointly served by the access points and the reconfigurable intelligent surfaces,
the global reward function can be expressed as:
step 2.2: achieving more efficient cooperative learning by integrating two techniques, graph embedding and different rewards;
the agents represent the wireless access points and the reconfigurable intelligent surfaces, and the interaction between agents represents the wireless communication environment and its communication mode; the agents and the interactions between them are modeled as a directed communication graph, in which each agent is modeled as a node i and each interaction between agents is modeled as a directed edge, with separate notation for the node features and the edge features,
the node features of wireless access point i include the spatial channel information from the access point to its associated devices, the queue information of associated users, and the local action-observation history of the access point:
the edge features describe the interaction from one agent to another, and can be expressed mathematically as:
step 2.3: maintaining a message passing graph neural network at each distributed node i, wherein in each message passing graph neural network layer, each agent firstly transmits embedded information to adjacent agents, and then aggregates the embedded information from the adjacent agents and updates the local hidden state of the agents;
the message passing process is shown as follows:
where the message-passing formula is built from a message function and an update operation; after the graph embedding module, each agent uses a gated recurrent unit to predict its local action from the output local embedded state, wherein the gated recurrent unit is a simplified variant of the long short-term memory network, and the local embedded state is shown as follows:
the local action taken by each agent is sampled from that agent's action sub-policy;
step 2.4: the joint parameters of the graph embedding module and the action generation module in the distributed policy are denoted by a single parameter set, and the goal is to maximize the performance function:
where the joint state transition follows the joint policy; based on the advantage function, the policy gradient is calculated, which is given by:
where the input term is the actual input of the graph embedding, and the temporal-difference advantage is given by:
where the two value terms are the global state value and the global state-action value; to solve the credit assignment problem during training, the distributed network is trained using value decomposition, which decomposes the global state value into a combination form with a mixing function, as shown in the following equation:
where each local term is the local state value of an agent; in the centralized training process, each agent receives different rewards by evaluating its contribution to the global reward improvement based on its local graph-embedding features, to further facilitate coordination between agents; the weight parameters of the distributed network are shared among agents, and the mixing network has its own weights; the distributed network and the mixing network are optimized by mini-batch gradient descent such that the following loss is minimized:
where the return term is the n-step return from the last state, with n upper-bounded by T; the parameters of the mixing network may be updated by:
where the step size is the learning rate of the mixing network update; the weight parameters of the non-output layers in the distributed network are further shared, giving a combined weight parameter set for the distributed network, and the gradient with respect to this combined parameter set can be calculated as:
the update rule for the distributed network can be derived as:
where the two step sizes are the policy improvement learning rate and the critic learning rate, respectively;
the step 3 is specifically as follows:
step 3.1: the power Internet of Things data obtained by actual observation are input, as the observation states of the agents together with environmental information, into the graph-embedding-based network update algorithm, and the network parameters and the network learning rates are initialized,
step 3.2: a batch of data B is drawn from the experience pool, the policy gradient and the network loss are calculated according to the formulas derived in step 2.4, and the mixing network parameters are updated using the mixing network parameter update formula in step 2.4,
step 3.3: the network parameters in the power Internet of Things are further updated according to the distributed network parameter update rule in step 2.4 until the network converges,
step 3.4: the trained network parameters are updated periodically, or are retrained and updated when the power Internet of Things changes significantly, so that the access requirements of devices in the power Internet of Things are met and customized communication is realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211429707.5A CN115499849B (en) | 2022-11-16 | 2022-11-16 | Wireless access point and reconfigurable intelligent surface cooperation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211429707.5A CN115499849B (en) | 2022-11-16 | 2022-11-16 | Wireless access point and reconfigurable intelligent surface cooperation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115499849A (en) | 2022-12-20 |
CN115499849B (en) | 2023-04-07 |
Family
ID=85115737
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211429707.5A Active CN115499849B (en) | 2022-11-16 | 2022-11-16 | Wireless access point and reconfigurable intelligent surface cooperation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115499849B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111786713A (en) * | 2020-06-04 | 2020-10-16 | 大连理工大学 | Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning |
CN113472419A (en) * | 2021-06-23 | 2021-10-01 | 西北工业大学 | Safe transmission method and system based on space-based reconfigurable intelligent surface |
CN115103372A (en) * | 2022-06-17 | 2022-09-23 | 东南大学 | Multi-user MIMO system user scheduling method based on deep reinforcement learning |
CN115310775A (en) * | 2022-07-13 | 2022-11-08 | 武汉大学 | Multi-agent reinforcement learning rolling scheduling method, device, equipment and storage medium |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3776370A1 (en) * | 2018-05-18 | 2021-02-17 | Deepmind Technologies Limited | Graph neural network systems for behavior prediction and reinforcement learning in multple agent environments |
CN111612126B (en) * | 2020-04-18 | 2024-06-21 | 华为技术有限公司 | Method and apparatus for reinforcement learning |
US11546022B2 (en) * | 2020-04-29 | 2023-01-03 | The Regents Of The University Of California | Virtual MIMO with smart surfaces |
JP7307825B2 (en) * | 2021-02-01 | 2023-07-12 | 株式会社Nttドコモ | Method and apparatus for user location and tracking using radio signals reflected by reconfigurable smart surfaces |
CN113573293B (en) * | 2021-07-14 | 2022-10-04 | 南通大学 | Intelligent emergency communication system based on RIS |
CN114422056B (en) * | 2021-12-03 | 2023-05-23 | 北京航空航天大学 | Space-to-ground non-orthogonal multiple access uplink transmission method based on intelligent reflecting surface |
CN114286369B (en) * | 2021-12-28 | 2024-02-27 | 杭州电子科技大学 | AP and RIS joint selection method of RIS auxiliary communication system |
CN114466388B (en) * | 2022-02-16 | 2023-08-08 | 北京航空航天大学 | Intelligent super-surface-assisted wireless energy-carrying communication method |
CN115333143B (en) * | 2022-07-08 | 2024-05-07 | 国网黑龙江省电力有限公司大庆供电公司 | Deep learning multi-agent micro-grid cooperative control method based on double neural networks |
CN115146538A (en) * | 2022-07-11 | 2022-10-04 | 河海大学 | Power system state estimation method based on message passing graph neural network |
-
2022
- 2022-11-16 CN CN202211429707.5A patent/CN115499849B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN115499849A (en) | 2022-12-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mocanu et al. | On-line building energy optimization using deep reinforcement learning | |
Zeb et al. | Industrial digital twins at the nexus of NextG wireless networks and computational intelligence: A survey | |
CN113282368B (en) | Edge computing resource scheduling method for substation inspection | |
Liu et al. | Federated reinforcement learning for decentralized voltage control in distribution networks | |
Chen et al. | Mean field deep reinforcement learning for fair and efficient UAV control | |
Shi et al. | Machine learning for large-scale optimization in 6g wireless networks | |
Abdullahi et al. | A survey of symbiotic organisms search algorithms and applications | |
Zhang et al. | Consensus Transfer ${Q} $-Learning for Decentralized Generation Command Dispatch Based on Virtual Generation Tribe | |
CN112598150B (en) | Method for improving fire detection effect based on federal learning in intelligent power plant | |
Kumari et al. | An energy efficient smart metering system using edge computing in LoRa network | |
WO2017114810A9 (en) | Methods, controllers and systems for the control of distribution systems using a neural network architecture | |
Hsieh et al. | AQ-learning-based swarm optimization algorithm for economic dispatch problem | |
Xia et al. | Intelligent task offloading and collaborative computation in multi-UAV-enabled mobile edge computing | |
Hlophe et al. | AI meets CRNs: A prospective review on the application of deep architectures in spectrum management | |
Zhou et al. | Hierarchical multi-agent deep reinforcement learning for energy-efficient hybrid computation offloading | |
Qin et al. | Dynamic IoT service placement based on shared parallel architecture in fog-cloud computing | |
Zhang et al. | Backtracking search algorithm with dynamic population for energy consumption problem of a UAV-assisted IoT data collection system | |
CN115499849B (en) | Wireless access point and reconfigurable intelligent surface cooperation method | |
Si et al. | When spectrum sharing in cognitive networks meets deep reinforcement learning: Architecture, fundamentals, and challenges | |
Li et al. | Toward Reinforcement-Learning-Based Intelligent Network Control in 6G Networks | |
KR102515287B1 (en) | Intelligent home energy mangement system and method based on federated learning | |
Zhang et al. | Application of artificial intelligence for space-air-ground-sea integrated network | |
Rodway et al. | Differential evolution optimized fuzzy controller for wireless sensor network energy management | |
Chen et al. | Joint optimization of UAV-WPT and mixed task offloading strategies with shared mode in SAG-PIoT: A MAD4PG approach | |
Zhang | Artificial Intelligence for Digital Twin |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||