CN114861792A - Complex power grid key node identification method based on deep reinforcement learning
- Publication number
- CN114861792A (application CN202210484829.8A)
- Authority
- CN
- China
- Prior art keywords
- power grid
- complex
- node
- reinforcement learning
- nodes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
A method for identifying key nodes in a complex power grid based on deep reinforcement learning, belonging to the technical field of electric power big data processing. The method applies the interactive-learning idea of the deep reinforcement learning model DDQN (Double Deep Q-Network): an agent explores on its own initiative, and the experience data formed from environment, action and reward information is used to compute the Q value of each action in a given state, thereby evaluating the value of taking that action in that state of the complex power grid. Being data-driven, the method overcomes the limitations in adaptability, algorithmic efficiency and accuracy that mechanism-based mathematical models face in the complex environment of a power distribution network. It avoids the distributional assumptions and feature modeling of states that traditional complex power grid methods perform on the basis of extensive prior knowledge, reduces the complexity of key node identification, is better suited to large-scale grids in a big-data setting, and offers higher robustness and accuracy.
Description
Technical Field
The invention relates to a complex power grid key node identification method based on deep reinforcement learning, and belongs to the technical field of electric power big data processing.
Background
The electric power system is a basic guarantee for the stable and healthy development of the economy and for people's livelihoods worldwide, and its safe, stable and continuous operation is crucial to social development. With the rapid development of human society, electricity demand has grown sharply and the scale and complexity of power grids keep increasing, making grid systems highly complex. New goals such as carbon peaking and carbon neutrality further promote the development of new-energy power, but as the share of new energy connected to the power system keeps growing and changing, the coupling between different parts of the grid increases, which raises both the probability that disturbance signals occur and their ability to propagate. Disturbance signals propagate through the nodes of the grid. When a key link of the power system is disturbed, the disturbance spreads more widely and its impact is deeper, systemic accidents become more likely, and the resulting financial and human losses can be severe. A complex power grid has a heterogeneous topology: key nodes are few, but their failure can greatly affect the topology and operation of the network and can even spread rapidly to the whole network. If a key link in a complex power grid fails, a large-area blackout can result, causing considerable losses.
Power grid accidents are not caused by uncontrollable factors alone; the instability of the power system itself also bears some responsibility. It is therefore of great significance for the safe and stable operation of the power system to improve its stability, identify the key nodes of the complex power grid, and use limited financial and material resources to deploy monitoring and add protection in a targeted way.
Existing key node identification methods fall into two categories. The first relies on the static characteristics of the grid and uses pure topology, such as topological betweenness and degree centrality, to identify the key nodes of the complex power grid. Such methods are computationally simple but inaccurate on real grids because they ignore the physical characteristics of electrical engineering. The second combines topology with electrical characteristics, for example topological entropy and energy functions. These methods demand very rich prior knowledge and can hardly avoid the influence of subjective factors on the model, so the resulting mechanism-analysis models struggle to describe high-dimensional, complex and time-varying characteristics accurately and have low robustness when applied to large-scale grids.
Traditional models built on mathematical statistics and on prior knowledge combined with mechanism analysis are inadequate for this task, for at least two reasons. First, modeling data measured in the real world requires a great deal of prior knowledge, and the quality of the modeling directly affects the performance of the model and the robustness of the identification result. Second, the actual electrical data of a complex power grid is often influenced by a coupled multi-physics system; its characteristics are complex and closely interrelated, so fitting a model generally requires a very large, sometimes prohibitive, amount of computation.
Deep reinforcement learning learns the characteristics of a problem adaptively from data, depends little on a specific mathematical model, and can transfer what it has learned from a source domain. The deep reinforcement learning model DDQN (Double Deep Q-Network) is a typical model that is widely applied in the field. It has a low modeling cost and, being fully data-driven, adaptively learns the latent characteristics of the data. Through double-Q iterative updates it guides the model to narrow the gap to the target distribution, overcomes the difficulty of computing and converging a dynamically changing Q-value function with a regression model, and avoids the Q-value overestimation of the DQN model.
A complex power grid key node identification method based on deep reinforcement learning is therefore proposed. First, the complex power grid is abstracted into an intuitive, simply described unweighted undirected graph of the connections between nodes and links. Second, the node and link attributes of the topology are preprocessed and normalized to ease subsequent computation; in designing the reward function, weights are assigned to the resulting topological and electrical information, which is integrated with the abstract topology graph to form a weighted undirected topology. In designing the neural network, a classical CNN model is used to improve feature extraction from the agent's exploration data; the DDQN duplicates the convolutional neural network (CNN) to form a double-Q network, double iterative perception improves the convergence speed and feature-extraction ability of the model, and back-propagation drives the model to produce the Q-value probability distribution that is most similar to the known information and consistent with the data obtained from the agent's interaction with the complex power grid. The method overcomes the insufficient feature extraction between highly coupled nodes and the poor robustness that conventional explicit-modeling methods may suffer. In addition, the model can be applied to other networks with more complex characteristics: it is only necessary to add neural network layers and training as complexity grows, and to increase the number of agent explorations so that the latent characteristics of the data are fully extracted. The model therefore has strong generalization capability and stability.
By combining the grid topology with prior knowledge such as electrical distance, a reward matrix for the complex power grid environment is constructed. The agent's large volume of interaction data and the reward matrix together form the environment to be solved, and the deep double-Q network of deep reinforcement learning is applied to learn the optimal strategy for the target task. The result is an efficient and robust artificial intelligence method suited to key node identification in complex power grids.
Disclosure of Invention
The invention aims to provide a complex power grid key node identification method based on deep reinforcement learning that addresses the complex modeling, low identification efficiency and low robustness of traditional power grid key node identification methods.
The invention uses a deep double-Q network (DDQN), which has strong decision-making and autonomous feature-extraction capabilities, as its main framework, and combines it with double Q-value iterative perception to summarize the regularities in exploration data with complex relationships. The method obtains the latent characteristics of the data through adaptive exploration and exploitation and unsupervised learning, and overcomes the shortcomings in adaptability, algorithmic efficiency and accuracy that arise when complex grid modeling requires a large amount of prior knowledge. The method also has good robustness.
First, the static complex power grid is abstracted in a preprocessing step. Based on graph theory, all large-component nodes, including generators and transformers, and their links are simplified into points and lines. Only the existence of the nodes and their physical connections are considered, so that the complex power grid is abstracted into a purely unweighted undirected graph.
Next, the grid topology attributes and the electrical-engineering physical characteristic data are normalized. The topology attributes that are collected include node degree (in- and out-degree) and node strength; the electrical-engineering physical characteristics include the electrical distance, that is, the equivalent impedance. These data undergo (0,1) normalization, which makes differences within the same kind of data easier to compare and addresses the dimension-explosion problem of the neural network.
Then, a reward function is designed using a Gaussian function and a combined subjective-objective weighting method. The smaller the equivalent impedance, the tighter the connection between two nodes, so the Gaussian function is used to apply inverse-distance weighting to the electrical distance represented by the equivalent impedance, giving a larger weight to node pairs with smaller equivalent impedance. The weighting method then sets the proportion of each attribute in the reward function, yielding a deep reinforcement learning reward function suited to the selected static complex network.
After that, an experience pool records the agent's exploration process, and the exploration data is fed to the dual neural networks, which iteratively perceive the Q values. Recording exploration in the experience pool balances the exploration and exploitation mechanisms and increases data reuse. Each record comprises the node state s, the action a, the reward r and the next node state s', packed as a tuple of the form (s, a, r, s') and stored in the experience pool for training the neural network on the Q values of different node states. In the double-Q network, the two neural networks predict the Q values of state-action pairs at different time steps and are iterated after a set number of time steps. This dual perception makes the loss function fall faster, provides a more stable regression prediction scheme, and makes the result more accurate.
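The experience pool described above can be illustrated with a minimal sketch. The class name `ExperiencePool`, the fixed capacity, and uniform random sampling are assumptions made for illustration; the source only states that tuples of the form (s, a, r, s') are stored and reused for training.

```python
import random
from collections import deque, namedtuple

# One stored record (s, a, r, s'), as described in the text.
Transition = namedtuple("Transition", ["s", "a", "r", "s_next"])


class ExperiencePool:
    """Hypothetical fixed-capacity pool; the oldest records are dropped first."""

    def __init__(self, capacity=10000):  # capacity is an assumed value
        self.buffer = deque(maxlen=capacity)

    def store(self, s, a, r, s_next):
        self.buffer.append(Transition(s, a, r, s_next))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement (assumed; the source does not say how).
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


# Toy usage: node indices stand in for grid node states.
pool = ExperiencePool(capacity=3)
for step in range(5):
    pool.store(s=step, a=step + 1, r=0.1 * step, s_next=step + 1)
batch = pool.sample(2, rng=random.Random(0))
```

Because each stored tuple can be drawn many times, the same exploration data is reused across training steps, which is the reuse benefit the text attributes to the pool.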
Subsequently, a suitable neural network model is designed according to the scale of the static power grid and trained to fit the Q values in different states. The neural network model is a classical CNN comprising an input layer covering the states of all n grid nodes, a convolutional layer with 30 kernels of size 3 × 3, a convolutional layer with 60 kernels of size 3 × 3, a fully connected layer as the last hidden layer, and a fully connected output layer. The output is a vector containing the Q value of each legal action, representing the value of transferring the state to each of the different grid nodes. For any input state, the trained network can compute the action choices, with different probability distributions, that follow from the Q values, and a series of consecutive inputs finally yields the complete link with the highest value.
Finally, the optimal links are determined using the node Q value as the criterion, and key nodes of different importance in the global power grid are output according to how often each node appears in the optimal solution set. The trained neural network gives the values Q of the different actions available in each state; the optimal link between any two points is selected by the Q-value maximization principle; and the key nodes are ranked by the frequency with which they appear in the set of optimal links.
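As an illustration of the Q-value maximization principle, the sketch below follows the highest-valued legal action from a source node until a target node is reached. The function `best_link`, the nested-dictionary Q table, and the rule that already visited nodes are not legal actions are all assumptions for this example; in the method itself the Q values would come from the trained network.

```python
def best_link(q, source, target, max_steps=None):
    """Greedily follow the highest-Q action from source to target.

    q[s][a] is the Q value of moving from node s to neighbouring node a,
    for one fixed target. Returns the node sequence, or None on failure.
    """
    max_steps = max_steps or len(q)
    path, visited = [source], {source}
    state = source
    for _ in range(max_steps):
        if state == target:
            return path
        # Legal actions: neighbours not yet visited (assumed rule to avoid loops).
        candidates = {a: v for a, v in q[state].items() if a not in visited}
        if not candidates:
            return None
        state = max(candidates, key=candidates.get)
        path.append(state)
        visited.add(state)
    return path if state == target else None


# Toy Q table for target node 4 (values are made up for illustration).
q_toy = {
    1: {2: 0.5, 3: 0.9},
    2: {1: 0.1, 4: 1.0},
    3: {1: 0.1, 4: 1.0},
    4: {2: 0.2, 3: 0.2},
}
link = best_link(q_toy, source=1, target=4)
```

Here the walk leaves node 1 through node 3 because 0.9 exceeds 0.5, and then reaches node 4.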
Drawings
Fig. 1 is the unweighted abstract structure diagram of the IEEE 30-node system.
Fig. 2 is a diagram of the iterative perception operating mechanism of the double-Q neural network.
Fig. 3 is a schematic diagram of the CNN neural network.
Fig. 4 is a graph of the importance of the key nodes identified in the IEEE 30-node power system using the DDQN model.
Fig. 5 is the operational flow diagram of the DDQN model.
Fig. 6 shows the process by which the DDQN model solves for the key nodes of the power grid.
FIG. 7 is a flow chart of the present invention.
Detailed Description
The invention is further illustrated with reference to the following figures and examples, but the practice of the invention is not limited thereto.
Example:
The power grid structure of this embodiment is taken from the open-source IEEE 30-node system.
Step 1: Perform abstraction preprocessing on the static complex power grid. Based on graph theory, all large-component nodes, including generators and transformers, and their links are simplified into points and lines. Only the existence of the nodes and their physical connections are considered, so that the complex power grid is abstracted into a purely unweighted undirected graph. The resulting graph has 30 nodes and 41 connecting edges, as shown in Fig. 1.
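The abstraction in Step 1 can be sketched as building an unweighted undirected adjacency structure from a list of connections. The edge list below is a small made-up example, not the IEEE 30-node data, and the helper name `build_unweighted_graph` is hypothetical.

```python
from collections import defaultdict


def build_unweighted_graph(edges):
    """Return {node: set(neighbours)} for an unweighted undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops; only physical connections matter
        adj[u].add(v)
        adj[v].add(u)  # undirected: store both directions
    return dict(adj)


# Toy edge list; (2, 1) repeats (1, 2) and collapses into one edge.
toy_edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (2, 1)]
graph = build_unweighted_graph(toy_edges)

num_nodes = len(graph)
num_edges = sum(len(neigh) for neigh in graph.values()) // 2
degree = {node: len(neigh) for node, neigh in graph.items()}
```

Applied to the IEEE 30-node system, the same counts would be expected to give the 30 nodes and 41 edges stated above.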
Step 2: Apply (0,1) normalization to the grid topology attributes and the electrical-engineering physical characteristic data. The grid topology attributes are collected first. To normalize the node degree (in- and out-degree), the data for this attribute is scanned once from beginning to end to find its maximum $d_{\max}$ and minimum $d_{\min}$. The min-max normalization formula then gives the normalized degree $D$ of each node in the complex power grid from its degree $d$:

$$D = \frac{d - d_{\min}}{d_{\max} - d_{\min}}$$

To normalize the node strength, the data for this attribute is scanned once from beginning to end to find the maximum node strength $s_{\max}$ and the minimum node strength $s_{\min}$. The min-max normalization formula then gives the normalized node strength $S$ of each node from its node strength $s$:

$$S = \frac{s - s_{\min}}{s_{\max} - s_{\min}}$$

The electrical-engineering physical characteristics, namely the electrical distance or equivalent impedance, are also (0,1) normalized. The data for this attribute is scanned once from beginning to end to find the maximum equivalent impedance $z_{\max}$ and the minimum equivalent impedance $z_{\min}$. The min-max normalization formula then gives the normalized equivalent impedance $Z$ from the equivalent impedance $z$:

$$Z = \frac{z - z_{\min}}{z_{\max} - z_{\min}}$$
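The min-max normalization named in Step 2 can be sketched as a single pass for the extremes followed by a rescaling. The function name `min_max_normalize` and the handling of a constant attribute (all values mapped to 0.0) are assumptions; the degree values are a toy example.

```python
def min_max_normalize(values):
    """Rescale {key: value} into [0, 1] using (v - min) / (max - min)."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # Assumed convention: a constant attribute carries no information.
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}


# Toy node degrees keyed by node id.
toy_degree = {1: 2, 2: 2, 3: 2, 4: 3, 5: 1}
D = min_max_normalize(toy_degree)
```

The same helper applies unchanged to node strength and equivalent impedance, since all three attributes use the same formula.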
Step 3: Design the reward function using a Gaussian function and a combined subjective-objective weighting method. The smaller the equivalent impedance, the tighter the connection between two nodes, so the Gaussian function is used to apply inverse-distance weighting to the electrical distance represented by the equivalent impedance, giving a higher weight to node pairs with smaller equivalent impedance. The formula is as follows:
The proportion of each attribute in the reward function is then configured by the subjective-objective weighting method, yielding a deep reinforcement learning reward function suited to the selected static complex network. The formula is as follows:
where $Z_{ij}$ denotes the equivalent impedance after the Gaussian transformation, $d_j$ denotes the degree (in- or out-degree) of node j, and $s_j$ denotes the strength of node j.
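One possible shape for such a reward is sketched below. It is not the patent's formula: the Gaussian kernel exp(-z²/(2σ²)), the linear combination, the parameter `sigma`, and the weights (0.5, 0.25, 0.25) are all assumptions chosen only to show how a Gaussian inverse-distance weight on impedance could be combined with normalized degree and strength.

```python
import math


def gaussian_weight(z, sigma=1.0):
    """Assumed Gaussian kernel: smaller impedance z gives a weight nearer 1."""
    return math.exp(-(z ** 2) / (2.0 * sigma ** 2))


def reward(z_ij, d_j, s_j, weights=(0.5, 0.25, 0.25), sigma=1.0):
    """Hypothetical reward for moving from node i to node j.

    z_ij: normalized equivalent impedance between i and j
    d_j:  normalized degree of node j
    s_j:  normalized strength of node j
    weights: assumed attribute proportions (would come from the weighting method)
    """
    w_z, w_d, w_s = weights
    return w_z * gaussian_weight(z_ij, sigma) + w_d * d_j + w_s * s_j


close_pair = reward(z_ij=0.2, d_j=0.5, s_j=0.5)
far_pair = reward(z_ij=0.8, d_j=0.5, s_j=0.5)
```

With equal degree and strength, the pair with the smaller equivalent impedance receives the larger reward, which matches the inverse-distance behaviour described in Step 3.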
Step 4: Record the agent's exploration process in an experience pool, and feed the exploration data to the dual neural networks, which iteratively perceive the Q values. Recording exploration in the experience pool balances the exploration and exploitation mechanisms and increases data reuse. Each record comprises the node state s, the action a, the reward r and the next node state s'; multiple tuples of the form (s, a, r, s') are stored in the experience pool for training the neural network on the Q values of different node states. In the double-Q network, the two neural networks predict the Q values of state-action pairs at different time steps and are iterated after a set number of time steps. This dual perception makes the loss function fall faster, provides a more stable regression prediction scheme, and makes the result more accurate. The working mechanism of the double-Q network is shown in Fig. 2. The loss function is:
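$$L = \mathbb{E}\left[\left(r_{t+1} + \gamma\, Q_{\text{target}}\left(s_{t+1}, \arg\max_{a_{t+1}} Q_{\text{eval}}(s_{t+1}, a_{t+1}; \theta)\right) - Q_{\text{eval}}(s_t, a_t; \theta)\right)^2\right]$$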
where $r$ is the reward value, $a$ is the action, $\gamma$ is the discount factor, $Q_{\text{target}}$ is the value that the fixed target network in the double-Q network assigns to the action selected by the current training network, and $Q_{\text{eval}}(s, a; \theta)$ is the value predicted by the current training network.
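The loss above can be traced on a single transition with plain Python. In this sketch the two networks are replaced by hypothetical dictionaries of Q values, and `gamma=0.9` is an assumed discount factor; the point is only to show that the training network selects the action while the target network evaluates it.

```python
def ddqn_target(r, q_eval_next, q_target_next, gamma=0.9):
    """Double-DQN target: select with the eval network, evaluate with the target."""
    best_action = max(q_eval_next, key=q_eval_next.get)  # argmax over Q_eval
    return r + gamma * q_target_next[best_action]        # value from Q_target


def ddqn_loss(transitions, gamma=0.9):
    """Mean squared error between targets and current Q_eval(s_t, a_t)."""
    errors = [
        (ddqn_target(r, q_eval_next, q_target_next, gamma) - q_eval_sa) ** 2
        for r, q_eval_sa, q_eval_next, q_target_next in transitions
    ]
    return sum(errors) / len(errors)


# Toy values (made up): the two networks disagree about the next state.
q_eval_next = {"to_node_2": 1.0, "to_node_3": 2.0}
q_target_next = {"to_node_2": 5.0, "to_node_3": 0.5}

target = ddqn_target(1.0, q_eval_next, q_target_next)          # uses to_node_3
dqn_style = 1.0 + 0.9 * max(q_target_next.values())            # plain DQN target
loss = ddqn_loss([(1.0, 1.0, q_eval_next, q_target_next)])
```

In this toy case the double-Q target is smaller than the plain DQN target, which illustrates how decoupling selection from evaluation counters overestimation.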
Step 5: Design a suitable neural network model according to the scale of the static power grid, and train it to fit the Q values in different states. The neural network model is a classical CNN comprising an input layer covering the states of all 30 grid nodes, a convolutional layer with 30 kernels of size 3 × 3, a convolutional layer with 60 kernels of size 3 × 3, a fully connected layer as the last hidden layer, and a fully connected output layer. The output is a vector containing the Q value of each legal action, representing the value of transferring the state to each of the different grid nodes, as shown in Fig. 3. For any input state, the trained network can compute the action choices, with different probability distributions, that follow from the Q values, and a series of consecutive inputs finally yields the complete link with the highest value.
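To give a sense of the size of such a network, the sketch below traces tensor shapes and parameter counts through the layers listed in Step 5. Several details are assumptions not stated in the source: the state is taken to be a single-channel 30 × 30 matrix, the convolutions use stride 1 with no padding, and the hidden fully connected layer has 128 units.

```python
def conv2d_shape(h, w, kernel=3, stride=1, padding=0):
    """Standard output-size formula for a 2-D convolution."""
    out_h = (h + 2 * padding - kernel) // stride + 1
    out_w = (w + 2 * padding - kernel) // stride + 1
    return out_h, out_w


n_nodes = 30                  # from the embodiment
in_channels = 1               # assumed: one n x n state matrix
hidden_units = 128            # assumed size of the last hidden layer

h1, w1 = conv2d_shape(n_nodes, n_nodes)   # after 30 kernels of 3 x 3
h2, w2 = conv2d_shape(h1, w1)             # after 60 kernels of 3 x 3
flat_features = 60 * h2 * w2              # input size of the hidden layer

# Parameter counts: weights plus one bias per output channel or unit.
params = {
    "conv1": 30 * (3 * 3 * in_channels) + 30,
    "conv2": 60 * (3 * 3 * 30) + 60,
    "fc_hidden": flat_features * hidden_units + hidden_units,
    "fc_out": hidden_units * n_nodes + n_nodes,  # one Q value per node
}
total_params = sum(params.values())
```

Under these assumptions most parameters sit in the first fully connected layer, so a larger grid would mainly grow that layer unless pooling or striding were added.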
Step 6: Determine the optimal links using the node Q value as the criterion, and output key nodes of different importance in the global power grid according to how often each node appears in the optimal solution set. The trained neural network gives the values Q of the different actions available in each state; the optimal link between any two points is selected by the Q-value maximization principle; and the key nodes are ranked by the frequency with which they appear in the set of optimal links. The result is shown in Fig. 4. The calculation formula is as follows:
where $l_{jk}(i)$ denotes the number of optimal links that contain node i, and $l_{jk}$ denotes the number of optimal links between all nodes.
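The frequency-based ranking can be sketched as follows. The score used here, the number of optimal links containing a node divided by the total number of optimal links, is one reading of the definitions of $l_{jk}(i)$ and $l_{jk}$ and is an assumption, as is counting the endpoints of each link; the links themselves are made-up toy data.

```python
from collections import Counter


def node_importance(optimal_links):
    """Rank nodes by the share of optimal links in which they appear."""
    counts = Counter()
    for link in optimal_links:
        counts.update(set(link))  # count each node once per link
    total = len(optimal_links)
    scores = {node: c / total for node, c in counts.items()}
    # Sort by descending score, then by node id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy set of optimal links between node pairs (node sequences).
toy_links = [[1, 3, 4], [1, 3], [2, 4], [3, 4, 5], [1, 3, 4, 5]]
ranking = node_importance(toy_links)
```

Nodes 3 and 4 each appear in four of the five toy links and therefore rank highest, which is the kind of ordering plotted in Fig. 4 for the real system.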
The above embodiment is a preferred embodiment of the invention, but embodiments of the invention are not limited to it. For example, the number of neuron layers in the neural network may be increased or decreased as needed, and different deep learning models may be used. The model has clear advantages in extracting features from complex data and can be applied to complex power grids with a larger number of states.
Claims (7)
1. A complex power grid key node identification method based on deep reinforcement learning, characterized in that a deep double-Q network from deep reinforcement learning is adopted, an interactive learning mode is established from a Markov perspective, and the Q value of each grid node state is evaluated from the experience data that the agent collects by interacting with the complex power grid environment; then, using the spatio-temporal distribution characteristics and latent characteristics of the inter-node data captured by the neural network, a state-action CNN convolution model suited to the static grid environment is trained on the contextual action information and reward information, the optimal link between any two points in the grid is found, and the frequency with which each node appears on the optimal links is counted to complete the ranking of the global key nodes of the grid, the method comprising the following steps:
step 1: performing abstract preprocessing on the static complex power grid;
step 2: normalizing the power grid topology attributes and the electrical engineering physical characteristic data;
step 3: designing a reward function by using a Gaussian function and a combined subjective and objective weighting method;
step 4: recording the exploration process of the agent in an experience pool, and feeding the exploration data into a dual neural network that iteratively perceives the Q value;
step 5: designing a suitable neural network model according to the scale of the static power grid, and training it to fit the Q values of different states;
step 6: measuring the optimal link with the node Q value as the criterion, and outputting the key nodes of different importance in the global power grid according to the frequency with which each node appears in the optimal solution set.
2. The complex power grid key node identification method based on deep reinforcement learning according to claim 1, wherein the abstract preprocessing of the complex power grid in step 1 simplifies, on the basis of graph theory, all large component nodes, including generators and transformers, and the links between them into points connected by lines, and considers only the existence and physical connection characteristics of the nodes.
3. The complex power grid key node identification method based on deep reinforcement learning according to claim 1, wherein the power grid topology attributes in step 2 include node in-degree and out-degree and node strength; the physical characteristics of electrical engineering include electrical distance, i.e., equivalent impedance; and the above data are locally normalized to the range (0, 1).
4. The complex power grid key node identification method based on deep reinforcement learning according to claim 1, wherein designing the reward function in step 3 comprises applying inverse distance weighting to the equivalent impedance by using a Gaussian function, and setting the proportion of each attribute in the reward function by a combined subjective and objective weighting method, so as to obtain a deep reinforcement learning reward function suited to the selected static complex network.
5. The complex power grid key node identification method based on deep reinforcement learning according to claim 1, wherein recording the exploration process of the agent in an experience pool in step 4 balances the exploration and exploitation mechanisms and increases the reuse rate of data; the recorded data include the node state s, the action a, the reward r and the next node state s', which are packaged into tuples (s, a, r, s') and stored in the experience pool for the neural network in step 5 to train the Q values of different node states; the dual neural network performs perceptual prediction of the Q values of state-action pairs at different time steps and is iterated after a certain number of time steps, so that the dual perception makes the loss function decrease more rapidly, provides a more stable regression prediction scheme, and yields more accurate results.
6. The complex power grid key node identification method based on deep reinforcement learning according to claim 1, wherein the neural network model in step 5 adopts a classical CNN model comprising an input layer and two convolutional layers, the last hidden layer being a fully connected layer and the final output layer being a fully connected layer that outputs a vector containing the Q value of each legal action, representing the value of transitioning to different power grid nodes.
7. The complex power grid key node identification method based on deep reinforcement learning according to claim 1, wherein using the node Q value as the criterion for measuring the optimal link in step 6 comprises obtaining, from the trained neural network, the values Q of the different actions selectable in different states, screening the optimal link between any two points according to the principle of Q-value maximization, obtaining key nodes of different importance according to the frequency with which nodes appear in the set of optimal links, and ranking the key nodes.
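The normalization, reward design and key-node ranking recited in the claims above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: min-max scaling to [0, 1] stands in for the local normalization, the Gaussian width `sigma` and the attribute weights are hypothetical values, a plain matrix `q[s, a]` replaces the trained CNN, and the optimal link is approximated by a greedy walk that always takes the legal action with the largest Q value. None of these specifics come from the source.

```python
from collections import Counter
from itertools import permutations

import numpy as np


def min_max(x):
    """Scale a 1-D attribute (degree, strength, impedance) to [0, 1]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


def gaussian_weight(impedance, sigma=1.0):
    """Gaussian inverse-distance weight: a smaller electrical distance scores closer to 1."""
    return np.exp(-np.asarray(impedance, dtype=float) ** 2 / (2.0 * sigma ** 2))


def reward(degree, strength, impedance, weights=(0.3, 0.3, 0.4), sigma=1.0):
    """Weighted sum of normalized attributes; the weights are assumed, not from the source."""
    w_deg, w_str, w_imp = weights
    return w_deg * degree + w_str * strength + w_imp * gaussian_weight(impedance, sigma)


def greedy_link(q, adjacency, src, dst):
    """Walk from src towards dst by Q-value maximization; None if the walk dead-ends."""
    path, visited = [src], {src}
    while path[-1] != dst:
        legal = [n for n in adjacency[path[-1]] if n not in visited]
        if not legal:
            return None
        nxt = max(legal, key=lambda n: q[path[-1], n])
        path.append(nxt)
        visited.add(nxt)
    return path


def rank_key_nodes(q, adjacency):
    """Count how often each node appears on the links found for all ordered node pairs."""
    counts = Counter()
    for src, dst in permutations(adjacency, 2):
        path = greedy_link(q, adjacency, src, dst)
        if path:
            counts.update(path)
    return counts.most_common()  # most frequent (most critical) node first
```

Here `adjacency` is assumed to be a dictionary mapping each node index to its neighbours, for example `{0: [1], 1: [0, 2], 2: [1]}` for a three-node line. A greedy walk can dead-end where a properly trained network would not, so pairs without a completed link are simply skipped in the count.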
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210484829.8A CN114861792A (en) | 2022-05-06 | 2022-05-06 | Complex power grid key node identification method based on deep reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114861792A (en) | 2022-08-05 |
Family
ID=82636223
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210484829.8A Pending CN114861792A (en) | 2022-05-06 | 2022-05-06 | Complex power grid key node identification method based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114861792A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115545580A (en) * | 2022-12-01 | 2022-12-30 | 四川大学华西医院 | Medical training process standardization verification method and system |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109669087A (en) | A kind of method for diagnosing fault of power transformer based on Multi-source Information Fusion | |
CN108062572A (en) | A kind of Fault Diagnosis Method of Hydro-generating Unit and system based on DdAE deep learning models | |
CN114580706A (en) | Power financial service wind control method and system based on GRU-LSTM neural network | |
Wang et al. | Power system network topology identification based on knowledge graph and graph neural network | |
CN106874963B (en) | A kind of Fault Diagnosis Method for Distribution Networks and system based on big data technology | |
CN106656357B (en) | Power frequency communication channel state evaluation system and method | |
CN112039687A (en) | Small sample feature-oriented fault diagnosis method based on improved generation countermeasure network | |
CN111783879B (en) | Hierarchical compressed graph matching method and system based on orthogonal attention mechanism | |
CN113780002A (en) | Knowledge reasoning method and device based on graph representation learning and deep reinforcement learning | |
Guo et al. | AI-oriented smart power system transient stability: the rationality, applications, challenges and future opportunities | |
CN104200096A (en) | Lightning arrester grading ring optimization method based on differential evolutionary algorithm and BP neural network | |
Xiao et al. | Network security situation prediction method based on MEA-BP | |
Khomami et al. | Utilizing cellular learning automata for finding communities in weighted networks | |
CN114861792A (en) | Complex power grid key node identification method based on deep reinforcement learning | |
CN115310589A (en) | Group identification method and system based on depth map self-supervision learning | |
Chen et al. | Harsanyinet: Computing accurate shapley values in a single forward propagation | |
CN113379063B (en) | Whole-flow task time sequence intelligent decision-making method based on online reinforcement learning model | |
Beldi et al. | A new brainstorming based algorithm for the community detection problem | |
He et al. | Representation learning of knowledge graph for wireless communication networks | |
Feng et al. | Hybrid artificial intelligence approach to urban planning | |
Shan et al. | An integrated knowledge-based system for urban planning decision support | |
CN112465253B (en) | Method and device for predicting links in urban road network | |
Wang et al. | A genetic-algorithm-based two-stage learning scheme for neural networks | |
CN114706977A (en) | Rumor detection method and system based on dynamic multi-hop graph attention network | |
Fu et al. | Fault Diagnosis of an Excitation System Using a Fuzzy Neural Network Optimized by a Novel Adaptive Grey Wolf Optimizer. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||