CN111507374A - Power grid mass data anomaly detection method based on random matrix theory - Google Patents
Power grid mass data anomaly detection method based on random matrix theory
- Publication number
- CN111507374A (application number CN202010090430.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- matrix
- power grid
- abnormal
- window
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2433—Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E40/00—Technologies for an efficient electrical power generation, transmission or distribution
- Y02E40/70—Smart grids as climate change mitigation technology in the energy generation sector
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The invention discloses a method for detecting anomalies in massive power grid data based on random matrix theory. The method detects the large amount of incomplete and inconsistent dirty data caused by equipment faults, data stream acquisition errors, external noise disturbance and other factors. Three-phase voltage and current serve as the analysis indexes and random matrix theory (RMT) as the basis. After the source data are processed, the linear eigenvalue statistic (LES) is used as the statistical index and the characteristics of the elements in the data matrix are analyzed continuously. Because the LES reflects statistical regularities of the matrix, the abnormal data content of the space-time source data matrix can be characterized by how strongly the LES fluctuates over a period of time.
Description
Technical Field
The invention relates to anomaly detection for power distribution network operation data, in particular to a method for detecting anomalies in massive power grid data based on random matrix theory, and belongs to the technical field of power grid management and control.
Background
As China's power distribution network keeps expanding, provincial, regional and national interconnection has become an inevitable trend. At the same time the number of grid lines is growing, the network structure is increasingly complex, and the volume of real-time operation data acquired by equipment is huge. Because the smart grid is affected by equipment faults, data stream acquisition errors, external noise disturbance and the like, the operation data as initially acquired contain a large amount of incomplete and inconsistent dirty data, so data mining either cannot be performed directly or gives poor results. To improve the quality of data mining, the data must first be screened. Meanwhile, the theory of electric power big data is maturing, and online monitoring systems for the grid operation state can collect grid operation parameters promptly and centrally. These data exhibit the five characteristics of big data, namely large volume (Volume), high processing speed (Velocity), many data types (Variety), high value (Value) and high accuracy (Veracity), and they contain a large amount of valuable information about the operation of the distribution network.
Theoretical study of abnormal data mining formally began in 1887 with a paper by the British statistician Francis Ysidro Edgeworth. As research deepened, many anomaly detection techniques appeared and were widely applied in engineering practice. Compared with data mining in general, however, the detection of abnormal data developed relatively late as big data theory evolved. The earliest methods applied to anomaly diagnosis were statistical: they rely on the data stream following some standard distribution, i.e. normal behavior is defined by a probability distribution. Yamanishi et al., for example, describe normal behavior with a Gaussian mixture model and identify anomalies by computing how far the target data deviate from the model's standard state. Such methods are significantly limited, because it is not known in advance which standard distribution the data set under study follows. Towel et al. proposed an abnormal data mining algorithm based on neural networks, but the poor generalization ability of neural networks and the need for expert experience when constructing the network hinder the application of the model. Kernel-based classification is a more recently developed approach to abnormal data mining; its main idea is to map the target source data into a high-dimensional feature space through a functional relation and to build a classification model from a separating hyperplane in that space in order to distinguish abnormal data.
Research on abnormal data mining started relatively late in China, but many important results have been obtained in recent years. Automatic monitoring of abnormal states in the massive data flow of the power grid is mainly realized by setting thresholds or by methods based on wavelet analysis, artificial neural networks, support vector machines and similar theories.
In summary, many abnormal data mining algorithms have been proposed in China and abroad, but they suffer from low detection speed, low accuracy and complicated model construction. To address these problems, a method for detecting anomalies in massive power grid data based on random matrix theory is proposed; it has engineering value and research interest and offers guidance to management and decision-making departments.
Disclosure of Invention
The invention mainly addresses the technical problem of abnormal data in the massive voltage and current operation data of the power grid. It provides a method for detecting anomalies in massive power grid data based on random matrix theory that identifies anomalies in the massive voltage and current data accurately and quickly, avoids problems such as complex or inaccurate model design, and offers high real-time performance and accuracy.
To solve the above technical problem, the invention adopts the following technical scheme:
Step 1: Determine the area to be examined and obtain, in real time through the Supervisory Control and Data Acquisition (SCADA) system, the massive power grid operation data under the feeders and branch lines of the area. From the operation data, select the three-phase voltages and three-phase currents as analysis samples; the samples must share a consistent time scale. Here the three-phase voltages and three-phase currents under a 380 V feeder in a certain area and its branch lines are selected as the analysis samples and form the space-time source data matrix D, whose form is shown in Table 1.
TABLE 1 spatio-temporal source data matrix D form
Written in matrix form, $D = (D_{ij})$, where $D_{ij}$ ($i = 1, 2, \dots, p$; $j = 1, 2, \dots, n$) is the three-phase voltage or three-phase current of the i-th distribution transformer at time j.
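As an illustration of step 1, the following sketch stacks per-transformer measurements into a space-time matrix with 6 rows per transformer. The function name `build_source_matrix`, the channel order (three voltages, then three currents) and the synthetic readings are assumptions made for the example; the patent does not specify a data layout beyond the 6p × n shape.

```python
import numpy as np

def build_source_matrix(readings):
    """Stack per-transformer measurements into a (6p x n) space-time matrix.

    readings: list of p arrays of shape (6, n). The row order inside each
    block (3 phase voltages, then 3 phase currents) is a hypothetical layout.
    """
    blocks = [np.asarray(r, dtype=float) for r in readings]
    n = blocks[0].shape[1]
    for b in blocks:
        # All samples must share the same time scale, as step 1 requires.
        if b.shape != (6, n):
            raise ValueError("each transformer needs 6 channels on a common time scale")
    return np.vstack(blocks)

# Synthetic stand-in data: p = 10 transformers, n = 672 samples
# (one week at 96 samples per day).
rng = np.random.default_rng(0)
readings = [rng.normal(size=(6, 672)) for _ in range(10)]
D = build_source_matrix(readings)
print(D.shape)  # (60, 672)
```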
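Step 2: From the matrix obtained in step 1, D is a real matrix of size $6p \times n$. Let $D_w \in \mathbb{R}^{p \times n}$ denote a window data matrix taken from D. Standardize $D_w$ row by row according to $\tilde{D}_{ij} = (D_{ij} - \bar{D}_i)/\sigma_i$, where $\bar{D}_i$ and $\sigma_i$ are the mean and standard deviation of the elements of row i, and denote the processed matrix by $\tilde{D}_w$. To better satisfy the conditions of RMT analysis, and because of weak correlation between matrix rows and analysis errors, a moderate amount of Gaussian white noise $W \in \mathbb{R}^{p \times n}$ is usually added to the matrix; the signal-to-noise ratio is usually taken to be large.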
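Step 2 can be sketched as follows. The row-wise z-score follows the standardization described in the text. The noise scaling is an assumption: the source does not give the formula, so the sketch uses one common convention in which the ratio of mean signal power to mean noise power equals the chosen SNR. The function names and the synthetic voltage-like data are hypothetical.

```python
import numpy as np

def standardize_rows(Dw):
    """Row-wise z-score: (D_ij - mean_i) / sigma_i."""
    Dw = np.asarray(Dw, dtype=float)
    mean = Dw.mean(axis=1, keepdims=True)
    std = Dw.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # guard for constant rows (an added assumption)
    return (Dw - mean) / std

def add_white_noise(X, snr, rng):
    """Add Gaussian white noise W so that signal power / noise power == snr.

    This power-ratio convention is assumed, not taken from the patent.
    """
    W = rng.standard_normal(X.shape)
    scale = np.sqrt(np.mean(X ** 2) / (snr * np.mean(W ** 2)))
    return X + scale * W

rng = np.random.default_rng(0)
Dw = rng.normal(loc=220.0, scale=5.0, size=(60, 192))  # synthetic window
Xs = standardize_rows(Dw)
Xn = add_white_noise(Xs, snr=500.0, rng=rng)
print(Xs.mean(axis=1)[:3], Xs.std(axis=1)[:3])
```

A large SNR keeps the added noise small relative to the standardized data, which matches the remark that the signal-to-noise ratio is usually taken to be large.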
Step 3: Set the parameters of the moving window method: the sampling time $t_j$, a data sampling interval of 15 min (96 points per day), a moving step of 1, and the window size. To meet the application conditions of RMT, the data matrix formed from the window data taken from D must be longer than it is wide, i.e. it must have more columns than rows.
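A minimal sketch of the moving window of step 3 is given below. It assumes that the window ending at sampling time t_j covers the most recent `width` columns of D; the patent does not state how the window is aligned to t_j, and the name `moving_windows` is hypothetical.

```python
import numpy as np

def moving_windows(D, width, step=1):
    """Yield (t_j, window) pairs, where window = D[:, t_j - width : t_j].

    The alignment of the window to t_j is an assumption of this sketch.
    """
    rows, cols = D.shape
    if width <= rows:
        # RMT condition from the text: more columns than rows.
        raise ValueError("window must have more columns than rows")
    for end in range(width, cols + 1, step):
        yield end, D[:, end - width:end]

D = np.arange(60 * 672, dtype=float).reshape(60, 672)
windows = list(moving_windows(D, width=192, step=1))
print(len(windows), windows[0][1].shape)  # 481 (60, 192)
```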
Step 4: Compute the covariance matrix S of the window data and calculate its eigenvalues. For convenience of analysis, the eigenvalues are normalized to lie in (0, 1), where λ(i) denotes the i-th actual computed eigenvalue and p is the dimension of the covariance matrix.
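The following sketch illustrates step 4 under two stated assumptions, since the formulas are not legible in the source: the covariance matrix is taken as the sample covariance S = X Xᵀ / n of the standardized window X, and the normalization divides each eigenvalue by p. For row-standardized data the trace of S equals p, so this choice maps the eigenvalues into (0, 1) and makes them sum to 1. The function name is hypothetical.

```python
import numpy as np

def normalized_eigenvalues(X):
    """Eigenvalues of S = X X^T / n, divided by p (assumed normalization)."""
    p, n = X.shape
    S = X @ X.T / n
    lam = np.linalg.eigvalsh(S)    # S is symmetric: real, ascending eigenvalues
    lam = np.clip(lam, 0.0, None)  # remove tiny negative round-off values
    return lam / p

rng = np.random.default_rng(0)
raw = rng.standard_normal((60, 192))
X = (raw - raw.mean(axis=1, keepdims=True)) / raw.std(axis=1, keepdims=True)
lam = normalized_eigenvalues(X)
print(lam.shape, lam.sum())
```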
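Step 5: According to the Marchenko-Pastur (M-P) law, when $p, n \to \infty$ with $c = p/n \in (0, 1)$, the empirical spectral distribution (ESD) of the covariance matrix S converges to the probability density function (PDF) $f(\lambda) = \frac{1}{2\pi c \sigma^2 \lambda}\sqrt{(b-\lambda)(\lambda-a)}$ for $a \le \lambda \le b$ and $f(\lambda) = 0$ otherwise, where $a = \sigma^2(1-\sqrt{c})^2$, $b = \sigma^2(1+\sqrt{c})^2$ and $\sigma^2$ is the variance of the matrix elements.
The analysis shows that when abnormal data are present, the ESD of S no longer follows the M-P law.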
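To make step 5 concrete, the sketch below evaluates the standard M-P density, checks numerically that it integrates to 1, and compares the largest eigenvalue of a contaminated window with the upper edge b. The contamination (a spike of +20 on 30 rows at one time instant) is a synthetic example chosen for illustration, not data from the patent.

```python
import numpy as np

def mp_support(c, sigma2=1.0):
    """Lower and upper edges a, b of the M-P support."""
    return sigma2 * (1 - np.sqrt(c)) ** 2, sigma2 * (1 + np.sqrt(c)) ** 2

def mp_pdf(lam, c, sigma2=1.0):
    """M-P density for c = p/n in (0, 1); zero outside [a, b]."""
    a, b = mp_support(c, sigma2)
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    pdf = np.zeros_like(lam)
    inside = (lam > a) & (lam < b)
    x = lam[inside]
    pdf[inside] = np.sqrt((b - x) * (x - a)) / (2 * np.pi * c * sigma2 * x)
    return pdf

def cov_eigenvalues(raw):
    """Eigenvalues of the sample covariance of the row-standardized matrix."""
    X = (raw - raw.mean(axis=1, keepdims=True)) / raw.std(axis=1, keepdims=True)
    return np.linalg.eigvalsh(X @ X.T / X.shape[1])

p, n = 60, 192
c = p / n
a, b = mp_support(c)

# Trapezoidal check that the density integrates to 1 over [a, b].
grid = np.linspace(a, b, 20001)
vals = mp_pdf(grid, c)
integral = float(np.sum((vals[1:] + vals[:-1]) * np.diff(grid)) / 2)

rng = np.random.default_rng(0)
clean = rng.standard_normal((p, n))
bad = clean.copy()
bad[:30, 100] += 20.0  # synthetic abnormal readings at one time instant

lam_max_clean = cov_eigenvalues(clean).max()
lam_max_bad = cov_eigenvalues(bad).max()
print(f"b = {b:.3f}, clean max = {lam_max_clean:.3f}, contaminated max = {lam_max_bad:.3f}")
```

In this example the contaminated window produces an eigenvalue far beyond the upper edge b, which is one way the departure from the M-P law shows up.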
Step 6: Calculate the linear eigenvalue statistic (LES). From the eigenvalues obtained in step 4 and the conclusion of step 5, the ESD of data containing anomalies differs from the ESD under normal conditions, so it reflects to some extent the behavior (normal or abnormal) of the data in the matrix. The LES is defined as $\mathrm{LES} = \sum_{i=1}^{n} \varphi(\lambda_i)$, where $\lambda_i$ ($i = 1, 2, \dots, n$) are the n eigenvalues of the covariance matrix S and $\varphi$ is a test function, typically a Chebyshev polynomial, the Shannon entropy, the Wasserstein distance, etc. The invention selects the Shannon entropy.
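A short sketch of the LES with a Shannon-entropy test function follows. The form φ(λ) = −λ ln λ (natural logarithm) is an assumption, since the formula is not legible in the source. With eigenvalues normalized to sum to 1, this LES is largest when the spectrum is flat and drops when a few eigenvalues dominate.

```python
import numpy as np

def shannon_entropy_les(lam):
    """LES = sum_i phi(lam_i) with phi(x) = -x * ln(x) (assumed form)."""
    lam = np.asarray(lam, dtype=float)
    lam = lam[lam > 0]  # the limit of x * ln(x) at 0 is taken as 0
    return float(-np.sum(lam * np.log(lam)))

flat = np.full(60, 1.0 / 60)                # evenly spread spectrum
peaked = np.array([0.4] + [0.6 / 59] * 59)  # one dominant eigenvalue

les_flat = shannon_entropy_les(flat)
les_peaked = shannon_entropy_les(peaked)
print(les_flat, les_peaked)
```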
Step 7: Taking a 380 V feeder L1 as the analysis object, calculate the LES value at each moment over a period of time in sequence, draw the LES-t curve, analyze and explain from the curve the characteristics that indicate abnormal data, and verify the result against an actual engineering example.
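Steps 2 to 7 can be chained into one LES-t computation, sketched below on synthetic data. The sketch carries several assumptions: the noise-addition step is omitted, the eigenvalues are normalized by p, the test function is φ(λ) = −λ ln λ, and the window ending at t covers columns t − width to t − 1. The injected spike at column 400 is an invented anomaly used only to show how the curve reacts.

```python
import numpy as np

def les_series(D, width, step=1):
    """Return sampling times and the LES value of each moving window."""
    p, cols = D.shape
    times, values = [], []
    for end in range(width, cols + 1, step):
        X = D[:, end - width:end]
        X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
        lam = np.linalg.eigvalsh(X @ X.T / width) / p  # assumed normalization
        lam = lam[lam > 1e-12]                          # drop round-off values
        times.append(end)
        values.append(float(-np.sum(lam * np.log(lam))))  # assumed entropy form
    return np.array(times), np.array(values)

rng = np.random.default_rng(1)
D = rng.standard_normal((60, 672))  # synthetic 60 x 672 source matrix
D[:30, 400] += 20.0                 # invented anomaly at column 400

t, les = les_series(D, width=192)

# Windows that contain column 400 end at t in [401, 592].
contains = (t >= 401) & (t <= 592)
print(les[contains].mean(), les[~contains].mean())
```

Plotting `les` against `t` gives the LES-t curve; in this synthetic case the curve steps down while the anomalous column is inside the window and recovers once it leaves.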
Drawings
FIG. 1 is a flowchart of the method for identifying abnormal power grid operation data according to the invention;
FIG. 2 shows the proportions of abnormal data contained in the case used in the invention;
FIG. 3 shows the LES-t simulation curves obtained for different proportions of abnormal data;
FIG. 4 shows the LES-t curve recovered after part of the abnormal data has been corrected.
Detailed Description
The preferred embodiments of the invention are described in detail below with reference to the accompanying drawings, so that those skilled in the art can more easily understand its advantages and features and so that the scope of protection of the invention is defined more clearly.
The invention provides a method for detecting anomalies in massive power grid data based on random matrix theory. For simulation verification, it uses the massive real-time voltage and current operation data recorded by all distribution transformers under a 380 V feeder and its branch lines in a certain area. The specific technical implementation is as follows:
1. A 380 V feeder in the area is selected. In this example there are 10 distribution transformers under the selected feeder, and one week of operation data is used. The equipment samples 96 data points per day, so the space-time source data matrix D generated with the three-phase voltages and three-phase currents as analysis indexes is a 60 × 672 matrix.
2. Following steps 2 and 3 of the technical scheme, the window parameters are set first: the starting point is defined as time zero, the data sampling interval is 15 min (96 points per day), the moving step is 1, and the window size is 60 × 192. That is, window data are taken from D to form a 60 × 192 data matrix.
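The dimensions quoted in this embodiment can be checked with a few lines of arithmetic. The variable names are illustrative; the count of window positions assumes the window slides over the full week with step 1.

```python
transformers = 10      # distribution transformers under the feeder
channels = 6           # three phase voltages + three phase currents
samples_per_day = 96   # one sample every 15 min
days = 7               # one week of operation data

rows = transformers * channels  # rows of D
cols = samples_per_day * days   # columns of D

window_cols = 192               # window width, i.e. two days of samples
step = 1
n_windows = (cols - window_cols) // step + 1  # window positions (assumed full sweep)
c = rows / window_cols                        # ratio c = p/n used by the M-P law

print(rows, cols, n_windows, c)  # 60 672 481 0.3125
```

The ratio c = 0.3125 lies in (0, 1), so the 60 × 192 window satisfies the condition required in step 5.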
3. Following step 4 of the technical scheme, the covariance matrix $S(t_j)$ is calculated for each sampling time; $S(t_j)$ is a 60 × 60 matrix. The eigenvalues of the covariance matrix at a certain time and the results after normalization are shown in Table 2.
TABLE 2
4. Shannon entropy is selected as the test function and the LES is computed according to its calculation formula. While the distribution network operates, the operation data collected by the supervisory control and data acquisition system keep growing with the running time, the amount of abnormal data they contain grows as well, and the degree of order of the data distribution decreases. Detecting abnormal data in the massive voltage and current data therefore becomes increasingly difficult. The proportion of abnormal data contained is shown in FIG. 2.
The LES-t simulation curves obtained for the three cases of contained abnormal data are shown in FIG. 3.
When no abnormal data are contained, the correlation between the operation data is essentially unchanged; the LES reflects this property to a certain extent, and the LES curve in the figure varies gently. The higher the content of abnormal data, the larger the fluctuation amplitude of the LES-t curve. In panel (b), the curve changes abruptly at the sampling time $t_i = 60$. From the parameters set for the moving window method, it can be roughly determined that abnormal data exist in column 60 of the space-time source data matrix. After the operation parameters near column 60 are corrected, the LES-t curve is obtained again, as shown in FIG. 4.
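The text reads the abrupt change off the plotted curve. One way this could be automated is sketched below. The detection rule (a step larger than the median step plus k robust standard deviations) and the name `first_jump` are hypothetical and not part of the patent; the LES-t curve is synthetic and built to jump at t = 60.

```python
import numpy as np

def first_jump(times, les, k=5.0):
    """Return the first time at which the step |LES[j] - LES[j-1]| is unusually large.

    Hypothetical rule: threshold = median step + k * 1.4826 * MAD of the steps.
    """
    diffs = np.abs(np.diff(les))
    med = np.median(diffs)
    mad = np.median(np.abs(diffs - med))
    idx = np.flatnonzero(diffs > med + k * 1.4826 * mad)
    return None if idx.size == 0 else int(times[idx[0] + 1])

# Synthetic LES-t curve: a gentle wiggle with an abrupt drop at t = 60.
times = np.arange(1, 121)
les = 4.0 + 0.001 * np.sin(times / 3.0)
les[times >= 60] -= 0.5

jump_time = first_jump(times, les)
print(jump_time)  # 60
```

How a detected time maps back to a column of the source matrix depends on the window parameters, so the returned value should be read together with the chosen window width and step.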
FIG. 4 shows that the fluctuation amplitude of the LES-t curve is now significantly smaller than before the repair, which also confirms that the fluctuation amplitude of the LES is related to the proportion of abnormal data contained.
6. The above is only one embodiment of the invention and does not limit its scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings, or any direct or indirect application in other related technical fields, likewise falls within the scope of patent protection of the invention.
Claims (4)
1. A method for detecting anomalies in massive power grid data based on random matrix theory, characterized in that abnormal data in the massive voltage and current operation data of the power grid are detected by a completely data-driven method comprising the following steps:
Step 1: Select the massive voltage and current operation data, over a period of time t, of all distribution transformers under a distribution network feeder and its branch lines, take the three-phase voltages and three-phase currents as analysis indexes, and form the space-time source data matrix D.
Step 2: Set the parameters of the moving window method: sampling time $t_j$, step size, window size and signal-to-noise ratio (SNR). Take window data from D to form the data matrix $D_w(t_j)$.
Step 3: Standardize the elements of the data matrix $D_w(t_j)$ according to $\tilde{D}_{ij} = (D_{ij} - \bar{D}_i)/\sigma_i$ to obtain $\tilde{D}_w(t_j)$, where $D_{ij}$ is the element in row i and column j of the matrix, $\bar{D}_i$ and $\sigma_i$ are the mean and standard deviation of the elements of row i, and $\tilde{D}_{ij}$ is the standardized value of the corresponding element.
Step 5: Calculate the eigenvalues of $S(t_j)$. In preparation for computing the LES in the next step, the eigenvalues are normalized to lie in (0, 1).
Step 7: Draw the LES-t curves through simulation, and analyze and compare the characteristics of the curves with and without abnormal data.
2. The method for detecting anomalies in massive power grid data based on random matrix theory as claimed in claim 1, characterized in that the space-time source data matrix D in step 1 is obtained from the massive voltage and current operation data of the power grid; in step 2 the time series data are analyzed by the moving window method, the window dimension is chosen to match that of the data matrix so that the RMT application conditions are met, and the linear statistical index of the moving window data is calculated in sequence according to the sampling time $t_j$ to indicate the behavior (whether anomalies exist) of the massive voltage and current operation data of the power grid.
4. The method for detecting anomalies in massive power grid data based on random matrix theory as claimed in claim 1, characterized in that the detection of abnormal data by the moving window method in step 5 is a process of moving the window according to the set parameters, calculating in sequence the LES of the data matrix obtained from each sampling window, drawing the LES-t curve, and comparing the fluctuation range of the LES over a period of time to judge whether abnormal data exist; and in step 7 the LES-t curve of an engineering data example is obtained through simulation, and the proportion of abnormal data contained in the space-time source data matrix is judged from the smoothness of the curve.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010090430.2A CN111507374A (en) | 2020-02-13 | 2020-02-13 | Power grid mass data anomaly detection method based on random matrix theory |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111507374A (en) | 2020-08-07 |
Family
ID=71863925
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010090430.2A Pending CN111507374A (en) | 2020-02-13 | 2020-02-13 | Power grid mass data anomaly detection method based on random matrix theory |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111507374A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112348644A (en) * | 2020-11-16 | 2021-02-09 | 上海品见智能科技有限公司 | Abnormal logistics order detection method by establishing monotonous positive correlation filter screen |
CN112348644B (en) * | 2020-11-16 | 2024-04-02 | 上海品见智能科技有限公司 | Abnormal logistics order detection method by establishing monotonic positive correlation filter screen |
WO2023241326A1 (en) * | 2022-06-14 | 2023-12-21 | 无锡隆玛科技股份有限公司 | Power grid anomaly detection method based on maximum eigenvalue rate of sample covariance matrix |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105425779B (en) | ICA-PCA multi-state method for diagnosing faults based on local neighborhood standardization and Bayesian inference | |
CN110336534B (en) | Fault diagnosis method based on photovoltaic array electrical parameter time series feature extraction | |
CN105700518B (en) | A kind of industrial process method for diagnosing faults | |
CN109193650B (en) | Power grid weak point evaluation method based on high-dimensional random matrix theory | |
CN109460574A (en) | A kind of prediction technique of aero-engine remaining life | |
CN109816031B (en) | Transformer state evaluation clustering analysis method based on data imbalance measurement | |
CN110458230A (en) | A kind of distribution transforming based on the fusion of more criterions is with adopting data exception discriminating method | |
CN117290802B (en) | Host power supply operation monitoring method based on data processing | |
CN115409131B (en) | Production line abnormity detection method based on SPC process control system | |
CN109389325B (en) | Method for evaluating state of electronic transformer of transformer substation based on wavelet neural network | |
WO2023241326A1 (en) | Power grid anomaly detection method based on maximum eigenvalue rate of sample covariance matrix | |
CN116610998A (en) | Switch cabinet fault diagnosis method and system based on multi-mode data fusion | |
CN110751217B (en) | Equipment energy consumption duty ratio early warning analysis method based on principal component analysis | |
CN114062850B (en) | Double-threshold power grid early fault detection method | |
CN111507374A (en) | Power grid mass data anomaly detection method based on random matrix theory | |
CN110632455A (en) | Fault detection and positioning method based on distribution network synchronous measurement big data | |
CN111797533B (en) | Nuclear power device operation parameter abnormity detection method and system | |
CN105516206A (en) | Network intrusion detection method and system based on partial least squares | |
CN112947649B (en) | Multivariate process monitoring method based on mutual information matrix projection | |
CN107274025B (en) | System and method for realizing intelligent identification and management of power consumption mode | |
CN113837591A (en) | Equipment health assessment method oriented to multi-working-condition operation conditions | |
CN114597886A (en) | Power distribution network operation state evaluation method based on interval type two fuzzy clustering analysis | |
CN106444706A (en) | Industrial process fault detection method based on data neighborhood feature preservation | |
CN116990633A (en) | Fault studying and judging method based on multiple characteristic quantities | |
CN115828114A (en) | Energy consumption abnormity detection method for aluminum profile extruder |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200807 |