CN110458230A - Method for discriminating abnormal distribution transformer acquisition data based on multi-criterion fusion - Google Patents
- Publication number
- CN110458230A (application CN201910740107.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/251—Fusion techniques of input or preprocessed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a method for discriminating abnormal distribution transformer acquisition data based on multi-criterion fusion, comprising: statistically analyzing the breakpoints and abnormal points in the acquired data together with the actual field operating data; screening for abnormal values with each of four methods, namely prototype clustering, density clustering, probability density, and deep learning; and applying a "2-out-of-4" check to the results of the four models, i.e. if two of the four models judge a point under test to be abnormal, that point is an abnormal point. The invention addresses the difficulty, low efficiency, and poor real-time performance that conventional machine learning methods face when processing massive data.
Description
Technical Field
The invention belongs to the technical field of power system distribution transformer data processing, and in particular relates to a method for discriminating abnormal distribution transformer acquisition data based on multi-criterion fusion.
Background
With the wide application of computer, communication, and sensing technologies, the continued expansion of distribution network operation monitoring, and the deployment of large numbers of monitoring and metering devices, monitoring of distribution transformer areas yields massive volumes of operating data, user power consumption data, and equipment state data. The challenge facing the distribution network is to analyze, mine, and process these data so as to operate distribution transformer areas safely and economically, improve service quality, and expand electricity and billing services. Notably, about 10% of the massive grid data obtained from monitoring distribution transformer areas is abnormal, so the quality of the acquired data must be analyzed and the abnormal data discriminated in order to provide reliable, accurate, and effective data support for operation monitoring. The main causes of abnormal data in the time series are as follows:
(1) Metering device failure: the metering device comprises a terminal, an instrument transformer, a junction box, and a meter, and a fault can occur in any of them. For example, corona-induced partial or complete discharge in the instrument transformer causes inaccurate data collection, and poor contact in the junction box causes abnormal metering data.
(2) Poor communication signal: some regions use 3G signals, so signals are occasionally lost and data transmission fails in some periods. Large buildings can also shield communication signals and affect communication.
(3) Collector failure: the collector aggregates and distributes the data of all devices within its control range and forwards control commands to the smart meters. For low-voltage users, the collectors are separate from the metering devices, and each collector controls several smart meters. When a collector loses communication or fails, data collection fails for all smart meters within its collection range.
(4) Human factors: mainly unreasonable electricity use, which keeps the meter overloaded for long periods, and electricity theft; both produce abnormal time-series data.
The quality of the acquired data largely determines the quality of the model analysis results. Detecting and discriminating abnormal values in the acquired data before building an analysis model is therefore an important way to improve data quality. The abnormal point detection methods in common use today are mainly:
(1) Statistical methods: the earliest methods used for outlier detection, generally divided into hypothesis-test-based methods and model-based methods. Most real data mining problems require finding abnormal points in a multi-dimensional space, but most consistency tests apply only to a single attribute; moreover, the data distribution model must be known in advance, which makes these methods very limited.
(2) Distance-based outlier detection: the distance function and parameters are hard to choose, and only global abnormal points can be detected, not local ones.
(3) Density-based outlier detection: can detect both global and local abnormal points, but the computation is complex and tedious, and the method is unsuitable for high-dimensional data.
(4) Clustering-based outlier detection: can find classes and abnormal points at the same time, but is inefficient and highly problem-specific.
(5) Machine-learning-based outlier detection: these methods fall into two major classes, artificial neural networks (ANN) and support vector machines (SVM). ANNs work well on small-scale problems but are inefficient on large-scale data; the parameter training problem is hard to solve well, training easily falls into local optima, and improper settings of the model structure and weights seriously affect model accuracy. SVMs generalize better but face severe challenges when processing massive samples, and the modeling is complex, so practical application is difficult.
Abnormal data in the current, voltage, active power, and reactive power curves of a smart meter directly reflect its operating state; in terms of how they appear, such data are either measurement abnormal points or user consumption abnormal points. A smart meter fault usually does not arise instantaneously: the meter operates in a sub-healthy state for some time before failing. In this state, abnormal data on the curves are more hidden and are not easily distinguished by basic criteria. The quality of the acquired data strongly affects the credibility of the analysis results of departments such as the operation center, and abnormal acquired data seriously degrade that quality. In addition, the acquired data suffer from breakpoints, phase loss, and abnormally high or low values, and the discrimination rules currently in general use are set too rigidly, so the abnormal value discrimination rules need targeted improvement to raise detection accuracy.
Data mining and deep learning are current research hotspots in the computer field and can effectively analyze and process high-dimensional, complex, nonlinear problems. Deep learning splits the training set into small batches in advance for calculation, which improves training efficiency. By comparison, deep learning is therefore better suited to detecting and discriminating abnormal values in massive acquired time series such as current and voltage, and it can overcome the high memory usage, slow processing, and difficulty with high-dimensional feature data that traditional machine learning methods exhibit when processing massive data.
Disclosure of Invention
The invention aims to provide a method for discriminating abnormal distribution transformer acquisition data based on multi-criterion fusion, in which four methods (prototype clustering, density clustering, probability density, and deep learning) are each used to screen for abnormal values and a "2-out-of-4" check is applied to the results of the four models, thereby addressing the difficulty, low efficiency, and poor real-time performance of traditional machine learning methods when processing massive data.
To achieve this aim, the invention adopts the following technical scheme:
a method for discriminating abnormal distribution transformer acquisition data based on multi-criterion fusion, comprising:
acquiring raw distribution transformer acquisition data;
preprocessing the raw distribution transformer acquisition data;
randomly adding noise points to the preprocessed data to form an acquisition data sequence containing abnormal points;
discriminating the abnormal points of that sequence with each of four models: prototype clustering, density clustering, probability density, and deep learning;
determining the abnormal acquisition data: for every pair of models, taking the intersection of their abnormal-point discrimination results, and then taking the union of these pairwise intersections.
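The fusion rule (union of pairwise intersections) is equivalent to flagging any point that at least two of the four models mark as abnormal. A minimal Python sketch, assuming each model returns a set of flagged sample indices; the function name, dictionary keys, and index values are hypothetical and chosen only for illustration:

```python
from itertools import combinations


def fuse_two_of_four(model_flags):
    """Union of the pairwise intersections of the models' flagged index sets."""
    fused = set()
    for a, b in combinations(model_flags.values(), 2):
        fused |= a & b  # points that this pair of models agrees on
    return fused


# Hypothetical per-model results (indices of points flagged as abnormal).
flags = {
    "prototype_clustering": {3, 17, 42},
    "density_clustering": {17, 58},
    "probability_density": {42, 58, 90},
    "deep_learning": {17, 77},
}
abnormal = fuse_two_of_four(flags)
print(sorted(abnormal))
```

Points flagged by only one model (3, 77, and 90 in this example) are dropped, which is how the rule trades a little sensitivity for fewer false alarms.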
Further, acquiring the raw distribution transformer acquisition data comprises:
collecting, with metering devices in normal operation, raw three-phase current, three-phase voltage, and active power data at an interval of 15 min.
Further, preprocessing the raw distribution transformer acquisition data comprises:
handling missing values in the raw data and removing obviously abnormal values;
the missing-value handling comprises: where individual data points are missing from a continuous time data set, filling them by linear interpolation; where a large amount of data is missing from a continuous time data set, discarding that data directly;
removing obviously abnormal values means removing data recorded as -9999 from the raw three-phase current, three-phase voltage, and active power data.
Further, when large-scale data loss occurs in the raw distribution transformer acquisition data, the curve preceding the data loss is selected for calculation.
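The preprocessing rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: missing readings are represented as `None`, the -9999 sentinel is treated as a missing reading before gap handling, and the `max_gap` cut-off separating "individual" from "large" gaps is a hypothetical parameter that the patent does not specify.

```python
SENTINEL = -9999  # value the source describes as obviously abnormal


def preprocess(values, max_gap=1):
    """Return (index, value) pairs: short gaps interpolated, long gaps dropped."""
    # Assumption: the sentinel is handled like a missing reading.
    vals = [None if v is None or v == SENTINEL else float(v) for v in values]
    out = []
    i, n = 0, len(vals)
    while i < n:
        if vals[i] is not None:
            out.append((i, vals[i]))
            i += 1
            continue
        j = i
        while j < n and vals[j] is None:  # find the end of this gap
            j += 1
        # Interpolate linearly only across short gaps with valid neighbours.
        if j - i <= max_gap and i > 0 and j < n:
            left, right = vals[i - 1], vals[j]
            for k in range(i, j):
                frac = (k - (i - 1)) / (j - (i - 1))
                out.append((k, left + frac * (right - left)))
        i = j  # longer gaps are skipped, i.e. discarded
    return out


raw = [230.0, None, 232.0, -9999, None, None, 229.0, 231.0]
clean = preprocess(raw)
print(clean)
```

Here the single missing reading at index 1 is interpolated, while the three-point gap at indices 3 to 5 is discarded.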
Further, the noise points follow a normal distribution.
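As an illustration of the noise-injection step, the sketch below perturbs a random subset of points with zero-mean Gaussian errors. The fraction of perturbed points, the seed, and the function name are assumptions made for the example; the standard deviation of 6 mirrors the value mentioned in the figure descriptions.

```python
import random


def inject_noise(series, fraction=0.05, sigma=6.0, seed=0):
    """Add N(0, sigma^2) errors to a random subset of points.

    Returns the noisy copy and the sorted indices that were perturbed.
    """
    rng = random.Random(seed)
    n_noisy = max(1, round(fraction * len(series)))
    idx = sorted(rng.sample(range(len(series)), n_noisy))
    noisy = list(series)
    for i in idx:
        noisy[i] += rng.gauss(0.0, sigma)
    return noisy, idx


voltage = [230.0] * 96  # one day of 15-min readings (illustrative values)
noisy_voltage, injected_at = inject_noise(voltage)
print(injected_at)
```

Keeping the injected indices makes it possible to score each detector afterwards against known ground truth.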
Further, discriminating the abnormal points of the acquisition data sequence containing abnormal points by prototype clustering comprises:
determining the clustering attributes, comprising: selecting the actual voltage value of the point under test and the voltage change between that point and the previous point as the clustering attributes of the voltage time series; selecting the actual current value of the point under test and the current change between that point and the previous point as the clustering attributes of the current time series; and selecting the actual active power value of the point under test and the active power change between that point and the previous point as the clustering attributes of the active power time series;
according to the clustering attributes, clustering the time series under test into 4 classes with the k-means algorithm and determining the centroid of each class;
calculating, from the class centroids, the distance and the relative distance from each point under test to its nearest cluster center;
comparing the relative distance from each point under test to its nearest cluster center with a given threshold; if the relative distance of a point is greater than the threshold, that point is an outlier, i.e. an abnormal point.
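The prototype clustering criterion can be sketched in pure Python as follows. Several details are assumptions rather than the patent's specification: the relative distance is taken to be a point's distance to its nearest centroid divided by the mean such distance within that cluster, the threshold of 3.0 is arbitrary, and the k-means initialisation is a simple seeded random choice.

```python
import math
import random


def kmeans(points, k=4, iters=50, seed=0):
    """Plain k-means on a list of equal-length tuples; returns the k centroids."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centroids: k randomly chosen points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[c]
            for c, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    return centers


def prototype_outliers(series, k=4, threshold=3.0):
    """Indices whose relative distance to the nearest centroid exceeds threshold."""
    # Clustering attributes: (actual value, change from the previous point).
    feats = [(series[i], series[i] - series[i - 1]) for i in range(1, len(series))]
    centers = kmeans(feats, k)
    nearest = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in feats]
    dist = [math.dist(p, centers[c]) for p, c in zip(feats, nearest)]
    flagged = []
    for c in range(k):
        members = [i for i, n in enumerate(nearest) if n == c]
        if not members:
            continue
        mean_d = sum(dist[i] for i in members) / len(members)
        if mean_d == 0:
            continue  # all members coincide with the centroid
        # Assumed definition: relative distance = distance / mean cluster distance.
        flagged += [i + 1 for i in members if dist[i] / mean_d > threshold]
    return sorted(flagged)


series = [230 + 0.5 * math.sin(i / 4) for i in range(96)]
series[40] += 30  # hypothetical injected spike
result = prototype_outliers(series)
print(result)
```

One caveat of this kind of criterion is that k-means may assign an isolated extreme point its own cluster, in which case its distance to the centroid is zero; this is one reason a single criterion is not relied on alone.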
Further, discriminating the abnormal points of the acquisition data sequence containing abnormal points by density clustering comprises:
drawing a voltage-current plane distribution diagram, a current-active power plane distribution diagram, and a voltage-active power plane distribution diagram;
clustering the points on each plane distribution diagram, comprising: grouping two points into one class if the distance between them does not exceed a set maximum distance; the points on the voltage-current diagram are formed from the current time series and the voltage time series of one phase; the points on the current-active power diagram are formed from the current time series of one phase and the active power time series; the points on the voltage-active power diagram are formed from the voltage time series of one phase and the active power time series;
iterating to find the points that belong to no class, which are the abnormal points.
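A minimal sketch of the density clustering criterion, written in the style of DBSCAN. The text only specifies a maximum distance between points of the same class; the minimum neighbour count `min_pts`, the function name, and the sample values are assumptions added so that an isolated point cannot form a class by itself. No feature scaling is applied here, although in practice the two axes would likely need to be normalised.

```python
import math


def density_outliers(points, eps, min_pts=3):
    """Return indices of points that end up in no density-connected class."""
    n = len(points)
    # Neighbourhood of each point (includes the point itself).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None or len(neighbors[i]) < min_pts:
            continue  # already assigned, or not dense enough to seed a class
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:  # grow the class through dense neighbours
            j = queue.pop()
            if labels[j] is None:
                labels[j] = cluster
                if len(neighbors[j]) >= min_pts:
                    queue.extend(neighbors[j])
        cluster += 1
    return [i for i in range(n) if labels[i] is None]


# Hypothetical (voltage, current) pairs; the last one lies far from the rest.
vi_points = [(230.0, 10.0), (231.0, 10.5), (229.5, 9.8), (230.5, 10.2), (260.0, 40.0)]
outliers = density_outliers(vi_points, eps=2.0)
print(outliers)
```

The same function could be applied to the current-active power and voltage-active power planes in turn.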
Further, discriminating the abnormal points of the acquisition data sequence containing abnormal points by the probability density method comprises:
determining the model input and the model output; the model inputs are: for the current time series, the change between the current at the point under test and the current at the previous point; for the voltage time series, the change between the voltage at the point under test and the voltage at the previous point; for the active power time series, the change between the active power at the point under test and the active power at the previous point; the model output is the normal range of the change values;
fitting the probability distributions of the voltage, current, and active power data with kernel density functions to obtain their probability density functions;
for any value d under test, integrating the probability density function to obtain the probability of the range [d, +∞) and comparing it with a threshold, i.e. checking whether it is below 0.003, the probability corresponding to 3σ; if so, the value under test is an abnormal point.
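The tail-probability test can be sketched without numerical integration when a Gaussian kernel is used, because the integral of each kernel over [d, +∞) has a closed form. The Gaussian kernel, the normal-reference bandwidth rule, the function names, and the sample values are all assumptions; the patent specifies only a kernel density fit and the 0.003 threshold.

```python
import math
import statistics


def kde_upper_tail(samples, d, bandwidth=None):
    """P(X >= d) under a Gaussian kernel density estimate of `samples`."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb (assumed; not specified in the source).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    # For one Gaussian kernel centred at x, the mass on [d, inf) is
    # 1 - Phi((d - x) / h) = 0.5 * erfc((d - x) / (h * sqrt(2))).
    total = sum(
        0.5 * math.erfc((d - x) / (bandwidth * math.sqrt(2))) for x in samples
    )
    return total / n


def is_abnormal(samples, d, threshold=0.003):
    """True if the upper-tail probability of d is below the 3-sigma threshold."""
    return kde_upper_tail(samples, d) < threshold


# Hypothetical voltage changes between consecutive points, in volts.
changes = [-1.0, -0.6, -0.2, 0.0, 0.3, 0.5, 0.9, -0.4, 0.1, 0.7]
print(is_abnormal(changes, 50.0), is_abnormal(changes, 0.0))
```

As written in the text, the test is one-sided: it looks only at the upper tail, so a large negative change would need a mirrored check on (-∞, d].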
Further, discriminating the abnormal points of the acquisition data sequence containing the added abnormal points by deep learning comprises:
using a deep learning model based on a long short-term memory (LSTM) network, trained on the current, voltage, and power time series, to predict future current, voltage, or power data, and comparing the predicted values with the true values; if the deviation of the predicted value from the true value exceeds a set threshold, the point is an abnormal point;
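The decision step of the deep learning criterion reduces to thresholding the prediction error. The sketch below shows only that step; the `predicted` values stand in for the output of a trained LSTM and are hypothetical, as are the function name and the 10 V threshold.

```python
def flag_by_prediction_error(actual, predicted, threshold):
    """Indices where |actual - predicted| exceeds the threshold."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [
        i
        for i, (a, p) in enumerate(zip(actual, predicted))
        if abs(a - p) > threshold
    ]


# Hypothetical voltages: measured values and the model's one-step predictions.
actual = [230.1, 229.8, 251.3, 230.4]
predicted = [230.0, 230.0, 230.2, 230.1]
flagged = flag_by_prediction_error(actual, predicted, threshold=10.0)
print(flagged)
```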
the LSTM deep learning model is as follows: after the forward calculation is complete, the model parameters are updated with the error back-propagation algorithm, comprising:
the weighted inputs $net_{f,t}$, $net_{i,t}$, $net_{c',t}$, $net_{o,t}$ of the neurons of the long short-term memory network at time $t$ are:
wherein $W_{ox}$, $W_{fx}$, $W_{ix}$, $W_{cx}$, $W_{oh}$, $W_{fh}$, $W_{ih}$, $W_{ch}$ denote weights, $h_{t-1}$ is the LSTM output at the previous time step, $x_t$ is the input at the current time step, and $b_f$, $b_i$, $b_o$, $b_c$ are the biases of the forget gate, the input gate, the output gate, and the input unit at the current time step, respectively;
the error terms $\delta_{f,t}$, $\delta_{i,t}$, $\delta_{c',t}$, $\delta_{o,t}$ of the neurons of the long short-term memory network at time $t$ are:
wherein $E$ is the prediction error;
when the error propagates backward in time, the error term $\delta_{t-1}$ at time $t-1$ is:
wherein the matrix in this expression is a Jacobian matrix;
when the error is passed backward from the current layer $l$ to layer $l-1$, the error of layer $l-1$ is:
finally, the gradients of the weights $W_{oh}$, $W_{fh}$, $W_{ih}$, $W_{ch}$ are:
wherein $W_{oh,t}$, $W_{fh,t}$, $W_{ih,t}$, $W_{ch,t}$ respectively denote the weights at time $t$, and the superscript $T$ denotes transposition;
the gradients of the weights $W_{ox}$, $W_{fx}$, $W_{ix}$, $W_{cx}$ are:
the gradients of the biases $b_f$, $b_i$, $b_o$, $b_c$ are:
wherein $b_{o,t}$, $b_{f,t}$, $b_{i,t}$, $b_{c,t}$ respectively denote the error terms at time $t$.
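For reference, under the standard LSTM formulation and with the symbols defined above, the weighted inputs, error terms, and gradients would take the following form. This is a sketch of the textbook expressions, not a reproduction of the patent's own equations; it assumes that $\delta_{t-1}$ denotes $\partial E / \partial h_{t-1}$, that $g$ ranges over the forget gate, input gate, input unit, and output gate, and that for $g = c'$ the weight $W_{gh}$ and bias $b_g$ are the ones written $W_{ch}$ and $b_c$ in the text.

```latex
\begin{aligned}
net_{f,t}  &= W_{fh}\,h_{t-1} + W_{fx}\,x_t + b_f \\
net_{i,t}  &= W_{ih}\,h_{t-1} + W_{ix}\,x_t + b_i \\
net_{c',t} &= W_{ch}\,h_{t-1} + W_{cx}\,x_t + b_c \\
net_{o,t}  &= W_{oh}\,h_{t-1} + W_{ox}\,x_t + b_o \\[4pt]
\delta_{g,t} &= \frac{\partial E}{\partial net_{g,t}}, \qquad g \in \{f,\ i,\ c',\ o\} \\[4pt]
\delta_{t-1}^{T} &= \delta_{o,t}^{T} W_{oh} + \delta_{f,t}^{T} W_{fh}
                  + \delta_{i,t}^{T} W_{ih} + \delta_{c',t}^{T} W_{ch} \\[4pt]
\frac{\partial E}{\partial W_{gh}} &= \sum_{j=1}^{t} \delta_{g,j}\, h_{j-1}^{T}, \qquad
\frac{\partial E}{\partial b_{g}} = \sum_{j=1}^{t} \delta_{g,j}
\end{aligned}
```

The third line from the bottom follows from the chain rule, since $h_{t-1}$ influences $E$ at later times only through the four weighted inputs at time $t$.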
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
(1) the deep learning method used here can process massive data and learn features efficiently and comprehensively, reducing the inefficiency and incompleteness of manual feature learning, so that the learned features generalize better;
(2) the LSTM neural network used in the deep learning algorithm is a long short-term memory network, a type of recurrent neural network suited to processing and predicting important events separated by relatively long intervals and delays in a time series. The main task of the metering device abnormal operating state identification model is to recognize time-series data collected by the metering device, a problem on which LSTM neural networks perform well;
(3) when the operating data sample set obtained from the meter is small or its time-series characteristics are weak, the LSTM model performs poorly, and the other three algorithms in the model often outperform it. The multi-criterion fusion algorithm can identify abnormal operating states under different conditions, with better generalization and higher accuracy;
(4) the method uses "2-out-of-4" cross-checking, which raises the detection rate of abnormal-point screening, lowers its misjudgment rate, and improves the model's ability to screen abnormal points. This provides accurate data for operation monitoring services and lets staff in the relevant grid departments carry out maintenance and investigation based on the identification results, so that latent meter problems are resolved as early as possible and the safety and reliability of grid operation are improved.
Drawings
FIG. 1 is a schematic diagram of the structure of each component of the long short-term memory network in an embodiment of the present invention;
FIG. 2 is a diagram of the deep long short-term memory network architecture in an embodiment of the present invention;
FIG. 3 is a diagram illustrating the analysis result of the prototype clustering voltage under the normal distribution error with the standard deviation of 6 according to the embodiment of the present invention;
FIG. 4 is a result of density clustering analysis under a normal distribution error with a standard deviation of 6 according to an embodiment of the present invention; FIG. 4(a) is a power voltage plane distribution, and FIG. 4(b) is a voltage current plane distribution;
FIG. 5 is a result of probability distribution analysis under a normal distribution error with a standard deviation of 6 according to an embodiment of the present invention;
FIG. 6 shows the LSTM prediction result and the true voltage under the normal distribution error with the standard deviation of 6 according to the embodiment of the present invention;
FIG. 7 is a graph of the LSTM prediction error under a normal distribution error with a standard deviation of 6 in an embodiment of the present invention;
FIG. 8 is a diagram illustrating the results of prototype clustering analysis under a normal distribution error with a standard deviation of 8 in an embodiment of the present invention;
FIG. 9 shows the result of density clustering analysis under a normal distribution error with a standard deviation of 8 according to an embodiment of the present invention; FIG. 9(a) is a power voltage plane distribution, and FIG. 9(b) is a voltage current plane distribution;
FIG. 10 shows the true voltage and the LSTM prediction result under the normal distribution error with the standard deviation of 8 according to the embodiment of the present invention;
FIG. 11 is a graph showing the prediction error of the LSTM model under a normal distribution error with a standard deviation of 8 in an embodiment of the present invention;
FIG. 12 is a graph of the current and current change values of electric meter 49932 in an embodiment of the present invention;
FIG. 13 is a graph of the power and power change values of electric meter 29047 in an embodiment of the present invention;
FIG. 14 shows the abnormal value detection curves of electric meter 45000 in an embodiment of the present invention; FIG. 14(a) is a graph of the power and power change values; FIG. 14(b) is a graph of the current and current change values;
FIG. 15 is a graph of the current and power change values of electric meter 29047 in an embodiment of the present invention;
FIG. 16 is a graph of the current and power change values of electric meter 64258 in an embodiment of the present invention.
Detailed Description
The invention is further described below. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
The invention provides a method for discriminating abnormal data acquired by a distribution transformer based on multi-criterion fusion, which comprises the following steps:
1) Acquisition data
The real-time acquisition data mainly comprise the current, voltage, active power, and reactive power of each phase and the meter-reading energy. The first four items are measured every 15 min, giving 96 data points per day, while the meter-reading energy is measured once a day. In operation monitoring and in the related meter reading, checking, and billing work of an actual enterprise, the data quality requirements for current, voltage, active power, and meter-reading energy are high, while those for reactive power are relatively low. The meter-reading energy is closely related to the active power data, so abnormal active power data imply abnormal meter-reading energy data. The invention therefore mainly analyzes the current, voltage, and active power data. In the actual acquired data sets, the main problems are breakpoints and abnormal points.
2) Determining abnormal data by data analysis
Data were collected from normal and faulty distribution transformers in the distribution transformer areas of the Jianning area of a certain city: three-phase current, three-phase voltage, and forward active power from August 1 to August 31, 2017, at a time interval of 15 min. A brief analysis of the overall profile of the data gives the following main conclusions:
a) Voltage data: in all the voltage data provided, data were acquired for 8208 distribution transformers; the total number of acquired data points is 13,555,680, of which 885,129 are missing (NULL), i.e. 6.53% of the total. The voltage data collected for 6001 distribution transformers are complete, with no missing data. The 10kV Danfo small town 7# change, the five-community 5# public change, the Jiangshan housing #3 change, the Yuyangzhan garden #2 change, the twenty-first century modern city #7 change, the lake mountain manzu 3# beautiful change, the Tankqiao north garden # 23 change, the Baijiahu international garden B area temporary change #1, the Baijiahu international garden B area temporary change #3 and the Baijiahu international garden B area temporary change #2 (10 in total) have the largest proportion of missing data, 97.92% each.
Among the data without deletion, 2400 data were-9999, which are apparently abnormal data, accounting for 0.02%. Specifically, voltage data values acquired by a #1 main transformer, a #2 main transformer, a #3 main transformer and a #4 main transformer of a central grain house color cloud living #1 power distribution station from 8 months 28 days to 8 months 30 days are shown as-9999V.
b) Current data: among all the current data provided, data were acquired from 8271 distribution transformers; the total number of acquired data points is 84,205,344, the total number of missing data points (NULL) is 4,390,384, and missing data account for 5.21% of the total. The current data collected by 4464 distribution transformers are complete, with no missing data. The 10kV Danfo small town 7# transformer, the China grain color cloud house #1 power distribution station #1 main transformer, the China grain color cloud house #1 power distribution station #2 main transformer, the China grain color cloud house #1 power distribution station #3 main transformer, the Yiwu commercial city #5 transformer, the twenty-first century modern city #7 transformer, the five-community 5# public transformer, the Wuyi oasis #2 box transformer, the Jiangshan house #3 transformer, the lake mountain honor moon #3 American transformer, the lake bridge north garden 23 transformer, the Yuzhuan #2 transformer, the Baijiahu international garden B area temporary transformer #1, the Baijiahu international garden B area temporary transformer #3 and the Baijiahu international garden B area temporary transformer #2 (15 in total) have the largest proportion of missing data, 97.92% each.
Among the non-missing data, 2195 data points are -9999 and are obviously abnormal, a proportion of 0.003%. The abnormal current data are concentrated in the August records of Yang bridge Yang second, and all the abnormal current values are greater than 700 A.
c) Active power data: among all the active power data provided, data were acquired from 8153 distribution transformers; the total number of acquired data points is 59,650,944, the total number of missing data points (NULL) is 1,748,960, and missing data account for 2.93% of the total. The active power data collected by 4111 distribution transformers are complete, with no missing data. The largest missing data proportion, 97.92%, occurs for 122 distribution transformers.
On the basis of the overall data analysis, the original data set is preprocessed: missing values are handled and obvious abnormal values are removed. Missing values are handled in two cases: first, individual missing data points in a continuous-time data set are filled by conventional linear interpolation; second, large blocks of missing data in a continuous-time data set cannot be interpolated and are removed directly. Removing obvious abnormal values means removing the data recorded as -9999 in the voltage, current and active power data.
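The preprocessing rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `preprocess` and the cut-off `max_gap` (how many consecutive missing samples still count as "individual" missing data) are hypothetical, since the source does not give a numeric boundary between individual and large-scale gaps.

```python
SENTINEL = -9999  # value the source treats as obviously abnormal

def preprocess(series, max_gap=2):
    """Return (cleaned, dropped).

    cleaned maps sample index -> value; dropped lists removed indices.
    None marks a missing (NULL) sample. max_gap is an assumed cut-off:
    shorter gaps are linearly interpolated, longer ones are removed.
    """
    n = len(series)
    cleaned, dropped = {}, []
    i = 0
    while i < n:
        v = series[i]
        if v == SENTINEL:            # obvious abnormal value: remove
            dropped.append(i)
            i += 1
        elif v is not None:          # ordinary sample: keep
            cleaned[i] = float(v)
            i += 1
        else:                        # run of missing samples series[i:j]
            j = i
            while j < n and series[j] is None:
                j += 1
            left = series[i - 1] if i > 0 else None
            right = series[j] if j < n else None
            can_fill = (left is not None and right is not None
                        and left != SENTINEL and right != SENTINEL
                        and j - i <= max_gap)
            for k in range(i, j):
                if can_fill:         # linear interpolation between neighbours
                    t = (k - i + 1) / (j - i + 1)
                    cleaned[k] = left + t * (right - left)
                else:                # large gap: remove directly
                    dropped.append(k)
            i = j
    return cleaned, dropped
```

For instance, a single missing point between 220.0 and 222.0 would be filled with 221.0, while a run of three missing points (with `max_gap=2`) would be dropped.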
The method uses four models, namely a prototype clustering method, a density clustering method, a probability density method and a deep learning method, to detect and screen abnormal values in the voltage, current and active power data. The aim is to find the abnormal data points in the original data, thereby improving data quality and providing effective data support for other related services. In the test, the A-phase current, A-phase voltage and active power curves for the 1 to 3 months before the fault are used for abnormal point detection. When large-scale data loss occurs, the curve before the data loss is used for the calculation.
3) Abnormal data screening process for the acquired data
In order to verify the performance of the models, a certain amount of noise is artificially added to the original data, and the ability of the models to detect this interference is tested. On this basis, the prototype clustering method, the density clustering method, the probability density method and the deep learning method are applied to actual data to detect abnormal values in the original voltage, current and active power data. The specific implementation process is as follows:
31) Based on the current, voltage and active power data collected by normally operating metering devices, random noise and interference of different magnitudes are added to the original data. The noise obeys a normal distribution and is added at randomly chosen points; after the noise is added, these points simulate abnormal points, forming a time series containing abnormal points. The prototype clustering method, the density clustering method, the probability density method and the deep learning method are then tested on the interference and noise to check their accuracy. In the test, the final abnormal value detection result is defined as follows: the points judged abnormal by any two or more models are the finally determined abnormal points.
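The noise-injection step of 31) might look like the following sketch. The function name `inject_noise`, the fixed seed and the flat example series are illustrative assumptions; only the use of zero-mean normally distributed errors at randomly chosen points comes from the text.

```python
import random

def inject_noise(series, n_points, sigma, seed=0):
    """Add zero-mean Gaussian errors at n_points random positions.

    Returns (noisy_series, errors), where errors maps the chosen
    sample index to the error that was added there.
    """
    rng = random.Random(seed)                      # seeded for repeatability
    ids = sorted(rng.sample(range(len(series)), n_points))
    noisy = list(series)
    errors = {}
    for i in ids:
        errors[i] = rng.gauss(0.0, sigma)          # normally distributed error
        noisy[i] += errors[i]
    return noisy, errors

# Example: 5 disturbed points in a flat 228 V series, standard deviation 6
noisy, errors = inject_noise([228.0] * 2785, n_points=5, sigma=6.0)
```

Keeping the `errors` dictionary makes it possible to compare the injected points with what each detector later reports.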
32) In the same way as in step 31), abnormal points are screened in the data recorded before the fault for electric meters with known faults, with the four models each screening the current, voltage and active power data from a different angle.
33) The screened abnormal points are analyzed to look for regularities and common features.
4) Abnormal acquisition data screening method based on multi-criterion fusion
The prototype clustering method, the density clustering method, the probability density method and the deep learning method are each used to screen abnormal values, and the abnormal value judgment accuracy of the models is compared.
41) Prototype clustering method: the input and output of the model for detecting abnormal values by the prototype clustering method are shown in table 1.
TABLE 1 input/output of abnormal value detection model by prototype clustering method
The method for discriminating the abnormal value based on the prototype clustering method comprises the following steps:
a) Select the clustering attributes. For example, when abnormal values are screened in the voltage time series, the actual voltage value and the voltage change value are used as clustering attributes, so that the combined effect of the voltage magnitude and its rate of change on the abnormal value is considered. Similarly, when abnormal values are screened in the current and power time series, the clustering attributes are the actual current value and current change value, and the actual power value and power change value, respectively.
b) Cluster the samples into 4 classes with the k-means algorithm and determine the centroid of each class.
c) Calculate the distance from each point to be judged to the nearest cluster center.
d) Calculate the relative distance from each point to be judged to the nearest cluster center.
e) Compare the relative distance with a given threshold, which is determined according to the voltage characteristics of each distribution transformer area. If the relative distance from a point to be judged to the nearest cluster center is greater than the threshold, the point is considered an outlier.
Through the steps, an abnormal value detection result based on a prototype clustering method can be obtained.
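Steps a) to e) can be sketched with a small pure-Python k-means. The source does not define "relative distance"; the sketch assumes it is the distance to the assigned centroid divided by the mean distance of that cluster's members, which is one plausible reading and not necessarily the authors' definition. The function names and the random seeding are likewise illustrative.

```python
import math
import random

def kmeans(points, k, max_iter=500, seed=0):
    """Plain k-means on tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])   # keep an empty cluster's centroid
        if new_centers == centers:               # converged
            break
        centers = new_centers
    return centers, labels

def prototype_outliers(values, k=4, threshold=2.75):
    """Flag points whose relative distance to their centroid exceeds threshold."""
    # a) attributes: actual value and change from the previous point
    changes = [0.0] + [b - a for a, b in zip(values, values[1:])]
    points = list(zip(values, changes))
    # b) cluster and c) distance to the nearest (assigned) centroid
    centers, labels = kmeans(points, k)
    dist = [math.dist(p, centers[lab]) for p, lab in zip(points, labels)]
    outliers = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        mean_d = sum(dist[i] for i in idx) / len(idx) if idx else 0.0
        # d) relative distance (assumed definition) and e) threshold test
        outliers += [i for i in idx if mean_d > 0 and dist[i] / mean_d > threshold]
    return sorted(outliers)
```

One caveat of this kind of detector is that an extreme point can end up as its own centroid, in which case its distance is zero and it is not flagged; the result therefore depends on the initialization and on the chosen number of clusters.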
42) Density clustering method: the density clustering method assumes that classes can be determined by how closely the samples are distributed, and divides the samples into dense sample classes and scattered noise points. The specific steps are as follows:
a) Considering the pairwise relations among voltage, current and power, draw the voltage-current, current-power and voltage-power two-dimensional plane distribution diagrams.
b) Set a maximum distance d. When the distance between two points in a plane distribution diagram exceeds d, the two points are considered not density-reachable, i.e. they do not belong to the same class.
c) Iterate to find each set of sample points that are all density-reachable from one another and assign the set to one class. The remaining points that do not belong to any class are noise.
The density clustering outlier detection model is shown in table 2.
TABLE 2 Density clustering abnormal value detection model input and output
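The density clustering steps correspond closely to the well-known DBSCAN algorithm, so a compact pure-Python DBSCAN is used below as a stand-in; the source does not name the algorithm, and the helper names, the `min_samples` parameter and the min-max normalization are assumptions for illustration.

```python
import math

def normalize(xs, lo=0.0, hi=4.0):
    """Min-max scale a sequence to [lo, hi] so both axes are comparable."""
    a, b = min(xs), max(xs)
    return [lo + (hi - lo) * (x - a) / (b - a) if b > a else lo for x in xs]

def dbscan_noise(points, eps=0.5, min_samples=5):
    """Return indices of points that belong to no dense cluster (noise)."""
    n = len(points)
    # neighbours within the maximum distance eps (each point includes itself)
    nbrs = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
            for i in range(n)]
    labels = [None] * n              # None = unvisited, -1 = noise, >= 0 = cluster
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(nbrs[i]) < min_samples:
            labels[i] = -1           # provisional noise, may become a border point
            continue
        labels[i] = cluster
        queue = list(nbrs[i])
        while queue:                 # expand through density-reachable points
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point of this cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(nbrs[j]) >= min_samples:
                queue.extend(nbrs[j])
        cluster += 1
    return [i for i, lab in enumerate(labels) if lab == -1]

# Example on a hypothetical voltage-current plane
voltage = [228.0 + 0.1 * i for i in range(10)] + [240.0]
current = [1.0 + 0.01 * i for i in range(10)] + [5.0]
plane = list(zip(normalize(voltage), normalize(current)))
noise_ids = dbscan_noise(plane, eps=0.5, min_samples=5)
```

The same function could be applied to the current-power and voltage-power planes; the quadratic neighbour search is acceptable for a few thousand points but would need a spatial index for larger sets.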
43) Probability density method: the deviation-based outlier detection method mainly judges outliers according to the "3σ" criterion. If the data obey a normal distribution, an abnormal value under the "3σ" criterion is a measured value that deviates from the mean by more than 3 standard deviations. Under the assumption of normal distribution, the probability of a value lying more than 3σ from the mean is

$$P\left(|x-\mu|>3\sigma\right)\le 0.003,$$

which is a very small-probability event. Here μ is the mean and σ is the standard deviation of the normal distribution of the original data.
Let the test data $r_1, r_2, \dots, r_n$ represent the input current, power or voltage time series, and take the arithmetic mean:

$$\bar r=\frac{1}{n}\sum_{i=1}^{n} r_i$$

where n is the number of samples in the current, power or voltage sequence.
With the residuals $v_i=r_i-\bar r$, the root mean square deviation is:

$$\sigma=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n} v_i^{2}}$$

The basis for judging an abnormal value is as follows: if $|v_i|>3\sigma$, the value $r_i$ is abnormal data; if $|v_i|\le 3\sigma$, then $r_i$ is normal data.
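As a small illustration of the 3σ criterion, the sketch below computes the mean, the residuals and the root mean square deviation and flags residuals larger than 3σ. It assumes the sample (n − 1) form of the deviation, which the extracted text does not state explicitly; the function name is hypothetical.

```python
import math

def three_sigma_outliers(r):
    """Indices i whose residual |r_i - mean| exceeds 3 standard deviations."""
    n = len(r)
    mean = sum(r) / n
    # sample standard deviation (n - 1 in the denominator is an assumption)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in r) / (n - 1))
    return [i for i, x in enumerate(r) if abs(x - mean) > 3 * sigma]
```

Because a large outlier also inflates σ, this test can miss abnormal points in short series, which is consistent with the limitation discussed in the text.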
However, for voltage, current and power data actually measured on site, the type of probability distribution is difficult to judge in advance, and the data generally do not follow a normal distribution. Judging abnormal values with the 3σ criterion therefore produces large errors, and the criterion cannot fully describe the probability distribution of the voltage, current and power.
The inputs and outputs of the abnormal value screening model based on the probability density method are shown in table 3.
TABLE 3 probability Density abnormal value detection model input/output
The abnormal value detection based on the probability density method is implemented as follows:
a) acquire the voltage, current and power data;
b) when screening abnormal values in the different time series (voltage, current, power), fit the probability distribution of the voltage, current and power data separately with a kernel density function to obtain the probability density function;
c) for any value d to be detected, integrate the probability density function over [d, +∞) to obtain the probability of that range, and compare it with the threshold, i.e. check whether it is lower than the probability 0.003 corresponding to 3σ; if so, the point is an abnormal point;
d) decide whether the data point is abnormal according to the comparison result.
Through the steps, the abnormal value detection result based on the probability density method can be obtained.
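Steps b) and c) can be illustrated with a Gaussian kernel density estimate, for which the integral over [d, +∞) has a closed form: it is the average of the upper tails of the individual kernels. The Gaussian kernel and the normal-reference bandwidth rule are assumptions here, since the source only says that a kernel density function is fitted; the function names are hypothetical.

```python
import math

def kde_tail_probability(data, d, bandwidth=None):
    """P(X >= d) under a Gaussian kernel density estimate of data."""
    n = len(data)
    if bandwidth is None:
        mean = sum(data) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
        bandwidth = 1.06 * std * n ** (-0.2)   # normal-reference rule (assumed)
    # upper tail of each kernel N(x, bandwidth^2), evaluated at d
    tails = (0.5 * math.erfc((d - x) / (bandwidth * math.sqrt(2))) for x in data)
    return sum(tails) / n

def is_abnormal(data, d, threshold=0.003):
    """Step c): flag d when its tail probability falls below the threshold."""
    return kde_tail_probability(data, d) < threshold
```

This sketch only tests the upper tail, as in step c); unusually low values would need the mirrored integral over (−∞, d].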
44) Deep learning algorithm model
The abnormal value screening model based on the deep learning algorithm uses a trained deep learning model to predict future current, voltage or power data, and compares the predicted value with the actual value. If the predicted value is far from the true value, the point is an abnormal value point. If the predicted value fluctuates around the true value, the error at that point is a normal random error, i.e. the point is normal. The long short-term memory network handles long sequences well and can store and control long-range information, which helps to provide accurate predicted values of current, voltage and power. Therefore, on the basis of the basic principle of the long short-term memory network (LSTM), the invention establishes an LSTM prediction model for each of the current, the voltage and the power, and thereby screens abnormal values in the current, voltage and power time series. The inputs and outputs of the deep learning abnormal value detection model are shown in table 4.
TABLE 4 deep learning method abnormal value detection model input and output
In the traditional artificial neural network model, neurons are fully connected between the input layer and the hidden layer and between the hidden layer and the output layer, while neurons within the same layer are not connected. However, processing each sample independently ignores the relationship between inputs at successive times, so such a model handles long time-series problems such as natural language processing and machine translation poorly. The recurrent neural network (RNN) is an important network structure in the field of deep learning; its typical characteristic is that neurons have internal feedback connections in addition to feedforward connections. However, the RNN is prone to vanishing and exploding gradients during training, so it cannot capture the influence of distant outputs on the current output, which limits its wide application and development.
The calculation process of each part of the LSTM structure is explained with reference to fig. 1, where the input and output variables have the following meanings: $x_t$ is the model input, i.e. the 96 historical points of current, voltage or power; in FIG. 1(e), $o_t$ is the output gate of the LSTM, whose output represents the current, voltage or power at the moment to be predicted; and E is the prediction error, i.e. the difference between the predicted value output by the model and the actual value, which is used to judge whether a point is abnormal. The remaining variables are intermediate variables and parameters of the model.
Training algorithm of the long short-term memory network: after the forward calculation is completed, the model parameters can be updated by the error back-propagation algorithm. The LSTM has 4 sets of parameters to learn, namely $W_f$ and $b_f$, $W_i$ and $b_i$, $W_o$ and $b_o$, and $W_c$ and $b_c$. For convenience of derivation, each of the weight matrices $W_f$, $W_i$, $W_o$, $W_c$ is written as two separate matrices: $W_{fh}$, $W_{fx}$, $W_{ih}$, $W_{ix}$, $W_{oh}$, $W_{ox}$, $W_{ch}$, $W_{cx}$. $h_{t-1}$ is the LSTM output at the previous time step, $x_t$ is the input at the current time step, and $b_f$, $b_i$, $b_o$, $b_c$ are the biases of the forget gate, the input gate, the output gate and the input unit at the current time step, respectively.
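Define the error term $\delta_t$ at time t as the derivative of the loss function with respect to the output value, i.e. $\delta_t=\partial E/\partial h_t$. Meanwhile, define the weighted input of each neuron and its error term as:

$$\begin{aligned}
net_{f,t}&=W_{fh}h_{t-1}+W_{fx}x_t+b_f, &\delta_{f,t}&=\frac{\partial E}{\partial net_{f,t}}\\
net_{i,t}&=W_{ih}h_{t-1}+W_{ix}x_t+b_i, &\delta_{i,t}&=\frac{\partial E}{\partial net_{i,t}}\\
net_{c',t}&=W_{ch}h_{t-1}+W_{cx}x_t+b_c, &\delta_{c',t}&=\frac{\partial E}{\partial net_{c',t}}\\
net_{o,t}&=W_{oh}h_{t-1}+W_{ox}x_t+b_o, &\delta_{o,t}&=\frac{\partial E}{\partial net_{o,t}}
\end{aligned}$$

When the error propagates backward in time, the error term $\delta_{t-1}$ at time t-1 is:

$$\delta_{t-1}^{T}=\frac{\partial E}{\partial h_{t-1}}=\frac{\partial E}{\partial h_t}\frac{\partial h_t}{\partial h_{t-1}}=\delta_t^{T}\frac{\partial h_t}{\partial h_{t-1}}$$

where $\partial h_t/\partial h_{t-1}$ is a Jacobian matrix.
The current cell state $c_t$ consists of two parts: the previous cell state $c_{t-1}$ multiplied element-wise by the forget gate $f_t$, and the current input unit state $c'_t$ multiplied element-wise by the input gate $i_t$, i.e. $c_t=f_t\circ c_{t-1}+i_t\circ c'_t$, with output $h_t=o_t\circ\tanh(c_t)$. Since $o_t$, $f_t$, $i_t$, $c'_t$ are all functions of $h_{t-1}$, the total derivative formula gives:

$$\delta_t^{T}\frac{\partial h_t}{\partial h_{t-1}}=\delta_{o,t}^{T}\frac{\partial net_{o,t}}{\partial h_{t-1}}+\delta_{f,t}^{T}\frac{\partial net_{f,t}}{\partial h_{t-1}}+\delta_{i,t}^{T}\frac{\partial net_{i,t}}{\partial h_{t-1}}+\delta_{c',t}^{T}\frac{\partial net_{c',t}}{\partial h_{t-1}}\tag{6}$$

Further, with sigmoid gates and a tanh input unit, there can be obtained:

$$\begin{aligned}
\frac{\partial h_t}{\partial o_t}&=\mathrm{diag}[\tanh(c_t)], &\frac{\partial h_t}{\partial c_t}&=\mathrm{diag}[o_t\circ(1-\tanh(c_t)^2)],\\
\frac{\partial c_t}{\partial f_t}&=\mathrm{diag}[c_{t-1}], &\frac{\partial c_t}{\partial i_t}&=\mathrm{diag}[c'_t], \qquad \frac{\partial c_t}{\partial c'_t}=\mathrm{diag}[i_t],\\
\frac{\partial o_t}{\partial net_{o,t}}&=\mathrm{diag}[o_t\circ(1-o_t)], &\frac{\partial net_{o,t}}{\partial h_{t-1}}&=W_{oh},\\
\frac{\partial f_t}{\partial net_{f,t}}&=\mathrm{diag}[f_t\circ(1-f_t)], &\frac{\partial net_{f,t}}{\partial h_{t-1}}&=W_{fh},\\
\frac{\partial i_t}{\partial net_{i,t}}&=\mathrm{diag}[i_t\circ(1-i_t)], &\frac{\partial net_{i,t}}{\partial h_{t-1}}&=W_{ih},\\
\frac{\partial c'_t}{\partial net_{c',t}}&=\mathrm{diag}[1-{c'_t}^{2}], &\frac{\partial net_{c',t}}{\partial h_{t-1}}&=W_{ch}
\end{aligned}\tag{7}$$

The symbol $\circ$ denotes element-wise multiplication.
Substituting formula (7) into (6) gives:

$$\delta_{t-1}^{T}=\delta_{o,t}^{T}W_{oh}+\delta_{f,t}^{T}W_{fh}+\delta_{i,t}^{T}W_{ih}+\delta_{c',t}^{T}W_{ch}$$

From the definitions of $\delta_{o,t}$, $\delta_{f,t}$, $\delta_{i,t}$, $\delta_{c',t}$, it follows that:

$$\begin{aligned}
\delta_{o,t}^{T}&=\delta_t^{T}\circ\tanh(c_t)\circ o_t\circ(1-o_t)\\
\delta_{f,t}^{T}&=\delta_t^{T}\circ o_t\circ(1-\tanh(c_t)^2)\circ c_{t-1}\circ f_t\circ(1-f_t)\\
\delta_{i,t}^{T}&=\delta_t^{T}\circ o_t\circ(1-\tanh(c_t)^2)\circ c'_t\circ i_t\circ(1-i_t)\\
\delta_{c',t}^{T}&=\delta_t^{T}\circ o_t\circ(1-\tanh(c_t)^2)\circ i_t\circ(1-{c'_t}^{2})
\end{aligned}$$

When the error is propagated backward from the current layer l to layer l-1, the error term of layer l-1 is defined as $\delta_t^{l-1}=\partial E/\partial net_t^{l-1}$, i.e. the derivative of the error function with respect to the weighted input of layer l-1. Since $x_t^{l}=f^{l-1}(net_t^{l-1})$, and $net_{f,t}^{l}$, $net_{i,t}^{l}$, $net_{c',t}^{l}$, $net_{o,t}^{l}$ are all functions of $x_t$, the total derivative formula gives:

$$\frac{\partial E}{\partial net_t^{l-1}}=\left(\delta_{f,t}^{T}W_{fx}+\delta_{i,t}^{T}W_{ix}+\delta_{c',t}^{T}W_{cx}+\delta_{o,t}^{T}W_{ox}\right)\circ f'(net_t^{l-1})$$

Thus, the gradients of $W_{oh}$, $W_{fh}$, $W_{ih}$, $W_{ch}$ are:

$$\frac{\partial E}{\partial W_{oh}}=\sum_{j=1}^{t}\delta_{o,j}h_{j-1}^{T},\quad \frac{\partial E}{\partial W_{fh}}=\sum_{j=1}^{t}\delta_{f,j}h_{j-1}^{T},\quad \frac{\partial E}{\partial W_{ih}}=\sum_{j=1}^{t}\delta_{i,j}h_{j-1}^{T},\quad \frac{\partial E}{\partial W_{ch}}=\sum_{j=1}^{t}\delta_{c',j}h_{j-1}^{T}$$

The gradients of $W_{ox}$, $W_{fx}$, $W_{ix}$, $W_{cx}$ are:

$$\frac{\partial E}{\partial W_{ox}}=\delta_{o,t}x_t^{T},\quad \frac{\partial E}{\partial W_{fx}}=\delta_{f,t}x_t^{T},\quad \frac{\partial E}{\partial W_{ix}}=\delta_{i,t}x_t^{T},\quad \frac{\partial E}{\partial W_{cx}}=\delta_{c',t}x_t^{T}$$

The gradients of $b_f$, $b_i$, $b_o$, $b_c$ are:

$$\frac{\partial E}{\partial b_f}=\sum_{j=1}^{t}\delta_{f,j},\quad \frac{\partial E}{\partial b_i}=\sum_{j=1}^{t}\delta_{i,j},\quad \frac{\partial E}{\partial b_o}=\sum_{j=1}^{t}\delta_{o,j},\quad \frac{\partial E}{\partial b_c}=\sum_{j=1}^{t}\delta_{c',j}$$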
based on the forward calculation and error back propagation algorithm of LSTM, a deep LSTM network framework as shown in fig. 2 can be constructed.
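For orientation, the forward calculation that the back-propagation derivation differentiates can be written as a single LSTM time step. This is a generic textbook cell in plain Python, assuming sigmoid gates and a tanh input unit; the parameter dictionary keys (`"Wfh"`, `"Wfx"`, `"bf"` and so on) simply mirror the symbols in the text and are not taken from any library.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W_h, W_x, b, h_prev, x):
    """Weighted input of one gate: W_h h_{t-1} + W_x x_t + b."""
    return [sum(w * h for w, h in zip(row_h, h_prev))
            + sum(w * v for w, v in zip(row_x, x)) + bias
            for row_h, row_x, bias in zip(W_h, W_x, b)]

def lstm_step(x, h_prev, c_prev, p):
    """One forward step; returns the new output h_t and cell state c_t."""
    f = [sigmoid(z) for z in affine(p["Wfh"], p["Wfx"], p["bf"], h_prev, x)]      # forget gate
    i = [sigmoid(z) for z in affine(p["Wih"], p["Wix"], p["bi"], h_prev, x)]      # input gate
    o = [sigmoid(z) for z in affine(p["Woh"], p["Wox"], p["bo"], h_prev, x)]      # output gate
    c_in = [math.tanh(z) for z in affine(p["Wch"], p["Wcx"], p["bc"], h_prev, x)] # input unit c'_t
    # c_t = f_t * c_{t-1} + i_t * c'_t (element-wise)
    c = [ft * cp + it * ci for ft, cp, it, ci in zip(f, c_prev, i, c_in)]
    # h_t = o_t * tanh(c_t) (element-wise)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c
```

Running this step over the 96 input points in order and feeding the last `h` into a small output layer would give the one-step-ahead prediction; in practice a deep learning framework would be used instead of hand-written loops.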
45) The points that at least two of the four models judge to be abnormal are taken as the final abnormal value detection result.
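The fusion rule can be sketched as a simple vote. The snippet also shows that "flagged by at least two models" is the same set as the union of all pairwise intersections, which is how the claims phrase it. The detector names and IDs in the example are hypothetical.

```python
from collections import Counter
from itertools import combinations

def vote(results, min_votes=2):
    """IDs flagged by at least min_votes of the detectors in results."""
    counts = Counter(i for ids in results.values() for i in set(ids))
    return sorted(i for i, n in counts.items() if n >= min_votes)

def union_of_pairwise_intersections(results):
    """Equivalent formulation: intersect every pair of models, then unite."""
    fused = set()
    for a, b in combinations(results.values(), 2):
        fused |= set(a) & set(b)
    return sorted(fused)

# Hypothetical detector outputs (sample IDs)
example = {
    "prototype": [3, 7],
    "density": [7, 9],
    "probability": [3, 7, 11],
    "lstm": [9],
}
fused_ids = vote(example)
```

Requiring two votes trades some sensitivity for fewer false alarms: a point seen by only one model, such as ID 11 in the example, is discarded.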
Examples
In this embodiment of the invention, random noise and interference of different magnitudes are added to the original current, voltage and active power data collected by normally operating metering devices to form abnormal points, and the four models are tested on this interference and noise to check their accuracy. In the test, the final abnormal value detection result is the set of points judged abnormal by any two or more of the four models. Random errors of different magnitudes are set to test whether the four models can effectively detect the abnormal value points, so as to verify the effectiveness of the method. The specific steps are as follows:
1) Test 1: the electric meter numbered 15661 is selected, with a time range from May 3 to May 31, 2017 and 2785 points in total; the A-phase voltage has an average value of 228.891 V, a maximum of 232.8 V and a minimum of 221.9 V. Normally distributed errors with mean 0 and standard deviation 6 (A-phase voltage) are randomly generated, and these disturbances are placed at random positions in the original voltage time series. Table 5 shows the random error magnitudes and the points at which they were added.
TABLE 5 Voltage points with artificially added noise (small noise perturbations)
a) Abnormal value screening test based on the prototype clustering method: the model parameters are set as follows: the number of clusters is 4, the threshold of the outlier criterion is tentatively set to 2.75, and the maximum number of clustering iterations is 500. The distance function is the Euclidean distance between two points x and y with m clustering attributes:

$$d(x,y)=\sqrt{\sum_{k=1}^{m}(x_k-y_k)^{2}}$$

The actual voltage value of a point and its voltage change value relative to the previous point are taken as clustering attributes, so that both the magnitude of the voltage and its rate of change are considered. The clustering result is shown in fig. 3; 3 points are correctly detected, with IDs 64, 372 and 2192.
b) Abnormal value screening test based on the density clustering method: the model parameters are set as follows: the maximum distance is 0.5, the sample points are normalized to the range (0, 4), the minimum number of samples in a class is 5, and the distance function is the Euclidean distance. The test results are shown in fig. 4(a) and 4(b); the abnormal IDs are 372, 663, 995, 997 and 2192.
c) Abnormal value screening test based on the probability density method: fig. 5 is the probability density curve of the voltage change value. As can be seen from fig. 5, the voltage change values are concentrated near 0 and basically follow a normal distribution, and the probability that the voltage change takes a given value can be obtained from the probability density function. Following the idea of the 3σ criterion, voltage change values whose probability of occurrence is below one thousandth are regarded as abnormal, which gives end-point values of -1.7516 and 1.7075. That is, the normal voltage variation range is [-1.7516, 1.7075], and changes outside this range are abnormal voltage changes.
Table 6 shows the abnormal value detection results obtained by the probability density method.
TABLE 6 probability Density method outlier search results (Small noise disturbance)
d) Abnormal value screening test based on the deep learning algorithm: the LSTM deep learning algorithm is used to predict the voltage one point ahead. The model parameters are set as follows: a four-layer recurrent neural network comprising an input layer (96 × 1 sequence input), an LSTM layer (8 nodes), an ordinary hidden layer (4 nodes) and an output layer (1 node). Input and output: the value at the next time step is predicted from the most recent 96 historical points (sampling interval 15 min). The optimizer is RMSProp (an adaptive-learning-rate variant of stochastic gradient descent), the number of training epochs is 400, the batch size is 512 (there are more than 2000 training samples in total, so one epoch is divided into approximately 4-5 batches), and 5% of the training samples are used as the validation set. The objective function is the mean square error (MSE) between the model output and the true value.
The prediction uses the preceding 96 points, so ID + 96 is the actual time index. FIG. 6 shows the prediction result.
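The sliding-window construction implied here (96 past points in, the next point out) can be sketched as follows. The function name `make_windows` is hypothetical; the offset of 96 between the sample index and the time index follows the text.

```python
def make_windows(series, window=96):
    """Build (inputs, targets): each input is `window` consecutive points
    and its target is the point that immediately follows them."""
    n_samples = len(series) - window
    inputs = [series[k:k + window] for k in range(n_samples)]
    targets = [series[k + window] for k in range(n_samples)]
    return inputs, targets

# Sample k predicts the point with time index k + 96
inputs, targets = make_windows(list(range(2785)), window=96)
```

With the 2785 points of Test 1 this yields 2785 − 96 = 2689 samples, which is consistent with the "more than 2000 training samples" mentioned in the parameter settings.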
FIG. 7 shows the LSTM model error curve, i.e. the predicted value minus the true value; the abnormal IDs are 372, 866, 998, 2192 and 2193.
The significant abnormal points found by the four methods are summarized in table 7.
TABLE 7 abnormal point search results (small noise disturbance) of four detection methods
Conclusion: with the two-model cross-check (a point is kept only if at least two models flag it), 3 of the 5 injected outliers were detected.
The analysis of the missed points is shown in table 8.
TABLE 8 missed points of anomaly detection (small noise disturbance)
Serial number | Adding an ID | U original value (V) | Random error (V) | U abnormal point value (V) |
3 | 663 | 229.3 | -1.4623 | 227.8377 |
4 | 1163 | 230.6 | 1.294 | 231.894 |
From the data, the relative error of abnormal point No. 3 is 0.63% and that of abnormal point No. 4 is 0.56%, both of which are small. The average value of the A-phase voltage is 228.891 V, the maximum 232.8 V and the minimum 221.9 V, and the average voltage change (the current voltage value minus the previous voltage value) is 0.4083 V, with a maximum of +6.8 V and a minimum of -6.7 V. Therefore, the voltage values after adding these two random errors deviate only slightly from the original values and are difficult to detect.
2) Test 2: the standard deviation is changed to 8 (A-phase voltage). The random errors and the points at which they were added are shown in table 9.
TABLE 9 Voltage points with artificially added noise (large noise disturbance)
Serial number | Adding an ID | U original value (V) | Random error (V) | U abnormal point value (V) |
1 | 64 | 227.9 | -0.9325 | 226.9675 |
2 | 372 | 228.2 | -5.0017 | 223.1983 |
3 | 663 | 229.3 | 9.4597 | 238.7597 |
4 | 1163 | 230.6 | -1.5734 | 229.0266 |
5 | 2192 | 228.6 | -4.1232 | 224.4768 |
a) Using the prototype clustering outlier detection method: the actual voltage value and the voltage change value are used as clustering attributes, so that both the magnitude of the voltage and its rate of change are considered. The result is shown in fig. 8; 3 points are correctly detected, with abnormal IDs 372, 663 and 2192.
b) Using the density clustering outlier detection method: the results are shown in fig. 9(a) and (b), and the abnormal IDs are 663, 994, 995 and 997.
c) Using the probability density method:
The probability of a given voltage change value is obtained from the probability density function. Values whose probability of occurrence is one thousandth or less are regarded as abnormal, so the normal voltage variation range is [-1.9255, 1.7397], and values outside this range are abnormal. Table 10 shows the abnormal values screened by the probability density method.
TABLE 10 outlier search results of probability density method (large noise disturbance)
Fault ID | Active power value (kW) | Voltage value (V) | Current value (A) |
120 | 0.0512 | 231.2 | 0.061 |
372 | 0.0953 | 228.3 | 0.187 |
373 | 0.1035 | 223.1983 | 0.115 |
663 | 0.1092 | 229.4 | 0.183 |
664 | 0.0805 | 238.7597 | 0.126 |
828 | 0.0419 | 226.5 | 0.07 |
866 | 0.072 | 232.6 | 0.109 |
977 | 0.0407 | 228 | 0.065 |
998 | 0.0422 | 222.2 | 0.067 |
1163 | 0.0419 | 230.8 | 0.085 |
1302 | 0.0824 | 228.1 | 0.12 |
1314 | 0.0574 | 227.2 | 0.073 |
1315 | 0.0639 | 225.4 | 0.132 |
1780 | 0.0709 | 228.6 | 0.14 |
2160 | 0.0512 | 228.8 | 0.096 |
2192 | 0.1183 | 228.4 | 0.196 |
d) Using the LSTM algorithm: the prediction uses the preceding 96 points, so ID + 96 is the actual time index. Fig. 10 shows the true voltage value and the LSTM prediction result, and fig. 11 shows the LSTM model error curve, i.e. the predicted value minus the true value; the abnormal IDs are 372, 663, 866, 2192 and 2768.
A summary of the significant abnormal points found by the four methods is shown in Table 11.
TABLE 11 abnormal point search results (large noise disturbance) of four detection methods
Test conclusion: with the two-model cross-check, 3 of the 5 injected abnormal points were detected.
The analysis of the missed points is shown in table 12.
TABLE 12 missed points of anomaly detection (large noise disturbance)
Serial number | Adding an ID | U original value (V) | U random error (V) | U abnormal point value (V) |
1 | 64 | 227.9 | -0.9325 | 226.9675 |
4 | 1163 | 230.6 | -1.5734 | 229.0266 |
From the data, the relative error of abnormal point No. 1 is 0.41% and that of abnormal point No. 4 is 0.68%, both of which are small. The average value of the A-phase voltage is 228.891 V, the maximum 232.8 V and the minimum 221.9 V, and the average voltage change (the current voltage value minus the previous voltage value) is 0.4083 V, with a maximum of +6.8 V and a minimum of -6.7 V. Therefore, the voltage values after adding these two random errors deviate only slightly from the original values and are difficult to detect.
3) Current and power test results of real electric meter
Abnormal value detection is carried out on the voltage, current and power data to obtain the abnormal current and power points of the tested meters, which include meters with known faults and meters in which no fault has been found.
(1) Abnormally large current rate of change
The current curve of the electric meter numbered 49932 is analyzed; its current curve and current change curve on May 7 are shown in fig. 12. As can be seen from the figure, the current change value is large at the 37th point, which is determined to be an abnormal value point.
(2) Abnormal power rate of change
Fig. 13 shows the power and the power change value of the electric meter numbered 29047 on May 3. As can be seen from the figure, the power change value is large at points 62, 63 and 64, which are determined to be abnormal value points.
Fig. 14 shows the power, the current and their change values for the electric meter numbered 45000 on May 22. As can be seen from fig. 14(a), the power change value is large at the 46th point, which is an obvious spike, so its value is abnormal. Since the current change value in fig. 14(b) is small at this point and does not reach the change-value threshold, the current change at point 46 is within the normal range and is not judged abnormal. In summary, point 46 is judged to be a power abnormality.
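The rate-of-change check used in cases (1) and (2) amounts to thresholding the first difference of a curve. The sketch below uses hypothetical values and thresholds, since the source does not state the numeric threshold; it reproduces the situation of fig. 14 in spirit, where the power curve spikes while the current change stays below its threshold.

```python
def change_outliers(values, threshold):
    """Indices whose change from the previous sample exceeds threshold."""
    changes = [b - a for a, b in zip(values, values[1:])]
    return [i + 1 for i, d in enumerate(changes) if abs(d) > threshold]

# Hypothetical curves: a power spike with an almost flat current
power = [0.05, 0.05, 0.25, 0.05, 0.05]
current = [0.10, 0.10, 0.12, 0.10, 0.10]
power_flags = change_outliers(power, threshold=0.1)
current_flags = change_outliers(current, threshold=0.1)
```

Note that an isolated spike is flagged twice, once on the rise and once on the fall, so adjacent flagged indices usually describe a single event.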
(3) Abnormal current-power correlation
Fig. 15 shows the A-phase power curve and A-phase current curve of the electric meter numbered 29047 on May 2. It can be seen that the correlation between current and power is clearly abnormal at the four points 22, 34, 55 and 60, in one of two ways:
1) when the current rises noticeably or stays at a high value, the power drops abnormally or stays at a low level;
2) when the current falls noticeably or stays at a low value, the power rises abnormally or stays at a high level.
Similarly, fig. 16 shows the A-phase current and A-phase power curves of the electric meter numbered 64258 on May 6. Points 34 and 35 show the same abnormal correlation.
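One simple way to look for the correlation anomalies described in case (3) is to flag time steps where current and power both change noticeably but in opposite directions. This is only a sketch of that idea: the function name and both thresholds are hypothetical, and it covers the "rises while the other falls" situations, not the cases where one quantity merely stays at a high or low level.

```python
def correlation_anomalies(current, power, min_d_current, min_d_power):
    """Indices where current and power move noticeably in opposite directions."""
    flagged = []
    for t in range(1, len(current)):
        d_i = current[t] - current[t - 1]
        d_p = power[t] - power[t - 1]
        big_enough = abs(d_i) >= min_d_current and abs(d_p) >= min_d_power
        if big_enough and d_i * d_p < 0:   # opposite signs
            flagged.append(t)
    return flagged

# Hypothetical A-phase curves: current steps up while power drops, and back
a_current = [1.0, 1.0, 2.0, 2.0, 1.0]
a_power = [0.2, 0.2, 0.1, 0.1, 0.2]
pairs = correlation_anomalies(a_current, a_power, min_d_current=0.5, min_d_power=0.05)
```

The minimum-change thresholds keep small measurement jitter from being reported as an anomaly.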
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.
Claims (9)
1. A method for screening abnormal distribution transformer acquisition data based on multi-criterion fusion, characterized by comprising the following steps:
acquiring original distribution transformer acquisition data;
preprocessing the original distribution transformer acquisition data;
randomly adding noise points to the preprocessed original distribution transformer acquisition data to form an acquisition data sequence containing abnormal points;
screening the abnormal points of the acquisition data sequence containing abnormal points with four models respectively, namely a prototype clustering method, a density clustering method, a probability density method and a deep learning method;
determining the abnormal distribution transformer acquisition data, wherein the abnormal distribution transformer acquisition data are obtained by taking the intersection of the abnormal-point screening results of any two models and then taking the union of the intersections determined for every pair of models.
2. The method for screening abnormal distribution transformer acquisition data based on multi-criterion fusion according to claim 1, characterized in that acquiring the original distribution transformer acquisition data comprises:
collecting original three-phase current, three-phase voltage and active power data from metering devices in normal operation, with an acquisition time interval of 15 min.
3. The method for screening abnormal distribution transformer acquisition data based on multi-criterion fusion according to claim 2, characterized in that preprocessing the original distribution transformer acquisition data comprises:
handling missing values in, and removing obvious abnormal values from, the original distribution transformer acquisition data;
wherein handling missing values comprises: filling individual missing data points in a continuous-time data set by linear interpolation, and directly removing large blocks of missing data in a continuous-time data set;
and removing obvious abnormal values means removing the data recorded as -9999 from the original three-phase current, three-phase voltage and active power data.
4. The method for screening abnormal distribution transformer acquisition data based on multi-criterion fusion according to claim 1, characterized in that when large-scale data loss occurs in the original distribution transformer acquisition data, the curve before the data loss is selected for calculation.
5. The method for screening abnormal distribution transformer acquisition data based on multi-criterion fusion according to claim 1, characterized in that the noise points obey a normal distribution.
6. The method for discriminating the abnormal data of the distribution transformer based on the multi-criterion fusion as claimed in claim 1, wherein selecting the abnormal points of the data acquisition sequences containing the abnormal points by using the prototype clustering method comprises the following steps:
determining the clustering attributes, comprising: selecting an actual voltage value of a point to be detected, a voltage change value of the point to be detected and a previous voltage change value of the point to be detected as clustering attributes of the voltage time sequence, selecting an actual current value of the point to be detected, a current change value of the point to be detected and a previous current change value of the point to be detected as clustering attributes of the current time sequence, and selecting an actual active power value of the point to be detected, an active power change value of;
according to the clustering attributes, clustering each time sequence to be detected into 4 classes by adopting a k-means algorithm, and determining the centroid of each class;
calculating, according to the centroid of each class, the distance from each point to be detected to the nearest clustering center and the relative distance from each point to be detected to the nearest clustering center;
comparing the relative distance from each point to be detected to the nearest clustering center with a given threshold value; if the relative distance from a certain point to be detected to the nearest clustering center is greater than the given threshold value, the point to be detected is an outlier, namely an abnormal point.
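The steps of claim 6 can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the claimed implementation: the claim does not define the "relative distance", so the sketch takes it to be a point's distance to its nearest centroid divided by the mean such distance within the same class; the threshold value, the seeded random initialisation and all function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def build_attributes(series: Sequence[float]) -> List[Point]:
    """(actual value, change at the point, previous change) from index 2 onward."""
    return [
        (series[t], series[t] - series[t - 1], series[t - 1] - series[t - 2])
        for t in range(2, len(series))
    ]


def assign(points: Sequence[Point], centers: Sequence[Point]) -> List[int]:
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans(points: Sequence[Point], k: int = 4, iters: int = 100, seed: int = 0):
    """Plain Lloyd iterations; initial centers are k randomly sampled points."""
    rng = random.Random(seed)
    centers = rng.sample(list(points), k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a class happens to be empty.
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centers[c]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def kmeans_outliers(points: Sequence[Point], k: int = 4,
                    threshold: float = 3.0, seed: int = 0) -> List[int]:
    """Indices whose relative distance to the nearest centroid exceeds threshold."""
    centers, labels = kmeans(points, k, seed=seed)
    dists = [math.dist(p, centers[lab]) for p, lab in zip(points, labels)]
    flagged: List[int] = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        mean_d = sum(dists[i] for i in idx) / len(idx)
        if mean_d == 0:
            continue  # all members coincide with the centroid
        flagged += [i for i in idx if dists[i] / mean_d > threshold]
    return sorted(flagged)
```

A typical call would be `kmeans_outliers(build_attributes(voltage_series))`. With this definition of relative distance, an isolated point that ends up as the only member of its class is not flagged, so the choice of normalisation deserves care in practice.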
7. The method for discriminating the abnormal data of the distribution transformer based on the multi-criterion fusion as claimed in claim 1, wherein selecting the abnormal points of the data acquisition sequences containing the abnormal points by using a density clustering method comprises:
respectively drawing a voltage-current plane distribution diagram, a current-active power plane distribution diagram and a voltage-active power plane distribution diagram;
clustering the points on each plane distribution diagram, including: if the distance between two points in the plane distribution diagram does not exceed the set maximum distance, classifying the two points into one class; the points on the voltage-current plane distribution diagram are formed from a certain phase current time sequence and a certain phase voltage time sequence; the points on the current-active power plane distribution diagram are formed from a certain phase current time sequence and the active power time sequence; the points on the voltage-active power plane distribution diagram are formed from a certain phase voltage time sequence and the active power time sequence;
iterating in a loop to find the points which do not belong to any class, namely the abnormal points.
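A minimal Python sketch of the density criterion in claim 7 follows. It links any two points whose distance does not exceed `max_dist` and grows classes iteratively. The claim does not say when a point "does not belong to any class", so the `min_cluster_size` parameter (a class smaller than this counts as no class) is an assumption, as are the function name and the use of Euclidean distance on the plane.

```python
import math
from typing import List, Sequence, Tuple


def density_outliers(points: Sequence[Tuple[float, float]],
                     max_dist: float,
                     min_cluster_size: int = 2) -> List[int]:
    """Indices of points left outside every sufficiently large class."""
    n = len(points)
    label = [-1] * n          # -1 means "not yet visited"
    sizes: List[int] = []     # sizes[c] = number of points in class c
    for start in range(n):
        if label[start] != -1:
            continue
        cluster_id = len(sizes)
        label[start] = cluster_id
        stack, size = [start], 0
        # Grow the class by repeatedly adding every unvisited point that
        # lies within max_dist of a point already in the class.
        while stack:
            i = stack.pop()
            size += 1
            for j in range(n):
                if label[j] == -1 and math.dist(points[i], points[j]) <= max_dist:
                    label[j] = cluster_id
                    stack.append(j)
        sizes.append(size)
    return [i for i in range(n) if sizes[label[i]] < min_cluster_size]
```

For a voltage-current plane the call might look like `density_outliers(list(zip(voltage, current)), max_dist=0.5)`. Because voltage, current and active power have very different magnitudes, scaling each axis before measuring distances is probably needed for a single `max_dist` to be meaningful.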
8. The method for discriminating the abnormal data of the distribution transformer based on the multi-criterion fusion as claimed in claim 1, wherein selecting the abnormal points of the data acquisition sequences containing the abnormal points by using a probability density method comprises:
determining a model input and a model output; the model inputs are: for the current time sequence, the change value between the current at the point to be determined and the current at the previous point; for the voltage time sequence, the change value between the voltage at the point to be determined and the voltage at the previous point; for the active power time sequence, the change value between the active power at the point to be determined and the active power at the previous point; the model output is: the normal range of the change values;
respectively fitting the probability distributions of the voltage, current and active power data by adopting a kernel density function, and obtaining the probability density functions;
for any value d to be detected, integrating the probability density function to obtain the probability of a value occurring in the range [d, +∞), and comparing this probability with a threshold value, namely checking whether it is lower than the probability 0.003 corresponding to 3σ; if so, the value to be detected is an abnormal point.
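The probability density criterion of claim 8 can be sketched in Python as below. The claim names a kernel density function without fixing the kernel or bandwidth, so the Gaussian kernel and the normal-reference rule-of-thumb bandwidth used here are assumptions, and the function names are hypothetical. With a Gaussian kernel the integral over [d, +∞) has a closed form, so no numerical integration is needed.

```python
import math
import statistics
from typing import Optional, Sequence


def rule_of_thumb_bandwidth(samples: Sequence[float]) -> float:
    """Normal-reference bandwidth 1.06 * sigma * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-1 / 5)


def upper_tail_probability(samples: Sequence[float], d: float,
                           bandwidth: float) -> float:
    """P(X >= d) under a Gaussian kernel density estimate of the samples.

    Each kernel contributes 0.5 * erfc((d - x_i) / (h * sqrt(2))), which is
    the upper tail of a normal distribution centred on x_i with std h.
    """
    scale = bandwidth * math.sqrt(2)
    return sum(0.5 * math.erfc((d - x) / scale) for x in samples) / len(samples)


def is_abnormal(samples: Sequence[float], d: float,
                threshold: float = 0.003,
                bandwidth: Optional[float] = None) -> bool:
    """True if the tail probability of the change value d is below threshold."""
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(samples)
    return upper_tail_probability(samples, d, h) < threshold
```

Here `samples` would hold the historical change values of one quantity, and `d` the change value at the point under test. The test is one-sided, as in the claim; large negative changes would need the symmetric check on the lower tail.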
9. The method for discriminating the abnormal data of the distribution transformer based on the multi-criterion fusion as claimed in claim 1, wherein selecting the abnormal points of the data acquisition sequences with the added abnormal points by using a deep learning method comprises:
predicting future current, voltage or power data by adopting a deep learning model based on a long short-term memory network trained on the current, voltage and power time sequences, and comparing the predicted values with the true values; if the deviation of the predicted value from the true value exceeds a set threshold value, the corresponding point is an abnormal point;
the deep learning model of the long short-term memory network is as follows: after the forward calculation is completed, the model parameters are updated and adjusted by adopting an error back propagation algorithm, comprising the following steps:
the weighted inputs $net_{f,t}$, $net_{i,t}$, $net_{c',t}$, $net_{o,t}$ of the neurons of the long short-term memory network at time $t$ are:
wherein $W_{ox}$, $W_{fx}$, $W_{ix}$, $W_{cx}$, $W_{oh}$, $W_{fh}$, $W_{ih}$, $W_{ch}$ represent weights, $h_{t-1}$ is the LSTM output at the previous time instant, $x_t$ is the input at the current time, and $b_f$, $b_i$, $b_o$, $b_c$ are the biases of the forget gate structure, the input gate structure, the output gate structure and the input unit at the current time, respectively;
the error terms $\delta_{f,t}$, $\delta_{i,t}$, $\delta_{c',t}$, $\delta_{o,t}$ of the neurons of the long short-term memory network at time $t$ are:
wherein $E$ is the prediction error;
when the error propagates backward in time, the error term $\delta_{t-1}$ at time $t-1$ is:
wherein the partial-derivative term is a Jacobian matrix;
when the error is passed backward from the current layer $l$ to layer $l-1$, the error of layer $l-1$ is:
finally, the weights $W_{oh}$, $W_{fh}$, $W_{ih}$, $W_{ch}$ are obtained as:
wherein $W_{oh,t}$, $W_{fh,t}$, $W_{ih,t}$, $W_{ch,t}$ respectively represent the weights at time $t$, and the superscript $T$ denotes transposition;
the weights $W_{ox}$, $W_{fx}$, $W_{ix}$, $W_{cx}$ are:
$b_f$, $b_i$, $b_o$, $b_c$ are:
wherein $b_{o,t}$, $b_{f,t}$, $b_{i,t}$, $b_{c,t}$ respectively represent the error terms at time $t$.
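The formulas referred to in claim 9 are not reproduced in this text. As a hedged reading aid, the block below writes out the textbook back-propagation-through-time relations for an LSTM using the symbols defined in the claim. It is an assumed standard formulation, not a transcription of the patent's own equations; the shorthand index $g \in \{f, i, c', o\}$ and the sums over time steps are introduced here for compactness.

```latex
% Weighted inputs of the four gate/unit neurons at time t
\begin{aligned}
net_{f,t}  &= W_{fh}\,h_{t-1} + W_{fx}\,x_t + b_f \\
net_{i,t}  &= W_{ih}\,h_{t-1} + W_{ix}\,x_t + b_i \\
net_{c',t} &= W_{ch}\,h_{t-1} + W_{cx}\,x_t + b_c \\
net_{o,t}  &= W_{oh}\,h_{t-1} + W_{ox}\,x_t + b_o
\end{aligned}

% Error terms: derivative of the prediction error E w.r.t. each weighted input
\delta_{g,t} = \frac{\partial E}{\partial net_{g,t}}, \qquad g \in \{f,\, i,\, c',\, o\}

% Backward propagation in time (each W_{gh} is the Jacobian of net_{g,t} w.r.t. h_{t-1})
\delta_{t-1}^{T} = \delta_{o,t}^{T} W_{oh} + \delta_{f,t}^{T} W_{fh}
                 + \delta_{i,t}^{T} W_{ih} + \delta_{c',t}^{T} W_{ch}

% Gradients accumulated over the time steps of the training window
\frac{\partial E}{\partial W_{gh}} = \sum_{t} \delta_{g,t}\, h_{t-1}^{T}, \qquad
\frac{\partial E}{\partial W_{gx}} = \sum_{t} \delta_{g,t}\, x_{t}^{T}, \qquad
\frac{\partial E}{\partial b_{g}}  = \sum_{t} \delta_{g,t}
```

Under this reading, the per-time-step quantities named in the claim, such as $W_{oh,t}$ and $b_{o,t}$, would correspond to the individual summands $\delta_{o,t} h_{t-1}^{T}$ and $\delta_{o,t}$.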
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910740107.2A CN110458230A (en) | 2019-08-12 | 2019-08-12 | A kind of distribution transforming based on the fusion of more criterions is with adopting data exception discriminating method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110458230A (en) | 2019-11-15 |
Family
ID=68485974
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910740107.2A Pending CN110458230A (en) | 2019-08-12 | 2019-08-12 | A kind of distribution transforming based on the fusion of more criterions is with adopting data exception discriminating method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110458230A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101561878A (en) * | 2009-05-31 | 2009-10-21 | 河海大学 | Unsupervised anomaly detection method and system based on improved CURE clustering algorithm |
CN103901880A (en) * | 2014-04-01 | 2014-07-02 | 浙江大学 | Industrial process fault detection method based on multiple classifiers and D-S evidence fusion |
CN108536741A (en) * | 2018-03-09 | 2018-09-14 | 国网江苏省电力公司无锡供电公司 | A kind of energy method for monitoring abnormality |
Non-Patent Citations (1)
Title |
---|
罗慧等: "基于长短期记忆网络的智能用电数据甄别方法", 《广东电力》 * |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807014A (en) * | 2019-09-24 | 2020-02-18 | 国网北京市电力公司 | Cross validation based station data anomaly discrimination method and device |
CN111241656A (en) * | 2019-12-28 | 2020-06-05 | 国网江西省电力有限公司电力科学研究院 | Distribution transformer outlet voltage abnormal point detection algorithm |
CN111241656B (en) * | 2019-12-28 | 2023-03-14 | 国网江西省电力有限公司电力科学研究院 | Distribution transformer outlet voltage abnormal point detection algorithm |
CN112084229A (en) * | 2020-07-27 | 2020-12-15 | 北京市燃气集团有限责任公司 | Method and device for identifying abnormal gas consumption behaviors of town gas users |
CN111984925A (en) * | 2020-07-29 | 2020-11-24 | 江苏方天电力技术有限公司 | Circuit abnormity positioning method based on loop impedance, storage medium and computing equipment |
CN111984925B (en) * | 2020-07-29 | 2024-03-12 | 江苏方天电力技术有限公司 | Circuit abnormality positioning method based on loop impedance, storage medium and computing device |
CN111967509A (en) * | 2020-07-31 | 2020-11-20 | 北京赛博星通科技有限公司 | Method and device for processing and detecting data acquired by industrial equipment |
CN111967509B (en) * | 2020-07-31 | 2024-10-18 | 北京赛博星通科技有限公司 | Method and device for processing and detecting data acquired by industrial equipment |
CN112464848A (en) * | 2020-12-07 | 2021-03-09 | 国网四川省电力公司电力科学研究院 | Information flow abnormal data monitoring method and device based on density space clustering |
CN112464848B (en) * | 2020-12-07 | 2023-04-07 | 国网四川省电力公司电力科学研究院 | Information flow abnormal data monitoring method and device based on density space clustering |
CN112669879A (en) * | 2020-12-24 | 2021-04-16 | 山东大学 | Air conditioner indoor unit noise anomaly detection method based on time-frequency domain deep learning algorithm |
CN112669879B (en) * | 2020-12-24 | 2022-06-03 | 山东大学 | Air conditioner indoor unit noise anomaly detection method based on time-frequency domain deep learning algorithm |
CN112819088A (en) * | 2021-02-20 | 2021-05-18 | 苏州安极能新能源发展有限公司 | Anomaly detection algorithm based on power data |
CN114997253A (en) * | 2021-02-23 | 2022-09-02 | 哈尔滨工业大学 | Intelligent state anomaly detection method, monitoring system and monitoring method for satellite constellation |
CN112819373A (en) * | 2021-02-25 | 2021-05-18 | 云南电网有限责任公司电力科学研究院 | Distribution network voltage abnormal data detection method and device |
CN112818052A (en) * | 2021-02-25 | 2021-05-18 | 云南电网有限责任公司电力科学研究院 | Abnormal voltage data detection method and device |
CN113076945A (en) * | 2021-03-17 | 2021-07-06 | 华夏芯(北京)通用处理器技术有限公司 | Camera direct-reading meter reading instrument abnormal point removing method based on improved RANSAC |
CN113076945B (en) * | 2021-03-17 | 2024-05-14 | 华夏芯(北京)通用处理器技术有限公司 | Method for eliminating abnormal points of camera direct-reading meter reading instrument based on improved RANSAC |
CN113434496A (en) * | 2021-07-15 | 2021-09-24 | 广东电网有限责任公司 | Distribution transformer weight overload real-time monitoring system method, system and computer medium |
CN113554117A (en) * | 2021-08-16 | 2021-10-26 | 中国南方电网有限责任公司 | Abnormal load data identification method and electronic equipment |
CN113762507B (en) * | 2021-08-24 | 2023-12-29 | 浙江中辰城市应急服务管理有限公司 | Semi-supervised deep learning arc voltage anomaly detection method based on phase space reconstruction |
CN113762507A (en) * | 2021-08-24 | 2021-12-07 | 浙江中辰城市应急服务管理有限公司 | Semi-supervised deep learning arc voltage anomaly detection method based on phase space reconstruction |
WO2023246070A1 (en) * | 2022-06-22 | 2023-12-28 | 中国移动通信集团广东有限公司 | Abnormal power consumption station detection method and apparatus, electronic device, and readable storage medium |
CN116150239B (en) * | 2022-12-16 | 2023-09-22 | 彭州华润燃气有限公司 | Data mining method for gas stealing behavior |
CN116150239A (en) * | 2022-12-16 | 2023-05-23 | 彭州华润燃气有限公司 | Data mining method for gas stealing behavior |
CN118247044A (en) * | 2024-04-15 | 2024-06-25 | 武汉卓尔数字传媒科技有限公司 | Supply chain financial credit granting method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110458230A (en) | A kind of distribution transforming based on the fusion of more criterions is with adopting data exception discriminating method | |
CN110097297B (en) | Multi-dimensional electricity stealing situation intelligent sensing method, system, equipment and medium | |
CN110223196B (en) | Anti-electricity-stealing analysis method based on typical industry feature library and anti-electricity-stealing sample library | |
Tong et al. | Detection and classification of transmission line transient faults based on graph convolutional neural network | |
CN113298297B (en) | Wind power output power prediction method based on isolated forest and WGAN network | |
CN110879377B (en) | Metering device fault tracing method based on deep belief network | |
Zhao et al. | Hierarchical anomaly detection and multimodal classification in large-scale photovoltaic systems | |
Zhu et al. | Networked time series shapelet learning for power system transient stability assessment | |
Shi et al. | Expected output calculation based on inverse distance weighting and its application in anomaly detection of distributed photovoltaic power stations | |
Jia et al. | Defect prediction of relay protection systems based on LSSVM-BNDT | |
CN114004137A (en) | Multi-source meteorological data fusion and pretreatment method | |
CN112101420A (en) | Abnormal electricity user identification method for Stacking integration algorithm under dissimilar model | |
CN110807014B (en) | Cross validation based station data anomaly discrimination method and device | |
Kumar et al. | Deep Learning based Fault Detection in Power Transmission Lines | |
Zhou et al. | Real-time anomaly detection in distribution grids using long short term memory network | |
CN106646106B (en) | Electric network fault detection method based on outlier's detection technology | |
Wu et al. | Research on a location method for complex voltage sag sources based on random matrix theory | |
CN116893293A (en) | Transient voltage stability evaluation method based on spatial attention correction neural network | |
CN117764547A (en) | Photovoltaic string fault diagnosis method and system | |
CN115598459A (en) | Power failure prediction method for 10kV feeder line fault of power distribution network | |
Bashkari et al. | Distribution power system outage diagnosis based on root cause analysis | |
Amoateng et al. | An Intelligent Event Detection Framework To Improve Situational Awareness In PMU Power Distribution Networks | |
CN117371623B (en) | Electric energy meter running state early warning method and system | |
Shi et al. | A Novel Approach to Detect Electricity Theft Based on Conv-Attentional Transformer | |
Dhingra et al. | A machine learning based fault identification framework for smart grid automation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20191115 |