CN111507412A - Voltage missing value filling method based on historical data auxiliary scene analysis - Google Patents
Voltage missing value filling method based on historical data auxiliary scene analysis
- Publication number
- CN111507412A (application number CN202010311551.5A)
- Authority
- CN
- China
- Prior art keywords
- attribute
- data
- date
- missing
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/219—Managing data history or versioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention discloses a voltage missing value filling method based on historical data auxiliary scene analysis, comprising the following steps: S1, acquiring historical data of the power grid; S2, calculating the fluctuation cross-correlation coefficient between each known attribute's data and the missing attribute's data using a fluctuation cross-correlation analysis algorithm; S3, retaining the attribute data with strong fluctuation cross-correlation; S4, calculating combination weights; S5, performing scene analysis on the missing date and searching the historical data of the power grid for similar scenes; S6, within the similar scenes, measuring the similarity of the remaining attributes' data over the missing time period using the dynamic time warping (DTW) distance; S7, calculating a comprehensive similarity using the combination weights; and S8, finding the date with the highest comprehensive similarity and filling the missing attribute data by combining that date's data at the same time with the horizontal data. The method makes full use of the historical data of voltage-related attributes to fill voltage missing values and improves the accuracy of the filled voltage values.
Description
Technical Field
The invention relates to a voltage missing value filling method based on historical data auxiliary scene analysis, and belongs to the field of voltage identification technology for power systems.
Background
With the continuous development of power grids, their scale grows year by year. In the field of regulation and control, the accuracy and integrity of data are particularly important for grid control. However, as the volume of collected data grows exponentially, voltage data are occasionally lost because of manual input errors and faults in the acquisition devices, so the lost data must be identified or filled in. Traditional methods such as the expectation-maximization (EM) algorithm and the K-nearest neighbors (KNN) algorithm offer solutions, but because they use little historical data as the basis for analysis, their filling results are not ideal. In recent years, big data has become a research focus worldwide, and big data technology has brought new momentum to the development of the smart grid with good results. A voltage missing value filling method based on historical data auxiliary scene analysis is therefore proposed to further improve the filling accuracy of voltage missing values and meet the development needs of the power grid.
Disclosure of Invention
Purpose of the invention: to overcome the shortcomings of the prior art, the invention provides a voltage missing value filling method based on historical data auxiliary scene analysis, which improves the accuracy of the filled data and meets the development needs of the power grid.
Technical scheme: to achieve the above purpose, the invention adopts the following technical scheme:
A voltage missing value filling method based on historical data auxiliary scene analysis comprises the following steps:
S1, acquiring historical data of the power grid, and proceeding to step S2;
S2, calculating the fluctuation cross-correlation coefficient between each known attribute's data and the missing attribute's data at the same time using a fluctuation cross-correlation analysis algorithm, and proceeding to step S3;
S3, if the fluctuation cross-correlation coefficient between a known attribute's data and the missing attribute's data exceeds the comparison threshold, retaining that known attribute's data and proceeding to step S4; otherwise, discarding that known attribute's data;
S4, calling the attributes corresponding to the M retained known attribute data the Know attributes and the attribute corresponding to the missing attribute data the Unknow attribute, and calculating the combination weight of each Know attribute with respect to the Unknow attribute;
S5, performing scene analysis on the date containing the Unknow attribute, and searching the historical data of the power grid for the H dates with the most similar scenes; the date containing the Unknow attribute is called the missing date, and the H dates found are called the H similar dates;
S6, determining the time period of the missing attribute data within the missing date and, for the same time period on each similar date, measuring the similarity between each Know attribute's data on the missing date and the same Know attribute's data on that similar date using the dynamic time warping distance;
S7, calculating the comprehensive similarity of the Unknow attribute for each similar date using the combination weight of each Know attribute with respect to the Unknow attribute;
S8, finding the date with the highest comprehensive similarity of the Unknow attribute, and filling the missing attribute data by combining that date's data at the same time with the horizontal data.
Starting from the historical data of the power grid, the invention makes full use of the correlation among the attribute data in the grid. It selects the attribute data with stronger correlation as the reference for filling the missing attribute data, and calculates combination weights to further quantify the degree of correlation between attributes, so that strongly correlated attribute data are used more heavily. It also measures, through the dynamic time warping distance, how similar each attribute's data around the missing moment are to the historical data, and, together with the combination weights, finds the historical moment most similar to the missing moment and uses its data in place of the missing data. The method thus exploits the correlation between the missing attribute data and the other attribute data to solve the filling problem, and improves the accuracy of filling the missing attribute data.
Specifically, in step S1, the historical data of the power grid come from voltage data detection, bus balance detection, constraint preprocessing, proportion anomaly detection, initial power flow accuracy detection, and the like. The historical data must be preprocessed: suspected erroneous data are picked out, and it is determined whether the subsequent optimization calculation can be performed.
Specifically, in step S2, the fluctuation cross-correlation coefficient is calculated as follows:
S21, take two equal-length time series x_i and y_i, where i = 1, 2, …, N;
S22, calculate the sums of the differences of x_i and y_i from their means:
$$\Delta x(l) = \sum_{i=1}^{l} (x_i - \bar{x}), \qquad \Delta y(l) = \sum_{i=1}^{l} (y_i - \bar{y})$$
where l denotes the sampling length, Δx(l) and Δy(l) denote the sums of the differences of x_i and y_i from their means at sampling length l, and $\bar{x}$ and $\bar{y}$ denote the mean values of x_i and y_i, respectively;
S23, calculate the forward differences that characterize the autocorrelation of x_i and y_i:
$$\Delta x(l, l_0) = x(l_0 + l) - x(l_0), \quad l_0 = 1, 2, \ldots, N - l$$
$$\Delta y(l, l_0) = y(l_0 + l) - y(l_0), \quad l_0 = 1, 2, \ldots, N - l$$
where, for each sampling length l = 1, 2, …, N-1, l_0 takes N-l values and therefore yields N-l differences; Δx(l, l_0) and Δy(l, l_0) denote the forward differences of x_i and y_i, respectively;
S24, calculate the covariance Cov_xy(l) of x_i and y_i: [formula not reproduced in this text]
S25, calculate the fluctuation cross-correlation coefficient of x_i and y_i: if x_i and y_i are correlated, Cov_xy(l) follows a power law:
$$\mathrm{Cov}_{xy}(l) \sim l^{h_{xy}}$$
where h_xy denotes the degree of correlation between x_i and y_i, that is, the fluctuation cross-correlation coefficient, which is obtained by fitting the power law. When h_xy = 0, x_i and y_i are uncorrelated; when h_xy > 0, x_i and y_i are positively correlated; when h_xy < 0, x_i and y_i are negatively correlated; the larger the value of h_xy, the higher the degree of correlation between x_i and y_i.
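Steps S21 to S25 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the covariance formula of step S24 is not reproduced in the text, so the sketch assumes the covariance of the two forward-difference series over l_0, and it assumes that h_xy is the slope of a least-squares fit of log sqrt(|Cov_xy(l)|) against log l. The function names and the `max_lag` parameter are hypothetical.

```python
import math
import random
from statistics import fmean

def profile(series):
    """S22: running sum of the deviations of a series from its mean."""
    mean = fmean(series)
    total, out = 0.0, []
    for value in series:
        total += value - mean
        out.append(total)
    return out

def fluctuation_cov(px, py, lag):
    """S23 + S24 (assumed form): covariance of the forward differences at one lag."""
    dx = [px[i + lag] - px[i] for i in range(len(px) - lag)]
    dy = [py[i + lag] - py[i] for i in range(len(py) - lag)]
    mx, my = fmean(dx), fmean(dy)
    return fmean((a - mx) * (b - my) for a, b in zip(dx, dy))

def fcca_exponent(x, y, max_lag=20):
    """S25 (assumed fit): slope of log sqrt(|Cov|) versus log lag."""
    px, py = profile(x), profile(y)
    pts = []
    for lag in range(1, max_lag + 1):
        cov = fluctuation_cov(px, py, lag)
        if cov != 0.0:  # log is undefined at zero, so skip such lags
            pts.append((math.log(lag), 0.5 * math.log(abs(cov))))
    mu = fmean(p[0] for p in pts)
    mv = fmean(p[1] for p in pts)
    num = sum((u - mu) * (v - mv) for u, v in pts)
    den = sum((u - mu) ** 2 for u, _ in pts)
    return num / den

# Synthetic check: white noise compared with itself.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(3000)]
h_self = fcca_exponent(noise, noise)
```

Because the absolute value is taken before the logarithm, this sketch does not distinguish positive from negative correlation; a fuller version would track the sign of the covariance separately.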
Because there are many attributes, a comparison threshold is set for the fluctuation cross-correlation coefficient so that weakly correlated attribute data do not distort the filling result. If the fluctuation cross-correlation coefficient between a known attribute's data and the missing attribute's data is below the comparison threshold, that known attribute's data are considered to have little or no reference value and are discarded. After the threshold comparison, M attribute data remain; the corresponding attributes are called the M Know attributes and are numbered from 1 to M. Because each attribute's data correlate with the missing attribute's data to a different degree, their reference value and usefulness also differ, so a combination weight with respect to the missing attribute must be set to ensure that the historical data are used fully and reasonably.
Specifically, the larger the fluctuation cross-correlation coefficient, the stronger the correlation between the known attribute data and the missing attribute data, and the higher the reference value of the known attribute data when the missing attribute data are filled, so its weight should be higher. In step S4, the combination weight w_j of Know attribute j with respect to the Unknow attribute is calculated by the following formula: [formula not reproduced in this text]
where M denotes the number of Know attributes (that is, the number of attributes corresponding to the retained known attribute data), j = 1, 2, …, M, and c_j denotes the fluctuation cross-correlation coefficient between Know attribute j and the Unknow attribute (that is, between the known attribute data corresponding to Know attribute j and the missing attribute data corresponding to the Unknow attribute).
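The threshold screening of step S3 and the weighting of step S4 can be illustrated as follows. Since the weight formula itself is not reproduced in the text, the sketch assumes the simplest form consistent with the symbols that are defined, namely w_j = c_j / (c_1 + … + c_M); the actual formula may differ. The attribute names, coefficient values, and threshold are made up for illustration.

```python
def screen_attributes(coeffs, threshold):
    """S3: keep only attributes whose coefficient exceeds the comparison threshold."""
    return {name: c for name, c in coeffs.items() if c > threshold}

def combination_weights(kept):
    """S4 (assumed form): normalize the retained coefficients so the weights sum to 1."""
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("retained coefficients must have a positive sum")
    return {name: c / total for name, c in kept.items()}

# Hypothetical coefficients c_j against the missing voltage attribute.
coeffs = {"active_load": 0.8, "reactive_load": 0.6, "current": 0.6, "frequency": 0.1}
kept = screen_attributes(coeffs, threshold=0.5)   # M = 3 Know attributes remain
weights = combination_weights(kept)
```

With this normalization, a Know attribute with a larger coefficient always receives a larger weight, which matches the stated intent of step S4.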
Specifically, in step S5, the scene analysis of the date containing the Unknow attribute comprises the following steps:
S51, classify the historical data of the power grid into scenes according to the daily load condition; input the date containing the Unknow attribute and analyze its daily load condition. Because the historical data are huge in volume and low in value density, traversing all of them would be inefficient and of little benefit; the daily load condition is therefore analyzed, that is, each date is judged and classified as a working day, a general rest day, or a special holiday;
S52, judge whether the date is a working day: if so, the scene of the date is determined to be a working day, and the process proceeds to step S54; otherwise, proceed to step S53;
S53, judge whether the date is a special holiday: if so, the scene of the date is determined to be a special holiday, and the process proceeds to step S54; otherwise, the scene of the date is determined to be a general rest day, and the process proceeds to step S54;
S54, search the historical data of the power grid for the H dates with the most similar scene, that is, H working days, H special holidays, or H general rest days.
Note on special holidays: holidays designated by the state, such as New Year's Day, the Spring Festival, the Qingming Festival, Labor Day, the Dragon Boat Festival, the Mid-Autumn Festival, and National Day, are special holidays.
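The decision flow of steps S52 to S54 can be sketched as below. The text does not say how the H "most similar" dates are ranked within a scene, so this sketch assumes, purely for illustration, that the H dates of the same scene closest in time to the missing date are chosen. The holiday and make-up workday sets are hypothetical inputs that a real system would take from an official calendar.

```python
from datetime import date

def classify_scene(day, special_holidays, makeup_workdays=frozenset()):
    """S52/S53: working day, special holiday, or general rest day."""
    is_weekday = day.weekday() < 5  # Monday is 0, Friday is 4
    if day in makeup_workdays or (is_weekday and day not in special_holidays):
        return "working_day"
    if day in special_holidays:
        return "special_holiday"
    return "general_rest_day"

def similar_dates(missing_day, history, h, special_holidays, makeup_workdays=frozenset()):
    """S54 (assumed ranking): the H same-scene dates closest in time to the missing date."""
    scene = classify_scene(missing_day, special_holidays, makeup_workdays)
    same = [d for d in history
            if d != missing_day
            and classify_scene(d, special_holidays, makeup_workdays) == scene]
    same.sort(key=lambda d: abs((d - missing_day).days))
    return same[:h]

# Hypothetical calendar: 1 Jan 2020 is a holiday, Sunday 19 Jan 2020 is a make-up workday.
special = {date(2020, 1, 1)}
makeup = {date(2020, 1, 19)}
history = [date(2020, 1, d) for d in range(2, 11)]
nearest = similar_dates(date(2020, 1, 13), history, 3, special, makeup)
```

Restricting the search to one scene first is what keeps the later similarity computation from traversing the whole history.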
Specifically, in step S6, measuring the similarity between each Know attribute's data on the missing date and the same Know attribute's data on each similar date using the dynamic time warping distance comprises the following steps:
S61, because the dynamic time warping distance measures the similarity of two time series, whereas the data are missing at a single moment, a window is built around that moment: let the time at which the attribute data are missing be t_n; select the n time points after t_n (t_{n+1}, t_{n+2}, …, t_{2n}) and the n time points before t_n (t_{n-1}, t_{n-2}, …, t_0), so that the time period (t_0, t_{2n}) of the missing attribute data within the missing date contains the 2n+1 time points t_0, t_1, t_2, …, t_{2n}. Denote the M Know attributes retained after threshold screening as A_1, A_2, …, A_M, and the Unknow attribute as A_0;
S62, for the Know attributes A_1, A_2, …, A_M, denote the attribute data at times t_0, t_1, t_2, …, t_{2n} on the h-th similar date as D_(1,h), D_(2,h), …, D_(M,h), where d_(j,h,g) denotes the attribute data of Know attribute j at time t_g on the h-th similar date, with j = 1, 2, …, M, h = 1, 2, …, H, and g = 0, 1, 2, …, 2n;
S63, use the dynamic time warping distance to measure the similarity between the attribute data D_(j,h) of Know attribute A_j at times t_0, t_1, t_2, …, t_{2n} on the h-th similar date and the attribute data D_(j,p) at the same times on the missing date, where p denotes the missing date.
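For readers unfamiliar with the dynamic time warping distance used in step S63, the following is a minimal sketch of the textbook dynamic-programming formulation with absolute difference as the local cost, together with the 2n+1-point window of step S61. The text does not specify the local cost or any warping constraint, so both are assumptions here, and the helper names are hypothetical.

```python
def window(series, index, n):
    """S61: the 2n+1 samples t_0..t_2n centred on the missing sample at `index`."""
    if index - n < 0 or index + n >= len(series):
        raise ValueError("window does not fit inside the series")
    return series[index - n:index + n + 1]

def dtw_distance(a, b):
    """Classic DTW distance with |a_i - b_j| as the local cost (assumed choice)."""
    inf = float("inf")
    rows, cols = len(a), len(b)
    # cost[i][j] is the best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[rows][cols]
```

A smaller distance means the two windows are more alike, so a Know attribute's window on a similar date is a better match to the missing date when its DTW distance is lower.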
Specifically, in step S7, the comprehensive similarity of the Unknow attribute for each similar date is calculated by the following formula: [formula not reproduced in this text]
where C_h denotes the comprehensive similarity of the Unknow attribute on the h-th similar date.
Specifically, the historical data of an attribute at the same time point on every day are taken as the vertical (longitudinal) historical data section of that attribute, and the horizontal (transverse) historical data are obtained by dividing the data at the same time according to attribute. The filling strategy makes full use of the vertical historical data; since the missing attribute data are related not only to the vertical historical data but also to the horizontal historical data, a filling value obtained by combining the two is closer to the true value. In step S8, after the date with the highest comprehensive similarity of the Unknow attribute is found, the data T_1 of the Unknow attribute at time t_n on that date are extracted as the vertical filling data; meanwhile, linear curve fitting is applied to the Unknow attribute to obtain the data T_2 at time t_n as the horizontal filling data. The final filling value of the missing attribute data is:
T = α × T_1 + β × T_2
α + β = 1
where t_n is the time at which the attribute data are missing, α is the weight of T_1, and β is the weight of T_2.
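The blend T = α × T_1 + β × T_2 can be illustrated with a short sketch. It assumes that the "linear fitting" for T_2 is an ordinary least-squares line through the Unknow attribute's known samples around t_n, evaluated at t_n; the text does not give the fitting details, so that choice, the helper names, and the sample values are all hypothetical.

```python
from statistics import fmean

def linear_fit_value(times, values, t):
    """Least-squares line through (times, values), evaluated at t (assumed T_2 estimator)."""
    mt, mv = fmean(times), fmean(values)
    slope = (sum((a - mt) * (b - mv) for a, b in zip(times, values))
             / sum((a - mt) ** 2 for a in times))
    return mv + slope * (t - mt)

def fill_value(t1, t2, alpha):
    """T = alpha * T_1 + beta * T_2 with beta = 1 - alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * t1 + (1.0 - alpha) * t2

# Hypothetical 10 kV bus voltages (kV) around a gap at sample index 2.
t2_estimate = linear_fit_value([0, 1, 3, 4], [10.0, 10.2, 10.6, 10.8], 2)
filled = fill_value(t1=10.0, t2=t2_estimate, alpha=0.25)
```

Here `t1` stands for the value taken from the most similar historical date, and `alpha` controls how much the historical value is trusted relative to the same-day fit.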
Beneficial effects: the voltage missing value filling method based on historical data auxiliary scene analysis provided by the invention makes full use of the historical data of the power grid and improves the accuracy of filling missing attribute data. The invention establishes the relationship between attributes through a fluctuation cross-correlation analysis algorithm, quantifies the degree of correlation by introducing combination weights, measures the similarity between the data around the missing moment and the historical data through the dynamic time warping distance, and finally selects the data at the most similar moment to replace the missing data, completing the filling.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic flow diagram of the fluctuation cross-correlation algorithm;
FIG. 3 is a schematic flow chart of a scene analysis process;
FIG. 4 is a schematic flow chart of similarity calculation;
FIG. 5 is a schematic flow chart of the integrated similarity calculation;
FIG. 6 is a schematic diagram of a missing value padding process;
FIG. 7 is a comparison graph of filling accuracy for different algorithms;
FIG. 8 is a comparison graph of the filling-up results and the true values of the algorithm proposed by the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings.
As shown in FIGS. 1 to 6, a voltage missing value filling method based on historical data auxiliary scene analysis comprises the following steps:
S1, acquire historical data of the power grid, and proceed to step S2.
S2, calculate the fluctuation cross-correlation coefficient between each known attribute's data and the missing attribute's data at the same time using a fluctuation cross-correlation analysis algorithm, and proceed to step S3.
As shown in FIG. 2, the fluctuation cross-correlation coefficient is calculated as follows:
S21, take two equal-length time series x_i and y_i, where i = 1, 2, …, N;
S22, calculate the sums of the differences of x_i and y_i from their means:
$$\Delta x(l) = \sum_{i=1}^{l} (x_i - \bar{x}), \qquad \Delta y(l) = \sum_{i=1}^{l} (y_i - \bar{y})$$
where l denotes the sampling length, Δx(l) and Δy(l) denote the sums of the differences of x_i and y_i from their means at sampling length l, and $\bar{x}$ and $\bar{y}$ denote the mean values of x_i and y_i, respectively;
S23, calculate the forward differences that characterize the autocorrelation of x_i and y_i:
$$\Delta x(l, l_0) = x(l_0 + l) - x(l_0), \quad l_0 = 1, 2, \ldots, N - l$$
$$\Delta y(l, l_0) = y(l_0 + l) - y(l_0), \quad l_0 = 1, 2, \ldots, N - l$$
where, for each sampling length l = 1, 2, …, N-1, l_0 takes N-l values and therefore yields N-l differences; Δx(l, l_0) and Δy(l, l_0) denote the forward differences of x_i and y_i, respectively;
S24, calculate the covariance Cov_xy(l) of x_i and y_i: [formula not reproduced in this text]
S25, calculate the fluctuation cross-correlation coefficient of x_i and y_i: if x_i and y_i are correlated, Cov_xy(l) follows a power law:
$$\mathrm{Cov}_{xy}(l) \sim l^{h_{xy}}$$
where h_xy denotes the degree of correlation between x_i and y_i, that is, the fluctuation cross-correlation coefficient, which is obtained by fitting the power law. When h_xy = 0, x_i and y_i are uncorrelated; when h_xy > 0, x_i and y_i are positively correlated; when h_xy < 0, x_i and y_i are negatively correlated; the larger the value of h_xy, the higher the degree of correlation between x_i and y_i.
S3, if the fluctuation cross-correlation coefficient between a known attribute's data and the missing attribute's data exceeds the comparison threshold, retain that known attribute's data and proceed to step S4; otherwise, discard that known attribute's data.
Because there are many attributes, a comparison threshold is set for the fluctuation cross-correlation coefficient so that weakly correlated attribute data do not distort the filling result. If the fluctuation cross-correlation coefficient between a known attribute's data and the missing attribute's data is below the comparison threshold, that known attribute's data are considered to have little or no reference value and are discarded. After the threshold comparison, M attribute data remain; the corresponding attributes are called the M Know attributes and are numbered from 1 to M.
S4, call the attributes corresponding to the M retained known attribute data the Know attributes and the attribute corresponding to the missing attribute data the Unknow attribute, and calculate the combination weight of each Know attribute with respect to the Unknow attribute.
The combination weight w_j of Know attribute j with respect to the Unknow attribute is calculated by the following formula: [formula not reproduced in this text]
where M denotes the number of Know attributes (that is, the number of attributes corresponding to the retained known attribute data), j = 1, 2, …, M, and c_j denotes the fluctuation cross-correlation coefficient between Know attribute j and the Unknow attribute (that is, between the known attribute data corresponding to Know attribute j and the missing attribute data corresponding to the Unknow attribute).
S5, perform scene analysis on the date containing the Unknow attribute, and search the historical data of the power grid for the H dates with the most similar scenes; the date containing the Unknow attribute is called the missing date, and the H dates found are called the H similar dates.
As shown in FIG. 3, the scene analysis comprises the following steps:
S51, classify the historical data of the power grid into scenes according to the daily load condition; input the date containing the Unknow attribute and analyze its daily load condition. Because the historical data are huge in volume and low in value density, traversing all of them would be inefficient and of little benefit; the daily load condition is therefore analyzed, that is, each date is judged and classified as a working day, a general rest day, or a special holiday;
S52, judge whether the date is a working day: if so, the scene of the date is determined to be a working day, and the process proceeds to step S54; otherwise, proceed to step S53;
S53, judge whether the date is a special holiday: if so, the scene of the date is determined to be a special holiday, and the process proceeds to step S54; otherwise, the scene of the date is determined to be a general rest day, and the process proceeds to step S54;
S54, search the historical data of the power grid for the H dates with the most similar scene, that is, H working days, H special holidays, or H general rest days.
Note: Monday to Friday, together with rest days that are worked because of holiday adjustments, are working days; ordinary weekend days are general rest days; holidays designated by the state, such as New Year's Day, the Spring Festival, the Qingming Festival, Labor Day, the Dragon Boat Festival, the Mid-Autumn Festival, and National Day, are special holidays.
S6, determine the time period of the missing attribute data within the missing date and, for the same time period on each similar date, measure the similarity between each Know attribute's data on the missing date and the same Know attribute's data on that similar date using the dynamic time warping distance.
As shown in FIG. 4, the similarity calculation comprises the following steps:
S61, because the dynamic time warping distance measures the similarity of two time series, whereas the data are missing at a single moment, a window is built around that moment: let the time at which the attribute data are missing be t_n; select the n time points after t_n (t_{n+1}, t_{n+2}, …, t_{2n}) and the n time points before t_n (t_{n-1}, t_{n-2}, …, t_0), so that the time period (t_0, t_{2n}) of the missing attribute data within the missing date contains the 2n+1 time points t_0, t_1, t_2, …, t_{2n}. Denote the M Know attributes retained after threshold screening as A_1, A_2, …, A_M, and the Unknow attribute as A_0;
S62, for the Know attributes A_1, A_2, …, A_M, denote the attribute data at times t_0, t_1, t_2, …, t_{2n} on the h-th similar date as D_(1,h), D_(2,h), …, D_(M,h), where d_(j,h,g) denotes the attribute data of Know attribute j at time t_g on the h-th similar date, with j = 1, 2, …, M, h = 1, 2, …, H, and g = 0, 1, 2, …, 2n;
S63, use the dynamic time warping distance to measure the similarity between the attribute data D_(j,h) of Know attribute A_j at times t_0, t_1, t_2, …, t_{2n} on the h-th similar date and the attribute data D_(j,p) at the same times on the missing date, where p denotes the missing date.
S7, calculate the comprehensive similarity of the Unknow attribute for each similar date using the combination weight of each Know attribute with respect to the Unknow attribute.
As shown in FIG. 5, the comprehensive similarity of the Unknow attribute for each similar date is calculated by the following formula: [formula not reproduced in this text]
where C_h denotes the comprehensive similarity of the Unknow attribute on the h-th similar date.
S8, find the date with the highest comprehensive similarity of the Unknow attribute, and fill the missing attribute data by combining that date's data at the same time with the horizontal data.
As shown in FIG. 6, filling the missing data using horizontal and vertical data comprises the following steps:
S81, input the power grid data;
S82, divide the power grid data by data type, with the historical vertical data forming a historical vertical database and the historical horizontal data forming a historical horizontal database;
S83, for the historical vertical database, after the date with the highest comprehensive similarity of the Unknow attribute is found, extract the data T_1 of the Unknow attribute at time t_n on that date as the vertical filling data, and select an appropriate weight α;
S84, for the historical horizontal database, apply linear curve fitting to the Unknow attribute to obtain the data T_2 at time t_n as the horizontal filling data, and select an appropriate weight β;
S85, calculate the final filling value of the missing attribute data as:
T = α × T_1 + β × T_2
α + β = 1
where t_n is the time at which the attribute data are missing, α is the weight of T_1, and β is the weight of T_2.
The method was applied to fill missing voltage values in the power grid of a certain region. About one and a half years of historical data from a real power grid were selected as the historical data set, with a sampling period of 5 minutes. The filling target is the missing voltage values of a 10 kV bus. The fluctuation cross-correlation coefficient was calculated for the attributes related to the missing voltage data, and the related attributes finally retained were: {reactive load, active load, current value}. To demonstrate the advantages of the proposed algorithm, the traditional expectation-maximization (EM) algorithm and the K-nearest neighbors (KNN) algorithm were selected for comparison.
To fully test the effectiveness of the proposed method, a random deletion strategy was used to delete 1%, 5%, 10%, 15%, 20%, 25%, and 30% of the data in the data set. The filling results at the different voltage missing rates were evaluated by filling accuracy, defined as:
$$\text{accuracy} = \frac{n_r}{n}$$
where n_r is the number of correctly estimated values and n is the number of missing voltage values. To ensure the reliability of the experimental results, the calculation was carried out 5 times at each voltage missing rate, and the average of the 5 runs was taken as the final result. The experimental results are shown in FIG. 7: the filling accuracy of the proposed method is clearly better than that of the conventional algorithms. To further demonstrate the effect of the method, the case of a 15% missing rate is analyzed as an example. FIG. 8 compares 27 consecutive groups of voltage data under this missing condition. The results in the figure show that the curve produced by the proposed method fits the true-value curve well and that the filled values are close to the true values, so the filling effect is good.
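The accuracy measure and the 5-run averaging can be sketched as follows. The text does not state when a filled value counts as "correctly estimated", so the sketch assumes a relative tolerance on the true value; the tolerance, the function names, and the sample numbers are hypothetical.

```python
from statistics import fmean

def filling_accuracy(filled, truth, rel_tol=0.01):
    """n_r / n, counting a fill as correct when it is within rel_tol of the true value (assumed rule)."""
    if len(filled) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for f, t in zip(filled, truth) if abs(f - t) <= rel_tol * abs(t))
    return correct / len(truth)

def averaged_accuracy(runs, rel_tol=0.01):
    """Mean accuracy over repeated runs, each given as a (filled, truth) pair."""
    return fmean(filling_accuracy(f, t, rel_tol) for f, t in runs)

# Hypothetical filled and true 10 kV bus voltages (kV).
acc = filling_accuracy([10.0, 10.5, 9.0, 10.2], [10.0, 10.45, 10.0, 10.2])
```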
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.
Claims (7)
1. A voltage missing value filling method based on historical data auxiliary scene analysis, characterized in that the method comprises the following steps:
S1, acquiring historical data of the power grid, and entering step S2;
S2, calculating, through a fluctuation cross-correlation analysis algorithm, the fluctuation cross-correlation coefficient between each known attribute data and the missing attribute data at the same time, and entering step S3;
S3, if the fluctuation cross-correlation coefficient between a certain known attribute data and the missing attribute data exceeds the comparison threshold, retaining the known attribute data and entering step S4; otherwise, discarding the known attribute data;
S4, calling the attributes corresponding to the M retained known attribute data the Know attributes, calling the attribute corresponding to the missing attribute data the Unknow attribute, and calculating the combination weight of each Know attribute and the Unknow attribute;
S5, carrying out scene analysis on the date containing the Unknow attribute, and searching for the H dates with the most similar scenes in the historical data of the power grid; the date containing the Unknow attribute is called the missing date, and the H dates found are called the H similar dates;
S6, determining the time period of the missing attribute data in the missing date, and, for the same time period of each similar date, measuring the similarity between each Know attribute data of the missing date and the corresponding Know attribute data of each similar date through the dynamic time warping distance;
S7, calculating the comprehensive similarity of the Unknow attribute on each similar date by combining the combination weights of the Know attributes and the Unknow attribute;
S8, finding the date with the highest comprehensive similarity of the Unknow attribute, and filling the missing attribute data by combining the data at the same time on that date with the transverse data.
2. The voltage missing value filling method based on historical data auxiliary scene analysis according to claim 1, wherein: in step S2, the fluctuation cross-correlation coefficient is calculated as follows:
S21, for two equal-length time series $x_i$ and $y_i$, wherein $i = 1, 2, \dots, N$;
S22, calculating the sums of the differences of $x_i$ and $y_i$ from their means:
$$\Delta x(l) = \sum_{i=1}^{l} (x_i - \bar{x}), \qquad \Delta y(l) = \sum_{i=1}^{l} (y_i - \bar{y})$$
wherein: $l$ represents the sampling length, $\Delta x(l)$ and $\Delta y(l)$ represent the sums of the differences of $x_i$ and $y_i$ from their mean values at sampling length $l$, and $\bar{x}$ and $\bar{y}$ respectively represent the average values of $x_i$ and $y_i$;
S23, calculating the forward differences of the autocorrelation of $x_i$ and $y_i$:
$$\Delta x(l, l_0) = x(l_0 + l) - x(l_0), \quad l_0 = 1, 2, \dots, N - l$$
$$\Delta y(l, l_0) = y(l_0 + l) - y(l_0), \quad l_0 = 1, 2, \dots, N - l$$
wherein: $l = 1, 2, \dots, N - 1$; for each sampling length $l$ there are $N - l$ differences, one for each $l_0$; $\Delta x(l, l_0)$ and $\Delta y(l, l_0)$ respectively represent the forward differences of the autocorrelation of $x_i$ and $y_i$;
S24, calculating the covariance $\mathrm{Cov}_{xy}(l)$ of $x_i$ and $y_i$;
S25, calculating the fluctuation cross-correlation coefficient of $x_i$ and $y_i$: if $x_i$ and $y_i$ are correlated, $\mathrm{Cov}_{xy}(l)$ satisfies the power-law distribution
$$\mathrm{Cov}_{xy}(l) \sim l^{h_{xy}}$$
wherein: $h_{xy}$ denotes the degree of correlation between $x_i$ and $y_i$, i.e. the fluctuation cross-correlation coefficient, and is obtained by fitting the power-law distribution; when $h_{xy} = 0$, $x_i$ and $y_i$ are not correlated; when $h_{xy} > 0$, $x_i$ and $y_i$ are positively correlated; when $h_{xy} < 0$, $x_i$ and $y_i$ are negatively correlated; the larger the value of $h_{xy}$, the higher the degree of correlation between $x_i$ and $y_i$.
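Steps S21 to S25 can be illustrated with the rough sketch below. It rests on several assumptions that the claim text does not settle: the covariance of step S24 is taken to be the plain sample covariance of the forward differences over $l_0$, the forward differences are taken on the cumulative deviation series of step S22, and the power-law exponent is estimated as the least-squares slope of $\log|\mathrm{Cov}_{xy}(l)|$ against $\log l$. All function names are hypothetical, and because the magnitude of the covariance is fitted, the sketch does not attempt to reproduce the sign convention stated for $h_{xy}$.

```python
import math


def profile(series):
    """Cumulative sum of deviations from the mean (step S22); entry k-1 holds the sum up to k."""
    mean = sum(series) / len(series)
    out, total = [], 0.0
    for value in series:
        total += value - mean
        out.append(total)
    return out


def covariance_at_scale(px, py, l):
    """ASSUMED form of Cov_xy(l): sample covariance of the N - l forward differences."""
    dx = [px[i + l] - px[i] for i in range(len(px) - l)]
    dy = [py[i + l] - py[i] for i in range(len(py) - l)]
    mx, my = sum(dx) / len(dx), sum(dy) / len(dy)
    return sum((a - mx) * (b - my) for a, b in zip(dx, dy)) / len(dx)


def fluctuation_exponent(x, y, scales):
    """Least-squares slope of log|Cov_xy(l)| versus log l (power-law fit of step S25)."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    px, py = profile(x), profile(y)
    points = []
    for l in scales:
        cov = covariance_at_scale(px, py, l)
        if cov != 0.0:  # log(0) is undefined, so skip degenerate scales
            points.append((math.log(l), math.log(abs(cov))))
    if len(points) < 2:
        raise ValueError("need at least two usable scales for the fit")
    mean_u = sum(u for u, _ in points) / len(points)
    mean_v = sum(v for _, v in points) / len(points)
    sxx = sum((u - mean_u) ** 2 for u, _ in points)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in points)
    return sxy / sxx
```

In use, each candidate attribute series (for example active load) would be paired with the voltage series over the same timestamps, and the resulting exponent compared against the threshold of step S3.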
3. The voltage missing value filling method based on historical data auxiliary scene analysis according to claim 1, wherein: in step S4, the combination weight $w_j$ of Know attribute $j$ and the Unknow attribute is calculated by the following formula:
wherein: $M$ represents the number of Know attributes, $j = 1, 2, \dots, M$, and $c_j$ represents the fluctuation cross-correlation coefficient between Know attribute $j$ and the Unknow attribute.
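The weighting formula itself is not reproduced in the text. One plausible reading, given that $w_j$ is built from the $M$ coefficients $c_j$, is a simple normalization so that the weights sum to 1. The sketch below shows that assumed form only; the function name is hypothetical.

```python
def combination_weights(coefficients):
    """ASSUMED form: w_j = c_j / sum(c_1..c_M), so the weights sum to 1."""
    total = sum(coefficients)
    if total == 0:
        raise ValueError("coefficients must not sum to zero")
    return [c / total for c in coefficients]


# Example with three retained Know attributes (illustrative coefficients only).
weights = combination_weights([0.9, 0.6, 0.5])
```

Under this reading, an attribute that tracks the voltage more closely contributes more to the comprehensive similarity computed later.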
4. The voltage missing value filling method based on historical data auxiliary scene analysis according to claim 1, wherein: in step S5, the scene analysis of the date containing the Unknow attribute comprises the following steps:
S51, classifying the historical data of the power grid into scenes according to the daily load condition; inputting the date containing the Unknow attribute and analyzing its daily load condition;
S52, judging whether the date is a working day: if yes, the scene of the date is determined as a working day, and step S54 is entered; otherwise, step S53 is entered;
S53, judging whether the date is a special holiday: if yes, the scene of the date is determined as a special holiday, and step S54 is entered; otherwise, the scene of the date is determined as a general holiday, and step S54 is entered;
S54, searching for the H dates with the most similar scene in the historical data of the power grid, namely searching for H working days, special holidays or general holidays.
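The decision flow of steps S52 to S54 can be sketched as below. The rules used here are assumptions for illustration: Monday to Friday counts as a working day unless the date is listed as a special holiday, any other non-working day is a general holiday, and "most similar" is approximated by closeness in time among dates of the same scene. The claim itself ties the classification to the daily load condition, which this sketch does not model.

```python
from datetime import date, timedelta

WORKING_DAY = "working day"
SPECIAL_HOLIDAY = "special holiday"
GENERAL_HOLIDAY = "general holiday"


def classify_scene(day, special_holidays):
    """Classify a date following the S52 -> S53 order of checks."""
    # S52: working day? (ASSUMED rule: Monday-Friday and not a special holiday)
    if day.weekday() < 5 and day not in special_holidays:
        return WORKING_DAY
    # S53: special holiday, otherwise general holiday
    if day in special_holidays:
        return SPECIAL_HOLIDAY
    return GENERAL_HOLIDAY


def similar_dates(missing_day, history_days, special_holidays, h):
    """S54: return up to h historical dates sharing the scene of the missing date.

    Ranking by distance in days is an assumption standing in for load similarity.
    """
    scene = classify_scene(missing_day, special_holidays)
    candidates = [
        d
        for d in history_days
        if d != missing_day and classify_scene(d, special_holidays) == scene
    ]
    candidates.sort(key=lambda d: abs((d - missing_day).days))
    return candidates[:h]
```

A production version would likely replace the calendar rule with a classifier trained on daily load curves, as step S51 suggests.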
5. The voltage missing value filling method based on historical data auxiliary scene analysis according to claim 1, wherein: in step S6, measuring the similarity between each Know attribute data of the missing date and the corresponding Know attribute data of each similar date through the dynamic time warping distance comprises the following steps:
S61, setting the time of the missing attribute data as $t_n$, selecting $n$ time points before $t_n$ and $n$ time points after $t_n$, so that the time period $(t_0, t_{2n})$ of the missing attribute data in the missing date contains $t_0, t_1, t_2, \dots, t_{2n}$, a total of $2n + 1$ time points; denoting the M Know attributes retained after screening against the comparison threshold as $A_1, A_2, \dots, A_M$, and the Unknow attribute as $A_0$;
S62, denoting the attribute data of the Know attributes $A_1, A_2, \dots, A_M$ at times $t_0, t_1, t_2, \dots, t_{2n}$ in the h-th similar date as $D_{(1,h)}, D_{(2,h)}, \dots, D_{(M,h)}$ respectively, where $d_{(j,h,g)}$ represents the attribute data of Know attribute $j$ at time $t_g$ in the h-th similar date, $j = 1, 2, \dots, M$, $h = 1, 2, \dots, H$, $g = 0, 1, 2, \dots, 2n$;
S63, measuring, through the dynamic time warping distance, the similarity between the attribute data $D_{(j,h)}$ of Know attribute $A_j$ at times $t_0, t_1, t_2, \dots, t_{2n}$ in the h-th similar date and the attribute data $D_{(j,p)}$ at times $t_0, t_1, t_2, \dots, t_{2n}$ in the missing date, for each pair $(j, h)$, wherein $p$ represents the missing date.
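A minimal sketch of the window extraction of step S61 and the dynamic time warping (DTW) distance of step S63 follows. It uses the textbook DTW recurrence with an absolute-difference local cost and no warping-window constraint; the claim does not specify these details, so they are assumptions, and the function names are hypothetical.

```python
def window(series, center, n):
    """Return the 2n + 1 samples t_0..t_2n centred on index `center` (step S61)."""
    if center - n < 0 or center + n >= len(series):
        raise ValueError("window does not fit inside the series")
    return series[center - n : center + n + 1]


def dtw_distance(a, b):
    """Classic dynamic time warping distance with |a_i - b_j| as the local cost."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]
```

For each Know attribute $j$ and similar date $h$, `dtw_distance(window(missing_day_series, k, n), window(similar_day_series, k, n))` would give one distance, where `k` is the sample index of $t_n$; a smaller distance means the two curves are more alike.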
6. The voltage missing value filling method based on historical data auxiliary scene analysis according to claim 1, wherein: in step S7, the comprehensive similarity of the Unknow attribute on each similar date is calculated by the following formula:
wherein: $C_h$ represents the comprehensive similarity of the Unknow attribute on the h-th similar date.
7. The voltage missing value filling method based on historical data auxiliary scene analysis according to claim 1, wherein: the historical data of a certain attribute at the same time point every day is taken as the longitudinal historical data section of that attribute, and the transverse historical data is obtained by dividing the data at the same time according to attribute; in step S8, after the date with the highest comprehensive similarity of the Unknow attribute is found, the data $T_1$ of the Unknow attribute at time $t_n$ on that date is extracted as the longitudinal filling data; meanwhile, linear curve fitting is applied to the Unknow attribute to obtain the data $T_2$ at time $t_n$ as the transverse filling data; the final filling value of the missing attribute data is:
$$T = \alpha \times T_1 + \beta \times T_2$$
$$\alpha + \beta = 1$$
wherein: $t_n$ is the time at which the missing attribute data occurs, $\alpha$ is the weight of $T_1$, and $\beta$ is the weight of $T_2$.
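The final combination $T = \alpha T_1 + \beta T_2$ with $\alpha + \beta = 1$ can be sketched as follows. The linear fit is implemented here as an ordinary least-squares straight line through the known neighbouring samples of the Unknow attribute, evaluated at $t_n$; that choice, the restriction of $\alpha$ to the range 0 to 1, and the function names are assumptions.

```python
def linear_fit_value(times, values, t_query):
    """Least-squares straight line through (times, values), evaluated at t_query."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) pairs")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be identical")
    slope = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values)) / sxx
    return mean_v + slope * (t_query - mean_t)


def fill_value(t1, t2, alpha):
    """Combine T1 and T2 as T = alpha * T1 + beta * T2, with beta = 1 - alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha is assumed to lie in [0, 1]")
    beta = 1.0 - alpha
    return alpha * t1 + beta * t2


# Illustrative values only: voltage samples around a gap at t = 2.
t2_estimate = linear_fit_value([0, 1, 3, 4], [10.0, 10.5, 11.5, 12.0], 2)
filled = fill_value(10.8, t2_estimate, alpha=0.5)
```

Choosing $\alpha$ closer to 1 trusts the most similar historical date more, while a smaller $\alpha$ leans on the local trend of the missing date itself.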
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010311551.5A CN111507412B (en) | 2020-04-20 | 2020-04-20 | Voltage missing value filling method based on historical data auxiliary scene analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111507412A true CN111507412A (en) | 2020-08-07 |
CN111507412B CN111507412B (en) | 2021-02-19 |
Family
ID=71871170
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010311551.5A Active CN111507412B (en) | 2020-04-20 | 2020-04-20 | Voltage missing value filling method based on historical data auxiliary scene analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111507412B (en) |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9652828B1 (en) * | 2015-12-29 | 2017-05-16 | Motorola Solutions, Inc. | Method and apparatus for imaging a scene |
WO2018025019A1 (en) * | 2016-08-01 | 2018-02-08 | Liverpool John Moores University | Analysing energy/utility usage |
CN107808105A (en) * | 2017-10-18 | 2018-03-16 | 南京邮电大学 | False data detection method based on prediction in a kind of intelligent grid |
WO2019099107A1 (en) * | 2017-11-17 | 2019-05-23 | Google Llc | Real-time anomaly detection and correlation of time-series data |
US20190378022A1 (en) * | 2018-06-11 | 2019-12-12 | Oracle International Corporation | Missing value imputation technique to facilitate prognostic analysis of time-series sensor data |
CN109002937A (en) * | 2018-09-07 | 2018-12-14 | 深圳供电局有限公司 | Power grid load prediction method and device, computer equipment and storage medium |
CN110610280A (en) * | 2018-10-31 | 2019-12-24 | 山东大学 | Short-term prediction method, model, device and system for power load |
CN109816017A (en) * | 2019-01-24 | 2019-05-28 | 电子科技大学 | Power grid missing data complementing method based on fuzzy clustering and Lagrange's interpolation |
CN110276412A (en) * | 2019-06-28 | 2019-09-24 | 中煤科工集团重庆研究院有限公司 | Gas monitoring data disorder filling method |
CN110781449A (en) * | 2019-11-05 | 2020-02-11 | 国网冀北电力有限公司智能配电网中心 | Estimation method for user data loss in distribution area line loss calculation |
Non-Patent Citations (4)
Title |
---|
FADOUA RAFII等: "Collection of historical weather data: issues with missing values", 《SCA "19: PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON SMART CITY APPLICATIONS》 * |
TIE LI等: "Fill Missing Data for Wind Farms Using Long Short-Term Memory Based Recurrent Neural Network", 《2019 IEEE 3RD INTERNATIONAL ELECTRICAL AND ENERGY CONFERENCE》 * |
张峰等: "水资源消耗预测的异常值检测及缺失数据填补方法", 《统计与决策》 * |
赵少东等: "电力系统的计量缺失数据智能修复研究与应用", 《科技创新导报》 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113077357A (en) * | 2021-03-29 | 2021-07-06 | 国网湖南省电力有限公司 | Power time sequence data abnormity detection method and filling method thereof |
CN113077357B (en) * | 2021-03-29 | 2023-11-28 | 国网湖南省电力有限公司 | Power time sequence data anomaly detection method and filling method thereof |
CN113177598B (en) * | 2021-05-06 | 2023-05-02 | 国网福建省电力有限公司 | Error electric quantity supplementing method and terminal |
CN113177598A (en) * | 2021-05-06 | 2021-07-27 | 国网福建省电力有限公司 | Method and terminal for compensating error electric quantity |
CN113568898A (en) * | 2021-07-30 | 2021-10-29 | 浙江华云信息科技有限公司 | Electric power data leakage point completion method, device, equipment and readable storage medium |
CN113761023A (en) * | 2021-08-24 | 2021-12-07 | 国网甘肃省电力公司 | Photovoltaic power generation short-term power prediction method based on improved generalized neural network |
CN114065878A (en) * | 2022-01-17 | 2022-02-18 | 国网山东省电力公司泰安供电公司 | Electric quantity missing value filling method based on multi-parameter Internet of things fusion technology |
CN114611396A (en) * | 2022-03-15 | 2022-06-10 | 国网安徽省电力有限公司蚌埠供电公司 | Line loss analysis method based on big data |
CN116683452A (en) * | 2023-08-03 | 2023-09-01 | 国网山东省电力公司营销服务中心(计量中心) | Method and system for repairing solar heat lost electric quantity |
CN116683452B (en) * | 2023-08-03 | 2023-11-10 | 国网山东省电力公司营销服务中心(计量中心) | Method and system for repairing solar heat lost electric quantity |
CN117390502A (en) * | 2023-12-13 | 2024-01-12 | 国网江苏省电力有限公司苏州供电分公司 | Resiofnn network-based voltage data missing value filling method and system |
CN117932246A (en) * | 2024-03-21 | 2024-04-26 | 广东鹰视能效科技有限公司 | Electric quantity data recalculation method and system |
CN118071176A (en) * | 2024-04-15 | 2024-05-24 | 国网浙江省电力有限公司金华供电公司 | Data processing method and system applicable to platform area source network load storage integrated management platform |
Also Published As
Publication number | Publication date |
---|---|
CN111507412B (en) | 2021-02-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |