CN106778053B - Correlation-based alarm correlation variable detection method and system - Google Patents
- Publication number
- CN106778053B CN106778053B CN201710206963.0A CN201710206963A CN106778053B CN 106778053 B CN106778053 B CN 106778053B CN 201710206963 A CN201710206963 A CN 201710206963A CN 106778053 B CN106778053 B CN 106778053B
- Authority
- CN
- China
- Prior art keywords
- correlation
- variable
- alarm
- data segment
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Z—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
- G16Z99/00—Subject matter not provided for in other main groups of this subclass
Landscapes
- Alarm Systems (AREA)
- Testing And Monitoring For Control Systems (AREA)
Abstract
The invention discloses a correlation-based alarm correlation variable detection method and system. The method establishes a binary time series for the associated variables and combines a time-series segmentation method with a correlation-coefficient trend method, so that abnormal data segments are obtained from historical data quickly, automatically and accurately and abnormal data can be detected. This provides favorable conditions for the dynamic alarm threshold design of a multivariable alarm system, reduces interference alarms, improves the efficiency with which field operators handle alarms, and ensures production safety.
Description
Technical Field
The invention belongs to the field of signal processing, and particularly relates to a correlation-based alarm correlation variable detection method and system.
Background
The alarm system plays a vital role in ensuring the safe production and efficient operation of coal-fired generator sets. Because related variables influence one another in actual industrial processes, the traditional single-variable alarm threshold design method generates a large number of interference alarms (missed alarms and false alarms) and causes excessive alarms, which distracts field operators and makes it harder to handle abnormal production conditions correctly.
When equipment or a system becomes abnormal during the process, the correlation between variables often changes markedly compared with normal operating conditions. It is therefore necessary to provide a correlation-based alarm correlation variable detection method that automatically screens data segments under normal and abnormal conditions.
Disclosure of Invention
In order to solve the above problems, a first object of the present invention is to provide a method for detecting alarm related variables based on correlation. The method can quickly and automatically acquire the abnormal data section from the historical data, thereby carrying out abnormal data detection, providing favorable conditions for realizing the dynamic alarm threshold design of the multivariable alarm system, reducing interference alarm, improving the alarm processing efficiency of field operators and ensuring the production safety.
The invention relates to a correlation-based alarm correlation variable detection method, which is completed in a server or a processor and specifically comprises the following steps:
step 1: extracting alarm variables with preset time length and data of a plurality of related variables related to the alarm variables from historical detection data, and selecting one group of alarm variables and related variables as detection objects;
step 2: judging the dynamic delay relation between the selected alarm variable and the related variable, and further establishing a binary time sequence T and standardizing the binary time sequence T to be T';
step 3: segmenting the binary time series T' under the constraint of a minimum time interval caused by noise;
step 4: obtaining a correlation coefficient and a correlation trend of each segment;
step 5: obtaining the abnormal data segment and the related information thereof according to the comparison between the correlation trend and the actual trend.
Further, in step 2, if a dynamic delay relationship exists between the alarm variable and the related variable, the alarm variable or the related variable of the preset time length is shifted in time according to the delay between them; if no dynamic delay relationship exists between the alarm variable and the related variable, no shift is required.
According to the method, the dynamic delay relation between the alarm variable and the related variable is judged firstly, so that an accurate binary time sequence T is established, an abnormal data segment can be accurately obtained, the alarm processing efficiency of field operators is improved, and the production safety is guaranteed.
Further, before the step 3, the method further comprises: calculating the minimum time interval caused by noise, wherein the specific process comprises the following steps:
(3.1.1) acquiring each lower inflection point of the alarm variable of the current preset time length, and solving the distance between adjacent lower inflection points to further form an array d;
(3.1.2) sorting the array d and removing repeated elements to obtain an array d0; calculating the distance dm between the point with the maximum slope change of the array d0 and the nearest lower inflection point, wherein dm is the minimum time interval in the alarm variable;
(3.1.3) repeating steps (3.1.1) and (3.1.2) on the related variable to obtain a minimum time interval dh for the related variable;
(3.1.4) the larger of dm and dh is taken as the minimum time interval of the binary time series T'.
The minimum time interval reduces the influence of noise on the segmentation result. When actual industrial process data are processed, noise can make the time interval between key points too short, so no key-point search is performed within the neighborhood defined by the minimum time interval.
Further, the process of segmenting the binary time sequence T' in the step 3 includes:
taking the binary time sequence T' as a data segment to be divided;
judging the classification attribute of the data segment to which the data segment to be divided belongs according to the correlation coefficient among the data in the data segment to be divided;
and according to the preset correlation coefficient range, the data segment classification attribute comprises a weak correlation data segment, a middle correlation data segment and a strong correlation data segment.
Dividing the weak correlation data segment and the middle correlation data segment avoids missed segmentation, while the treatment of the strong correlation data segment avoids overfitting.
Further, the process of segmenting the binary time sequence T' in the step 3 further includes:
regarding the data segment to be divided, using a linear interpolation method to take the projection of the data points in the binary time sequence T' on the head and tail data point connecting line of the segment to which the data points belong as fitting points;
finding the farthest point by using the orthogonal distance as the key turning point of the next segmentation, then determining whether the data segment to be segmented has a strongly correlated data segment, and updating the key turning point;
and repeating the steps until the data segment to be divided does not exist any more.
Under the constraint of the minimum time interval, the invention performs time-series segmentation only on three kinds of data segment: weakly correlated data segments; overly long data segments whose correlation is not significant; and overly long data segments whose correlation is significant and whose sub-segments remain strongly correlated after division. It obtains the key turning points of the data segments to be divided and builds a piecewise linear representation of the original time series, thereby avoiding both overfitting and missed segmentation.
Further, in the step 4, a specific process of obtaining a correlation coefficient and a correlation trend of each segment includes:
(4.1): dividing the time sequence according to the finally obtained key turning point, and calculating the correlation coefficient of each segment in the time sequence by using a correlation coefficient formula;
(4.2): carrying out a one-sided hypothesis test on the correlation of the variables, setting a significance level, confirming the correlation between the variables according to the one-sided hypothesis test result and the significance level, and determining the trend of the correlation coefficient.
The method calculates the correlation coefficient of each segment in the time series by using a correlation coefficient formula, performs a one-sided hypothesis test on the correlation of the variables, sets the significance level, confirms the correlation between the variables according to the test result and the significance level, and determines the trend of the correlation coefficient. This provides accurate data for obtaining the abnormal data segment and its related information, and further improves alarm efficiency.
It is a second object of the present invention to provide a correlation-based alarm correlation variable detection system.
The invention relates to a correlation-based alarm correlation variable detection system, which comprises:
the data extraction module is used for extracting alarm variables of a preset time length and data of a plurality of related variables related to the alarm variables from historical detection data, and selecting one group of alarm variables and related variables as detection objects;
the time sequence establishing module is used for judging the dynamic delay relation between the selected alarm variable and the related variable, and further establishing a binary time sequence T and standardizing the binary time sequence T into T';
a time series segmentation module for segmenting the binary time series T' under the constraint of a minimum time interval caused by noise;
the correlation coefficient calculating module is used for calculating the correlation coefficient and the correlation trend of each segment;
and the abnormal data acquisition module is used for acquiring the abnormal data segment and the related information thereof according to the comparison between the correlation trend and the actual trend.
In the time sequence establishing module, if a dynamic delay relationship exists between the alarm variable and the related variable, the alarm variable or the related variable of the preset time length is shifted in time according to the delay between them; if no dynamic delay relationship exists between the alarm variable and the related variable, no shift is required.
According to the method, the dynamic delay relation between the alarm variable and the related variable is judged firstly, so that an accurate binary time sequence T is established, an abnormal data segment can be accurately obtained, the alarm processing efficiency of field operators is improved, and the production safety is guaranteed.
Further, the system further comprises: the minimum time interval calculation module is used for acquiring each lower inflection point of the alarm variable of the current preset time length, and solving the distance between adjacent lower inflection points to further form an array d;
sorting the array d and removing repeated elements to obtain an array d0; calculating the distance dm between the point with the maximum slope change of the array d0 and the nearest adjacent lower inflection point, wherein dm is the minimum time interval in the alarm variable;
acquiring lower inflection points of the related variable and the distance between adjacent lower inflection points, and further acquiring the minimum time interval dh of the related variable;
taking the larger value of dm and dh as the minimum time interval of the binary time sequence;
the minimum time interval reduces the influence of noise on the segmentation result. When actual industrial process data are processed, noise can make the time interval between key points too short, so no key-point search is performed within the neighborhood defined by the minimum time interval.
Further, the time series segmentation module includes: the data segment to be divided acquisition module is used for taking the binary time sequence T' as a data segment to be divided;
judging the classification attribute of the data segment to which the data segment to be divided belongs according to the correlation coefficient among the data in the data segment to be divided;
and according to the preset correlation coefficient range, the data segment classification attribute comprises a weak correlation data segment, a middle correlation data segment and a strong correlation data segment.
Further, the time-series segmentation module further includes:
the fitting point solving module is used for taking the projection of the data points in the standardized binary time sequence T' on the head and tail data point connecting line of the segment to which the data points belong as the fitting points by utilizing a linear interpolation method aiming at the data segment to be divided;
and the key turning point calculation updating module is used for finding the farthest point by using the orthogonal distance as the key turning point of the next segmentation, then determining whether a strongly correlated data segment exists among the data segments to be divided, and updating the key turning point until no data segment to be divided remains.
Under the constraint of the minimum time interval, the invention performs time-series segmentation only on three kinds of data segment: weakly correlated data segments; overly long data segments whose correlation is not significant; and overly long data segments whose correlation is significant and whose sub-segments remain strongly correlated after division. It obtains the key turning points of the data segments to be divided and builds a piecewise linear representation of the original time series, thereby avoiding both overfitting and missed segmentation.
Further, the correlation coefficient calculating module includes:
the segment correlation coefficient calculation module is used for dividing the time sequence according to the finally obtained key turning point and calculating the correlation coefficient of each segment in the time sequence by using a correlation coefficient formula;
and the correlation coefficient trend determining module is used for carrying out a one-sided hypothesis test on the correlation of the variables, setting the significance level, confirming the correlation between the variables according to the one-sided hypothesis test result and the significance level, and determining the correlation coefficient trend.
The system calculates the correlation coefficient of each segment in the time series by using a correlation coefficient formula, performs a one-sided hypothesis test on the correlation of the variables, sets the significance level, confirms the correlation between the variables according to the test result and the significance level, and determines the trend of the correlation coefficient. This provides accurate data for obtaining the abnormal data segment and its related information, and further improves alarm efficiency.
Compared with the prior art, the invention has the beneficial effects that:
the invention selects the correlation among the industrial variables as the characteristic for judging whether the working point state is abnormal, establishes the multivariate time sequence for the associated variables, combines the time sequence segmentation method and the correlation coefficient trend method, reduces the overfitting phenomenon and the missing segmentation phenomenon to the maximum extent, and quickly and automatically acquires the abnormal data segment from the historical data accurately, thereby carrying out abnormal data detection to provide favorable conditions for realizing the dynamic alarm threshold design of the multivariate alarm system, reducing the interference alarm, improving the alarm processing efficiency of field operators and ensuring the production safety.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a flow chart of a correlation-based alarm correlation variable detection method of the present invention;
FIG. 2(a) is a graph of a variable time series and a segmentation result of the inlet smoke temperature of the air preheater of the present invention;
FIG. 2(b) is a graph of a time series of variables and a segmentation result of the outlet flue gas temperature of the air preheater of the present invention;
FIG. 3 illustrates correlation coefficients and confidence intervals for each segment in an embodiment of the present invention;
FIG. 4 is a graph illustrating the correlation trend of variables in each segment according to an embodiment of the present invention;
FIG. 5(a) is a first set of multivariate time series scattergrams and a fitted straight line according to the invention;
FIG. 5(b) is a second set of multivariate time series scattergrams and fitted straight lines of the invention;
FIG. 5(c) is a third set of multivariate time series scattergrams and fitted straight lines of the invention;
FIG. 5(d) is a fourth set of multivariate time series scattergrams and fitted straight lines according to the invention;
FIG. 6 is a schematic diagram of a system for detecting alarm correlation variables based on correlation according to the present invention;
FIG. 7 is a schematic structural diagram of a correlation coefficient calculating module according to the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
FIG. 1 is a flow chart of a correlation-based alarm correlation variable detection method of the present invention.
The method for detecting alarm associated variables based on correlation shown in fig. 1 is implemented in a server or a processor, and specifically includes:
Step 1: extracting, from the historical detection data, an alarm variable of a preset time length and data of a plurality of related variables associated with the alarm variable, and selecting one group of alarm variable and related variable as the detection objects.
Specifically, raw data of a plurality of related variables with the time length N before the current working point are extracted, and one group of alarm variable and related variable is selected for detection.
Taking a power plant as a specific application scenario of the method:
the inlet flue gas temperature of an air preheater in the power plant is selected as the alarm variable, and the outlet air temperature of the air preheater is the related variable. For a shutdown accident in the power plant, data before the shutdown with a sampling period of 1 second and a sample size of N = 3600 are selected from the historical data, and one group of air preheater inlet flue gas temperature and air preheater outlet air temperature is selected for abnormal data detection.
Step 2: judging the dynamic delay relation between the alarm variable and the related variable, and further establishing a binary time sequence T and standardizing the binary time sequence T to be T'.
Specifically, it is judged whether a dynamic delay relationship exists between the variables. If so, a section of initial data of the two variables is used to obtain the delay time h, where h is a positive number, and the existing alarm variable or related variable of length N is then shifted in time by h. If no dynamic delay relationship exists between the variables, no shift is needed.
A binary time sequence T is then established from the two groups of variables, whether shifted or not, and is normalized into T'.
In the specific implementation, the delay time is 0 because no dynamic delay relationship exists between the inlet flue gas temperature of the air preheater and the outlet air temperature of the air preheater. A binary time sequence is then constructed from the two groups of data and standardized. The standardized binary time sequence is denoted $T' = [T_1(t), T_2(t)]$, where $t = 1, \ldots, N$ and N is a positive integer; $T_1$ represents the inlet flue gas temperature of the air preheater and $T_2$ represents the outlet air temperature of the air preheater.
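As a minimal sketch of step 2, the snippet below shifts one variable by the delay and standardizes both. It assumes that the related variable lags the alarm variable by `h` samples and that "standardizing" means a z-score per variable; the source states neither the shift direction nor the normalization formula, and the function name `align_and_standardize` is hypothetical.

```python
from statistics import mean, pstdev


def align_and_standardize(alarm, related, h=0):
    """Build the standardized binary series T' as a list of (T1, T2) pairs.

    Assumption: related[t + h] corresponds to alarm[t], so the first h
    samples of `related` and the last h samples of `alarm` are dropped.
    """
    if h < 0 or len(alarm) != len(related):
        raise ValueError("h must be >= 0 and both series must have equal length")
    if h > 0:
        alarm, related = alarm[:-h], related[h:]

    def zscore(xs):
        # Assumption: z-score standardization (zero mean, unit variance).
        mu, sd = mean(xs), pstdev(xs)
        if sd == 0:
            raise ValueError("a constant series cannot be standardized")
        return [(x - mu) / sd for x in xs]

    return list(zip(zscore(alarm), zscore(related)))
```

With `h=0`, as in the air preheater example, the two series are only standardized and paired sample by sample.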
And step 3: the normalized binary time series T is segmented under the constraint of a minimum time interval caused by noise.
Wherein, before step 3, the method further comprises: calculating the minimum time interval caused by noise, wherein the specific process comprises the following steps:
(3.1.1) acquiring each lower inflection point of the alarm variable of the current preset time length, and obtaining the distance between adjacent lower inflection points to form an array $d = \{d_1, d_2, d_3, \ldots, d_x\}$;
(3.1.2) sorting the array d and removing the repeated elements to obtain an array $d0 = \{d_1, d_2, d_3, \ldots, d_{x0}\}$, where $x0 < x$ and x0 and x are positive integers; finding the point m with the maximum slope change of the array d0 and the distance dm between the corresponding adjacent lower inflection points, wherein dm is the minimum time interval in the alarm variable;
(3.1.3) repeating steps (3.1.1) and (3.1.2) on the related variable to obtain a minimum time interval dh for the related variable;
(3.1.4) taking the larger of dm and dh as the minimum time interval delta of the binary time series T', where 0 < delta < N and N is a positive integer.
In step (3.1.2), the method of finding the point m where the slope of the array d0 changes most may employ: a line graph with d0 as the abscissa and x0 as the ordinate is plotted, and the point m where the change in slope is the greatest is found in the line graph.
It should be noted that the above method for obtaining the point m at which the slope of the array d0 changes most is only one embodiment, and other existing methods may be used to implement the method.
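Steps (3.1.1) to (3.1.4) can be sketched as follows. This is only one possible reading: a "lower inflection point" is taken to be a local minimum, the "slope change" of d0 is measured by the absolute second difference of the sorted array, and dm is taken to be the value of d0 at that point. All function names are hypothetical.

```python
def lower_inflection_points(x):
    """Indices of local minima (assumed meaning of 'lower inflection point')."""
    return [k for k in range(1, len(x) - 1) if x[k] < x[k - 1] and x[k] <= x[k + 1]]


def variable_min_interval(x):
    """Steps (3.1.1)-(3.1.2) for a single variable."""
    idx = lower_inflection_points(x)
    d = [b - a for a, b in zip(idx, idx[1:])]   # distances between adjacent minima
    d0 = sorted(set(d))                         # sorted, duplicates removed
    if not d0:
        raise ValueError("need at least two lower inflection points")
    if len(d0) < 3:
        return d0[0]                            # fallback assumption for short arrays
    # Point m where the slope of d0 changes most (absolute second difference).
    m = max(range(1, len(d0) - 1),
            key=lambda k: abs(d0[k + 1] - 2 * d0[k] + d0[k - 1]))
    return d0[m]


def series_min_interval(alarm, related):
    """Step (3.1.4): the larger of dm and dh."""
    return max(variable_min_interval(alarm), variable_min_interval(related))
```

On noisy plant data a smoothing step before the local-minimum search may be needed; the source does not discuss this.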
In this example, the minimum time interval delta caused by noise is calculated to be 40 s. This interval is used in the key-point search, and after the correlation coefficients are calculated and the data segments to be divided are screened, 5 key points, that is, 4 segments, are finally obtained. The time-series segmentation is shown in FIG. 2(a) and FIG. 2(b).
The minimum time interval reduces the influence of noise on the segmentation result. When actual industrial process data are processed, noise can make the time interval between key points too short, so no key-point search is performed within the neighborhood defined by the minimum time interval.
Specifically, the process of segmenting the binary time sequence T' in step 3 includes:
taking the binary time sequence T' as a data segment to be divided;
judging the classification attribute of the data segment to which the data segment to be divided belongs according to the correlation coefficient among the data in the data segment to be divided;
and according to the preset correlation coefficient range, the data segment classification attribute comprises a weak correlation data segment, a middle correlation data segment and a strong correlation data segment.
For example, under the current division, the correlation coefficient of each segment is calculated to identify the data segments to be divided. The data segments to be divided are of three types, namely weak correlation, middle correlation and strong correlation data segments:
first, a weakly correlated data segment whose correlation coefficient satisfies $0.3 \le \rho_s \le 0.5$;
second, a data segment satisfying $z_s > N/3$ and either $0.5 < \rho_s < 0.9$ or $0.1 < \rho_s < 0.3$;
third, a data segment satisfying $z_s > N/3$ and either $\rho_s \ge 0.9$ or $\rho_s \le 0.1$, whose sub-segments are still strongly correlated after division.
Here $\rho_s$ is the correlation coefficient between the variables in the s-th segment of the normalized binary time sequence T'; $z_s$ is the number of samples in the s-th segment; N is the preset time length of the binary time sequence T'; N and s are positive integers.
It should be noted that, the present invention may also set other preset correlation coefficient ranges, and divide the data segment into three data segments, namely, a weak correlation data segment, a middle correlation data segment, and a strong correlation data segment.
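The three types can be expressed as a small classifier. This sketch makes several assumptions that the text does not confirm: the thresholds are applied to the absolute value of the correlation coefficient, the second type carries the same length condition $z_s > N/3$ as the third, and the extra requirement on the third type (sub-segments still strongly correlated after division) is checked elsewhere. The function name `classify_segment` is hypothetical.

```python
def classify_segment(rho, z_s, n_total):
    """Return 'weak', 'middle', 'strong', or None if the segment is not a
    candidate for further division under the preset ranges."""
    r = abs(rho)                      # assumption: thresholds apply to |rho|
    too_long = z_s > n_total / 3      # assumed length condition
    if 0.3 <= r <= 0.5:
        return "weak"
    if too_long and (0.5 < r < 0.9 or 0.1 < r < 0.3):
        return "middle"
    if too_long and (r >= 0.9 or r <= 0.1):
        # Only a candidate: the sub-segment check after division is not done here.
        return "strong"
    return None
```

For example, with N = 3600 a segment of 2000 samples and a coefficient of 0.7 would be labeled "middle", while a segment of 100 samples with the same coefficient would be left undivided.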
The correlation coefficient used is the Spearman sample correlation coefficient between the variables $X_i$ and $X_j$ in the s-th segment of the time series T'.
Dividing the first and second types of data segment avoids missed segmentation, and the condition on dividing the third type of data segment avoids overfitting.
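The formula for the Spearman sample correlation coefficient is not reproduced in this text. A common definition, used here as an assumption, is the Pearson correlation of the ranks of the two variables, with tied values receiving their average rank. The sketch below computes it per segment using only the standard library; the function names are hypothetical.

```python
import math
from statistics import mean


def average_ranks(values):
    """1-based ranks, with ties sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(vx * vy)
```

Calling `spearman` on the two columns of one segment of T' gives the $\rho_s$ that the classification ranges are compared against.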
Further, the process of segmenting the normalized binary time sequence T' in step 3 further includes:
regarding the data segment to be divided, using a linear interpolation method to take the projection of the data points in the standardized binary time sequence T' on the head and tail data point connecting line of the segment to which the data points belong as a fitting point;
finding the farthest point by using the orthogonal distance as the key turning point of the next segmentation, then determining whether a data segment to be divided of the third type described above exists, and updating the key turning point;
and repeating the steps until the data segment to be divided does not exist any more.
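The parametric equation of the straight line AB in space can be expressed as:
$$\frac{X_1 - X_{1A}}{X_{1B} - X_{1A}} = \frac{X_2 - X_{2A}}{X_{2B} - X_{2A}} = \frac{t - t_A}{t_B - t_A} = \beta$$
The coordinates of any point $P_0$ on the straight line AB can then be expressed as:
$$[(X_{iB} - X_{iA})\beta + X_{iA},\ (t_B - t_A)\beta + t_A]$$
where X denotes a variable and t the time variable; $i = 1, 2$ denotes the alarm variable and the related variable, respectively; A and B are the two end points of the straight line AB; and $\beta$ is the line parameter.
The distance D from a data point P, with coordinates $(X_{1P}, X_{2P}, t_P)$, to the point $P_0$ on the line AB is
$$D = \sqrt{\sum_{i=1}^{2}\left[(X_{iB} - X_{iA})\beta + X_{iA} - X_{iP}\right]^2 + \left[(t_B - t_A)\beta + t_A - t_P\right]^2}$$
D takes its minimum value when $\mathrm{d}D/\mathrm{d}\beta = 0$, and the corresponding parameter is
$$\beta = \frac{\sum_{i=1}^{2}(X_{iP} - X_{iA})(X_{iB} - X_{iA}) + (t_P - t_A)(t_B - t_A)}{\sum_{i=1}^{2}(X_{iB} - X_{iA})^2 + (t_B - t_A)^2}$$
This minimum distance is the orthogonal distance from the data point P to the line AB joining the head and tail data points of the segment to which it belongs. The data point with the maximum value of D in each segment is the key turning point of that segment.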
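The orthogonal-distance search can be sketched as below. Each data point is treated as a triple (X1, X2, t), the foot of the perpendicular on AB is found by projecting onto the direction of AB, and the farthest interior point is returned. How the minimum time interval enters the search is an assumption here: candidates closer than `delta` samples to either end of the segment are skipped. The function names are hypothetical.

```python
import math


def orthogonal_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        return math.dist(p, a)                 # degenerate line: a and b coincide
    beta = sum(u * v for u, v in zip(ap, ab)) / denom
    foot = [ai + beta * c for ai, c in zip(a, ab)]
    return math.dist(p, foot)


def key_turning_point(points, delta=1):
    """Index of the interior point farthest from the head-tail line, or None
    if no candidate respects the assumed minimum-interval constraint."""
    a, b = points[0], points[-1]
    margin = max(1, delta)
    candidates = range(margin, len(points) - margin)
    if len(candidates) == 0:
        return None
    return max(candidates, key=lambda k: orthogonal_distance(points[k], a, b))
```

Applying `key_turning_point` recursively to the sub-segments on either side of each returned index yields the full set of key turning points.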
Taking the four groups of multivariate time series shown in FIG. 5(a) to FIG. 5(d) as examples, the invention uses a linear interpolation method to take the projections of the data points in the normalized binary time sequence T' on the line connecting the head and tail data points of the segment to which they belong as fitting points, and obtains the respective fitted straight lines shown in FIG. 5(a) to FIG. 5(d).
Under the constraint of the minimum time interval, the invention performs time-series segmentation only on three kinds of data segment: weakly correlated data segments; overly long data segments whose correlation is not significant; and overly long data segments whose correlation is significant and whose sub-segments remain strongly correlated after division. It obtains the key turning points of the data segments to be divided and builds a piecewise linear representation of the original time series, thereby avoiding both overfitting and missed segmentation.
Step 4: obtaining the correlation coefficient and the correlation trend of each segment.
Specifically, the specific process of finding the correlation coefficient and the correlation trend of each segment includes:
(4.1): dividing the time sequence according to the finally obtained key turning point, and calculating the correlation coefficient of each segment in the time sequence by using a correlation coefficient formula;
The time series of the variables is divided according to the finally obtained set of key turning points $P = \{P_1, P_2, \ldots, P_K\}$, and the piecewise linear representation of the multivariate time series T' is:
$$T_{PLR} = \langle f_1[(X_i(p_1), p_1), (X_i(p_2), p_2)], \ldots, f_{K-1}[(X_i(p_{K-1}), p_{K-1}), (X_i(p_K), p_K)] \rangle$$
where $f_j[(X_i(p_j), p_j), (X_i(p_{j+1}), p_{j+1})]$ denotes the linear fitting function within the segment $[p_j, p_{j+1}]$.
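As a rough illustration of the piecewise linear representation, the sketch below joins consecutive key turning points of one variable with straight lines. It assumes evenly sampled data indexed 0..n-1 and a strictly increasing list of turning-point indices that includes both end points; it interpolates along the time axis, which is a simplification of the projection-based fitting described earlier. The function name is hypothetical.

```python
def piecewise_linear(values, turning_points):
    """Approximate `values` by straight lines joining consecutive turning points.

    values:         samples of one variable X_i(t), t = 0..n-1
    turning_points: strictly increasing indices p_1 < ... < p_K (ends included)
    """
    fitted = list(values)
    for left, right in zip(turning_points, turning_points[1:]):
        span = right - left
        for t in range(left, right + 1):
            w = (t - left) / span  # 0 at the left end, 1 at the right end
            fitted[t] = (1 - w) * values[left] + w * values[right]
    return fitted
```

With K turning points the loop visits K-1 segments, one linear fitting function per segment.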
(4.2): Perform a one-sided hypothesis test on the correlation between the variables, set a significance level, confirm the correlation between the variables according to the one-sided hypothesis test result and the significance level, and determine the trend of the correlation coefficient.
In step (4.2), the one-sided hypothesis tests are:
$H_0: \rho_s[X_i, X_j] = 0$ vs $H_1: \rho_s[X_i, X_j] > 0$;
$H_0: \rho_s[X_i, X_j] = 0$ vs $H_2: \rho_s[X_i, X_j] < 0$.
When the number of samples n participating in the hypothesis test satisfies n > 10, the random variable $U_s$ is defined as:
$$U_s = \rho_s[X_i, X_j] \sqrt{\frac{z_s - 2}{1 - \rho_s^2[X_i, X_j]}}$$
Given a significance level α, if $U_s > t_\alpha(z_s - 2)$, H0 is rejected in favor of H1; if $U_s < -t_\alpha(z_s - 2)$, H0 is rejected in favor of H2, where $t_\alpha(z_s - 2)$ denotes the quantile for the statistic $U_s$. In these cases the correlation between $X_i$ and $X_j$ in the s-th segment is considered significant, and the sign direction $\mathrm{sign}_s(X_i, X_j)$ takes the value 1 or -1 respectively. If $|U_s| < t_\alpha(z_s - 2)$, H0 cannot be rejected against either H1 or H2; there is then no significant correlation between the variables, and $\mathrm{sign}_s(X_i, X_j)$ takes the value 0.
When the number of samples n < 10, the critical value of the Spearman rank correlation coefficient for small-sample hypothesis testing is looked up, and the critical value corresponding to the given $z_s$ and α is denoted $\rho_\alpha(z_s)$. If $|\rho_s[X_i, X_j]| > \rho_\alpha(z_s)$, H0 is rejected and $\mathrm{sign}_s(X_i, X_j)$ takes the value 1 or -1 respectively; otherwise H0 cannot be rejected and $\mathrm{sign}_s(X_i, X_j)$ takes the value 0.
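The test procedure can be sketched in plain Python as below. Several points are assumptions rather than statements from the text: the large-sample statistic uses the usual t-approximation U = ρ·sqrt((z − 2)/(1 − ρ²)); the boundary case of exactly 10 samples, which the text does not cover, is treated as a small sample; and the critical value (a t quantile for large samples, a tabulated Spearman critical value for small ones) is supplied by the caller. All function names are hypothetical.

```python
import math

def average_ranks(xs):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(xs)), key=lambda k: xs[k])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks.
    Raises ZeroDivisionError if either input is constant."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

def correlation_sign(rho, z_s, critical):
    """Return 1, -1 or 0 for the sign direction of one segment.

    critical: t_alpha(z_s - 2) when z_s > 10, otherwise the tabulated
              Spearman critical value rho_alpha(z_s) (caller-supplied).
    """
    if z_s > 10:
        if abs(rho) >= 1.0:
            stat = math.copysign(math.inf, rho)  # perfect monotonic relation
        else:
            stat = rho * math.sqrt((z_s - 2) / (1 - rho * rho))
    else:
        stat = rho  # small sample: compare rho itself with the table value
    if stat > critical:
        return 1
    if stat < -critical:
        return -1
    return 0
```

For example, with 12 samples and α = 0.05 the one-sided t quantile for 10 degrees of freedom is about 1.812, so `correlation_sign(spearman(x, y), 12, 1.812)` would give the sign direction of that segment.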
The correlation coefficient of each segment is calculated according to the obtained number of segments, and the confidence interval of the correlation coefficient is plotted, as shown in Fig. 3, where L denotes the number of segments and is a positive integer.
Given α = 0.05, a correlation test is performed on each segment to determine the correlation trend between the variables, as shown in Fig. 4.
The method calculates the correlation coefficient of each segment in the time series using the correlation coefficient formula, performs a one-sided hypothesis test on the correlation between the variables at a set significance level, confirms the correlation between the variables from the test result and the significance level, and determines the trend of the correlation coefficient. This provides accurate data for obtaining the abnormal data segments and their related information, and further improves alarm efficiency.
Step 5: Obtain the abnormal data segments and their related information by comparing the correlation trend with the actual trend.
Step 5 further comprises: acquiring the time series segmentation graph, scatter plot, fitted straight-line graph and correlation coefficient confidence interval of the alarm variable and the related variable, and, if a dynamic delay relationship exists, also acquiring the original time series graph before translation.
According to the correlation trend analysis, the air preheater inlet flue gas temperature and the air preheater outlet air temperature are positively correlated under normal conditions, but are uncorrelated in the time period of t 2142-. Analysis shows that in this data segment, a fault of the induced draft fan reduced the air speed and the air volume at the air preheater inlet, so that the air preheater outlet air temperature rose slightly while the inlet flue gas temperature remained unchanged; after the induced draft fan was adjusted, the relationship between the two variables returned to normal.
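The comparison in step 5 amounts to flagging every segment whose tested sign direction differs from the sign expected under normal operation. A minimal sketch, assuming each segment is stored as a hypothetical `(start, end, sign)` tuple and using made-up indices purely for illustration:

```python
def abnormal_segments(segments, expected_sign):
    """Return the segments whose sign direction differs from the expected one.

    segments:      list of (start_index, end_index, sign), sign in {1, 0, -1}
    expected_sign: sign direction under normal operation (1 for a pair of
                   variables that is normally positively correlated)
    """
    return [seg for seg in segments if seg[2] != expected_sign]

# Illustrative values only, not taken from the patent's data set.
example = [(0, 500, 1), (500, 620, 0), (620, 1000, 1)]
flagged = abnormal_segments(example, expected_sign=1)
```

Here `flagged` contains only the middle segment, in which the normally positive correlation is not significant.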
The invention uses the correlation among industrial variables as the feature for judging whether the operating-point state is abnormal. It establishes a binary time series for the associated variables and combines a time series segmentation method with a correlation coefficient trend method, which reduces overfitting and missed segmentation as far as possible and allows abnormal data segments to be obtained from historical data quickly, automatically and accurately. Abnormal data detection of this kind provides favorable conditions for designing dynamic alarm thresholds in a multivariable alarm system, reduces nuisance alarms, improves the alarm-handling efficiency of field operators, and helps ensure production safety.
FIG. 6 is a schematic diagram of the structure of the alarm correlation variable detection system based on correlation according to the present invention.
As shown in Fig. 6, the correlation-based alarm correlation variable detection system of the present invention includes:
(1) A data extraction module, used for extracting, from historical detection data, data of an alarm variable of a preset time length and of a plurality of related variables associated with the alarm variable, and selecting one group of alarm variable and related variable as the detection object.
(2) A time series establishing module, used for judging the dynamic delay relationship between the selected alarm variable and the related variable, and then establishing a binary time series T and standardizing it into T'.
Further, in the time series establishing module, if a dynamic delay relationship exists between the alarm variable and the related variable, a section of initial data of the two variables is acquired to calculate the delay time h, and the alarm variable or the related variable of the preset time length is then translated by h; if no dynamic delay relationship exists between the alarm variable and the related variable, no translation is required.
By first judging the dynamic delay relationship between the alarm variable and the related variable, an accurate binary time series T is established, so that abnormal data segments can be obtained accurately, the alarm-handling efficiency of field operators is improved, and production safety is ensured.
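The text does not state how the delay time h is computed. One common choice, used here purely as an assumption, is to pick the lag that maximizes the absolute Pearson correlation between the two variables over an initial stretch of data. The sketch also assumes that the related variable lags the alarm variable (non-negative lags only); all names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def estimate_delay(alarm, related, max_lag):
    """Lag h in 0..max_lag at which related[t + h] best matches alarm[t]."""
    best_h, best_r = 0, -1.0
    for h in range(max_lag + 1):
        a = alarm[:len(alarm) - h]
        b = related[h:]
        r = abs(pearson(a, b))
        if r > best_r:
            best_h, best_r = h, r
    return best_h

def align(alarm, related, h):
    """Translate the related variable back by h samples and trim both
    series to their common length."""
    if h == 0:
        return list(alarm), list(related)
    return list(alarm[:-h]), list(related[h:])
```

After alignment the two trimmed series can be paired sample by sample to form the binary time series T before standardization.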
(3) A time series segmentation module for segmenting the binary time series T' under the constraint of a minimum time interval caused by noise.
Further, the time series segmentation module includes a to-be-divided data segment acquisition module, used for: taking the binary time series T' as the data segment to be divided;
judging the classification attribute of the data segment to be divided according to the correlation coefficient among the data in that data segment;
where, according to the preset correlation coefficient ranges, the data segment classification attributes include weakly correlated data segments, moderately correlated data segments and strongly correlated data segments.
Specifically, the correlation coefficient ranges are preset. Taking the following ranges as an example, the data segments to be divided are of three types:
first, weakly correlated data segments whose correlation coefficient satisfies $0.3 \le \rho_s \le 0.5$;
second, overlong data segments satisfying $0.5 < \rho_s < 0.9$ or $0.1 < \rho_s < 0.3$;
third, data segments satisfying $z_s > n/3$ and $\rho_s \ge 0.9$ or $\rho_s \le 0.1$ whose sub-segments are still strongly correlated after division;
where $\rho_s$ is the correlation coefficient between the variables in the s-th segment of the normalized binary time series T'; $z_s$ is the number of samples in the s-th segment; n is the preset time length of the binary time series T'; n and s are positive integers.
Dividing the first and second types of data segment avoids missed segmentation, and dividing the third type of data segment avoids overfitting.
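The three example categories can be written as a small classification function. The thresholds are copied from the example ranges above; using z_s > n/3 as the "overlong" criterion for the second type as well as the third is an assumption, and the `children_strong` flag stands in for the check that the sub-segments remain strongly correlated after a trial division. The function name is hypothetical.

```python
def segment_type(rho, z_s, n, children_strong=False):
    """Return 1, 2 or 3 for the three kinds of segment to be divided,
    or 0 if the segment does not need to be divided.

    rho:             correlation coefficient of the segment
    z_s:             number of samples in the segment
    n:               preset length of the whole binary time series
    children_strong: True if the sub-segments are still strongly
                     correlated after a trial division
    """
    overlong = z_s > n / 3  # assumed to apply to both the 2nd and 3rd types
    if 0.3 <= rho <= 0.5:
        return 1
    if overlong and (0.5 < rho < 0.9 or 0.1 < rho < 0.3):
        return 2
    if overlong and (rho >= 0.9 or rho <= 0.1) and children_strong:
        return 3
    return 0
```

A segmentation loop would keep dividing a segment only while this function returns a non-zero type.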
Further, the time series segmentation module further includes:
a fitting point solving module, used for taking, for each data segment to be divided and by means of linear interpolation, the projections of the data points in the standardized binary time series T' onto the line connecting the first and last data points of the segment to which they belong as the fitting points;
and a key turning point calculation and updating module, used for finding the farthest point by orthogonal distance as the key turning point of the next segmentation, then determining whether a data segment of the third type exists among the data segments to be divided, and updating the key turning points until no data segment to be divided remains.
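Taken together, the two modules describe a top-down splitting loop. The sketch below is one possible arrangement, not the patented procedure: `needs_division` is a hypothetical caller-supplied predicate (in the patent this role is played by the correlation-based classification), and `min_interval` keeps every new turning point at least that many samples away from both ends of its segment.

```python
import math

def orthogonal_distance(p, a, b):
    """Distance from p to its projection on the line through a and b."""
    ab = [y - x for x, y in zip(a, b)]
    ap = [y - x for x, y in zip(a, p)]
    denom = sum(c * c for c in ab)
    u = sum(x * y for x, y in zip(ap, ab)) / denom if denom else 0.0
    return math.dist(p, [x + u * c for x, c in zip(a, ab)])

def segment(points, needs_division, min_interval):
    """Return the sorted indices of the key turning points (ends included).

    points:         list of coordinate tuples, one per sample
    needs_division: callable (lo, hi) -> bool deciding whether the segment
                    points[lo..hi] should be divided further
    min_interval:   minimum number of samples between turning points
    """
    last = len(points) - 1
    turning = [0, last]
    stack = [(0, last)]
    while stack:
        lo, hi = stack.pop()
        if not needs_division(lo, hi):
            continue
        best, best_d = None, 0.0
        # only candidates far enough from both ends are considered
        for k in range(lo + min_interval, hi - min_interval + 1):
            d = orthogonal_distance(points[k], points[lo], points[hi])
            if d > best_d:
                best, best_d = k, d
        if best is None:  # no admissible candidate: stop dividing here
            continue
        turning.append(best)
        stack.extend([(lo, best), (best, hi)])
    return sorted(turning)
```

The loop ends when no remaining segment both needs division and has an admissible candidate point, which mirrors the "until no data segment to be divided remains" condition.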
Under the minimum-time-interval constraint, the invention performs time series division only on three kinds of data segment: weakly correlated data segments; overlong data segments whose correlation is not significant; and overlong data segments whose correlation is significant but whose sub-segments remain strongly correlated after division. It obtains the key turning points of the data segments to be divided and represents the original time series in piecewise linear form, thereby avoiding both overfitting and missed segmentation.
(4) A correlation coefficient calculation module, used for calculating the correlation coefficient and the correlation trend of each segment.
As shown in Fig. 7, the correlation coefficient calculation module includes:
a segment correlation coefficient calculation module, used for dividing the time series at the finally obtained key turning points and calculating the correlation coefficient of each segment in the time series using the correlation coefficient formula;
and a correlation coefficient trend determining module, used for performing a one-sided hypothesis test on the correlation between the variables, setting a significance level, confirming the correlation between the variables according to the one-sided hypothesis test result and the significance level, and determining the correlation coefficient trend.
The method calculates the correlation coefficient of each segment in the time series using the correlation coefficient formula, performs a one-sided hypothesis test on the correlation between the variables at a set significance level, confirms the correlation between the variables from the test result and the significance level, and determines the trend of the correlation coefficient. This provides accurate data for obtaining the abnormal data segments and their related information, and further improves alarm efficiency.
(5) An abnormal data acquisition module, used for obtaining the abnormal data segments and their related information by comparing the correlation trend with the actual trend.
Further, the system also comprises a minimum time interval calculation module, used for: acquiring each lower inflection point of the alarm variable over the current preset time length, and computing the distances between adjacent lower inflection points to form an array d;
sorting the array d and removing repeated elements to obtain an array d0; calculating the distance dm between the point of maximum slope change in the array d0 and the nearest lower inflection point, where dm is the minimum time interval of the alarm variable;
acquiring the lower inflection points of the related variable and the distances between adjacent lower inflection points, and thereby obtaining the minimum time interval dh of the related variable;
taking the larger of dm and dh as the minimum time interval of the binary time series T'.
The minimum time interval is used to reduce the influence of noise on the segmentation result. When actual industrial process data are processed, noise interference can make the time interval between key points too short, so the key point search is not performed within the neighborhood defined by the minimum time interval.
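The minimum-time-interval calculation can be sketched as follows. The description of dm is terse, so the reading used here is an assumption: lower inflection points are taken to be strict local minima, and dm is taken to be the element of the sorted, de-duplicated spacing array d0 at which the discrete slope changes most. All names are hypothetical.

```python
def lower_inflection_points(xs):
    """Indices of strict local minima (assumed meaning of 'lower inflection point')."""
    return [k for k in range(1, len(xs) - 1) if xs[k - 1] > xs[k] < xs[k + 1]]

def spacing_array(xs):
    """Sorted, de-duplicated distances between adjacent lower inflection points (d0)."""
    idx = lower_inflection_points(xs)
    d = [b - a for a, b in zip(idx, idx[1:])]
    return sorted(set(d))

def knee_value(d0):
    """Element of d0 where the slope of the sorted spacings changes most.
    Falls back to the smallest spacing when d0 has fewer than three entries."""
    if len(d0) < 3:
        return d0[0] if d0 else 0
    change = [abs(d0[k + 1] - 2 * d0[k] + d0[k - 1]) for k in range(1, len(d0) - 1)]
    return d0[1 + change.index(max(change))]

def minimum_time_interval(alarm, related):
    """Larger of the two per-variable minimum intervals dm and dh."""
    dm = knee_value(spacing_array(alarm))
    dh = knee_value(spacing_array(related))
    return max(dm, dh)
```

Taking the larger of the two values means the segmentation respects the noisier of the two variables.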
The invention uses the correlation among industrial variables as the feature for judging whether the operating-point state is abnormal. It establishes a binary time series for the associated variables and combines a time series segmentation method with a correlation coefficient trend method, which reduces overfitting and missed segmentation as far as possible and allows abnormal data segments to be obtained from historical data quickly, automatically and accurately. Abnormal data detection of this kind provides favorable conditions for designing dynamic alarm thresholds in a multivariable alarm system, reduces nuisance alarms, improves the alarm-handling efficiency of field operators, and helps ensure production safety.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, they are not intended to limit the scope of the present invention, and those skilled in the art should understand that various modifications and variations can be made on the basis of the technical solution of the present invention without inventive effort.
Claims (8)
1. A method for detecting alarm association variables based on correlation is characterized in that the method is completed in a server or a processor and specifically comprises the following steps:
step 1: extracting alarm variables with preset time length and data of a plurality of related variables related to the alarm variables from historical detection data, and selecting one group of alarm variables and related variables as detection objects;
step 2: judging the dynamic delay relation between the selected alarm variable and the related variable, and further establishing a binary time sequence T and standardizing the binary time sequence T to be T';
step 3: segmenting the binary time series T' under the constraint of a minimum time interval caused by noise;
step 4: obtaining a correlation coefficient and a correlation trend of each segment;
step 5: comparing the correlation trend with the actual trend to obtain an abnormal data segment and related information thereof;
the process of segmenting the binary time sequence T' in the step 3 includes:
taking the binary time sequence T' as a data segment to be divided;
judging the classification attribute of the data segment to which the data segment to be divided belongs according to the correlation coefficient among the data in the data segment to be divided;
and according to the preset correlation coefficient range, the data segment classification attribute comprises a weak correlation data segment, a middle correlation data segment and a strong correlation data segment.
2. The method according to claim 1, wherein in the step 2, if there is a dynamic delay relationship between the alarm variable and the related variable, the alarm variable or the related variable of the preset time length is translated while the dynamic delay relationship between the alarm variable and the related variable is kept unchanged; if no dynamic delay relationship exists between the alarm variable and the related variable, no translation is required.
3. The method of relevance based alarm correlation variable detection according to claim 1, further comprising, before step 3: calculating the minimum time interval caused by noise, wherein the specific process comprises the following steps:
(3.1.1) acquiring each lower inflection point of the alarm variable of the current preset time length, and solving the distance between adjacent lower inflection points to further form an array d;
(3.1.2) sorting the array d and removing repeated elements to obtain an array d0; calculating the distance dm between the point with the maximum slope change of the array d0 and the nearest lower inflection point, wherein dm is the minimum time interval in the alarm variable;
(3.1.3) repeating steps (3.1.1) and (3.1.2) on the related variable to obtain a minimum time interval dh for the related variable;
(3.1.4) the larger of dm and dh is taken as the minimum time interval of the binary time series T'.
4. The method for detecting alarm correlation variable based on correlation according to claim 1, wherein the process of segmenting the binary time series T' in the step 3 further comprises:
regarding the data segment to be divided, using a linear interpolation method to take the projection of the data points in the binary time sequence T' on the head and tail data point connecting line of the segment to which the data points belong as fitting points;
finding the farthest point by using the orthogonal distance as the key turning point of the next segmentation, then determining whether the data segment to be segmented has a strongly correlated data segment, and updating the key turning point;
and repeating the steps until the data segment to be divided does not exist any more.
5. The method for detecting alarm correlation variable based on correlation according to claim 4, wherein the specific process of solving the correlation coefficient and the correlation trend of each segment in the step 4 comprises:
(4.1): dividing the time sequence according to the finally obtained key turning point, and calculating the correlation coefficient of each segment in the time sequence by using a correlation coefficient formula;
(4.2): performing a one-sided hypothesis test on the correlation between the variables, setting a significance level, confirming the correlation between the variables according to the one-sided hypothesis test result and the significance level, and determining the trend of the correlation coefficient.
6. A correlation-based alarm correlation variable detection system, comprising:
the data extraction module is used for extracting alarm variables of a preset time length and data of a plurality of related variables related to the alarm variables from historical detection data, and selecting one group of alarm variables and related variables as detection objects;
the time sequence establishing module is used for judging the dynamic delay relation between the selected alarm variable and the related variable, and further establishing a binary time sequence T and standardizing the binary time sequence T into T';
a time series segmentation module for segmenting the binary time series T' under the constraint of a minimum time interval caused by noise;
the correlation coefficient calculating module is used for calculating the correlation coefficient and the correlation trend of each segment;
the abnormal data acquisition module is used for comparing the correlation trend with the actual trend to acquire an abnormal data segment and related information thereof;
in the time sequence establishing module, if a dynamic delay relation exists between the alarm variable and the related variable, the alarm variable or the related variable of the preset time length is translated while the dynamic delay relation between the alarm variable and the related variable is kept unchanged; if no dynamic delay relation exists between the alarm variable and the related variable, translation is not needed;
further, the system further comprises: the minimum time interval calculation module is used for acquiring each lower inflection point of the alarm variable of the current preset time length, and solving the distance between adjacent lower inflection points to further form an array d;
sorting the array d and removing repeated elements to obtain an array d0; calculating the distance dm between the point with the maximum slope change of the array d0 and the nearest adjacent lower inflection point, wherein dm is the minimum time interval in the alarm variable;
acquiring lower inflection points of the related variables and the distance between adjacent lower inflection points, and further acquiring the minimum time interval dh of the related variables;
taking the larger value of dm and dh as the minimum time interval of the binary time sequence;
further, the time series segmentation module includes: the data segment to be divided acquisition module is used for taking the binary time sequence T' as a data segment to be divided;
judging the classification attribute of the data segment to which the data segment to be divided belongs according to the correlation coefficient among the data in the data segment to be divided;
and according to the preset correlation coefficient range, the data segment classification attribute comprises a weak correlation data segment, a middle correlation data segment and a strong correlation data segment.
7. The correlation-based alarm association variable detection system of claim 6 wherein said time series segmentation module further comprises:
the fitting point solving module is used for taking the projection of the data points in the standardized binary time sequence T' on the head and tail data point connecting line of the segment to which the data points belong as the fitting points by utilizing a linear interpolation method aiming at the data segment to be divided;
and the key turning point calculation updating module is used for finding the farthest point by utilizing the orthogonal distance to serve as the key turning point of the next segmentation, then determining whether the data segment to be segmented has a strongly correlated data segment, and updating the key turning point until the data segment to be segmented does not exist any more.
8. The correlation-based alarm correlation variable detection system of claim 7 wherein the correlation coefficient derivation module comprises:
the segment correlation coefficient calculation module is used for dividing the time sequence according to the finally obtained key turning point and calculating the correlation coefficient of each segment in the time sequence by using a correlation coefficient formula;
and the correlation coefficient trend determining module is used for performing a one-sided hypothesis test on the correlation between the variables, setting a significance level, confirming the correlation between the variables according to the one-sided hypothesis test result and the significance level, and determining the correlation coefficient trend.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710206963.0A CN106778053B (en) | 2017-03-31 | 2017-03-31 | A kind of alert correlation variable detection method and system based on correlation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710206963.0A CN106778053B (en) | 2017-03-31 | 2017-03-31 | A kind of alert correlation variable detection method and system based on correlation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106778053A | 2017-05-31 |
CN106778053B | 2019-04-09 |
Family
ID=58965942
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710206963.0A Expired - Fee Related CN106778053B (en) | 2017-03-31 | 2017-03-31 | A kind of alert correlation variable detection method and system based on correlation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106778053B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | | Granted publication date: 20190409; Termination date: 20200331 |