CN116401545A - Multimode fusion type turbine runout analysis method - Google Patents
- Publication number
- CN116401545A (application CN202310322168.3A)
- Authority
- CN
- China
- Prior art keywords
- value
- runout
- data
- model
- predicted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/2148—Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F03—MACHINES OR ENGINES FOR LIQUIDS; WIND, SPRING, OR WEIGHT MOTORS; PRODUCING MECHANICAL POWER OR A REACTIVE PROPULSIVE THRUST, NOT OTHERWISE PROVIDED FOR
- F03B—MACHINES OR ENGINES FOR LIQUIDS
- F03B13/00—Adaptations of machines or engines for special use; Combinations of machines or engines with driving or driven apparatus; Power stations or aggregates
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F03—MACHINES OR ENGINES FOR LIQUIDS; WIND, SPRING, OR WEIGHT MOTORS; PRODUCING MECHANICAL POWER OR A REACTIVE PROPULSIVE THRUST, NOT OTHERWISE PROVIDED FOR
- F03B—MACHINES OR ENGINES FOR LIQUIDS
- F03B15/00—Controlling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
- G06F18/15—Statistical pre-processing, e.g. techniques for normalisation or restoring missing data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
- G06F18/2113—Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2431—Multiple classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/27—Regression, e.g. linear or logistic regression
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E10/00—Energy generation through renewable energy sources
- Y02E10/20—Hydro energy
Abstract
The invention discloses a multi-model fusion method for analyzing hydraulic turbine runout, comprising the following steps: S1, acquiring historical runout-related data of the unit and preprocessing it; S2, performing feature engineering on the runout-related data to obtain a training set; S3, extracting the runout-related data in the training set and using it to train an SVR model, a LightGBM model and an XGBoost model separately; S4, performing a least-squares fit on the outputs of the three models to obtain their weights and form a fusion model; S5, inputting online-monitored runout-related data into the fusion model to obtain a predicted runout value, which serves as the standard runout value under the current operating condition; S6, comparing the collected runout data with the standard value and marking abnormal data and abnormality grades. The method takes multidimensional influencing factors and the operating condition of the turbine into account, trains three models separately, and determines their weights automatically, which improves the accuracy and rigor of the prediction.
Description
Technical Field
The invention belongs to the technical field of hydroelectric generator operation analysis, and particularly relates to a multi-model fusion method for analyzing hydraulic turbine runout.
Background
A hydroelectric generating unit is a large rotating machine. Vibration during operation is ubiquitous and cannot be completely avoided or eliminated, and severe vibration affects the unit's power supply quality, safe operation and service life. Under the combined influence of coupled mechanical, hydraulic and electromagnetic factors and the aging of mechanical components, most faults of a hydroelectric generating unit manifest as runout, so the runout signal directly reflects the operating state of the unit.
The existing supervisory control systems and online monitoring systems of hydroelectric generating units monitor key indicators and set alarm limits for them. However, to avoid false alarms the limits are set high, so a serious fault may already have occurred by the time a unit reaches the alarm limit. Even within the stable operating region, each monitored indicator fluctuates with operating conditions such as water head and excitation current, so the rate of change computed directly from the monitored values still cannot reflect the true state of the equipment. With the application of new technologies such as artificial intelligence and big data analysis, trend analysis by means of intelligent algorithms has become feasible, and hydropower plants need to move from traditional manual monitoring and manual decision-making to informatized, automated and intelligent machine decision-making.
Defects and deficiencies of the prior art:
1. At present, most hydroelectric generating units are equipped with a considerable number of online monitoring systems, but no standardized methods for operating, using and maintaining them have been established. Insufficient importance is attached to their application, and the acquired data lack in-depth analysis by dedicated personnel and technical support from specialists.
2. At present, condition monitoring and early warning for hydro-generators rely on fixed thresholds and rate-of-change calculations, which leads to false alarms and late warnings.
3. Owing to the coupled influence of factors such as the operating environment and local impacts, the runout monitoring signal of a hydroelectric generating unit often exhibits complex non-stationary and nonlinear characteristics, so existing methods struggle to predict the trend of the runout signal with satisfactory accuracy.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a multi-model fusion method for analyzing hydraulic turbine runout. The method analyzes and predicts turbine runout data and raises accurate, real-time alarms on abnormal runout values, thereby supporting fault diagnosis and decision-making.
The technical purpose of the invention is achieved by the following technical solution: a multi-model fusion method for analyzing hydraulic turbine runout, specifically comprising the following steps:
S1, acquiring historical runout-related data of the unit and preprocessing it;
S11, collecting historical online and offline monitoring data, removing erroneous data and data unrelated to runout, and assembling a preliminary data set for training;
S12, cleaning duplicate, zero-valued and missing data from the data set, and resampling;
S2, performing feature engineering on the preprocessed runout-related data to obtain a training set;
S21, performing correlation analysis between the feature attributes and the target attribute using the Pearson correlation coefficient, and selecting the required features according to the correlation ranking;
S22, for each feature column, finding its maximum value max and minimum value min;
S23, if min ≥ 0, each column of data is normalized as follows:
where $x_{i,j}$ is the feature value in column i and row j, $x'_{i,j}$ is its normalized value, $\min_i$ is the minimum of column i, and $\max_i$ is the maximum of column i;
S24, if min < 0, each column of data is normalized as follows:
where $x_{i,j}$ is the feature value in column i and row j, $x'_{i,j}$ is its normalized value, $\min_i$ is the minimum of column i, and $\max_i$ is the maximum of column i;
S25, splitting the normalized data into a training set and a test set at a ratio of 8:2, and separating the feature values from the target values to obtain the training set;
S3, extracting the runout-related data in the training set and using it to train an SVR model, a LightGBM model and an XGBoost model separately;
S4, performing a least-squares fit on the outputs of the three models to obtain their weights and form a fusion model;
S5, inputting online-monitored runout-related data into the trained fusion model to obtain a predicted runout value, which serves as the standard runout value under the current operating condition;
S6, comparing the collected runout data with the standard value and marking abnormal data and abnormality grades.
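The cleaning and resampling of step S12 can be illustrated with a minimal Python sketch. The text does not specify the resampling scheme, so averaging fixed-size groups of consecutive samples is an assumption made here for illustration, and the function name `clean_and_resample` and the `(timestamp, value)` row layout are hypothetical.

```python
from statistics import mean

def clean_and_resample(rows, window=2):
    """Clean (timestamp, value) rows and resample them.

    Assumption: resampling means averaging each group of `window`
    consecutive cleaned samples; the patent does not fix the scheme.
    """
    seen = set()
    cleaned = []
    for ts, value in rows:
        # Drop missing and zero-valued samples.
        if value is None or value == 0:
            continue
        # Drop exact repeats of an earlier sample.
        if (ts, value) in seen:
            continue
        seen.add((ts, value))
        cleaned.append((ts, value))
    cleaned.sort(key=lambda row: row[0])
    # Average consecutive groups; each group keeps its first timestamp.
    resampled = []
    for start in range(0, len(cleaned), window):
        chunk = cleaned[start:start + window]
        resampled.append((chunk[0][0], mean(v for _, v in chunk)))
    return resampled
```

In practice the window would be chosen to match the sampling rates of the different monitoring sources.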
Preferably, in step S6, when the vibration and swing values are below 40 μm: if the amplitude exceeds the predicted value by 10 μm or more, a second-level amplitude anomaly is determined; if the amplitude exceeds the predicted value by more than 20 μm, a first-level amplitude anomaly is determined.
Preferably, in step S6, when the vibration and swing values are above 40 μm: if the amplitude exceeds the predicted value by 10% to 25%, a second-level amplitude anomaly is determined; if the amplitude exceeds the predicted value by more than 25%, a first-level amplitude anomaly is determined.
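The two grading rules above can be sketched as a single Python function. Several details are assumptions rather than statements of the text: a measured value of exactly 40 μm is handled by the percentage rule, the percentages are taken relative to the predicted value, and the function name `grade_runout` and the returned labels are hypothetical.

```python
def grade_runout(measured_um, predicted_um):
    """Grade a measured runout value against the model prediction.

    Returns "first-level", "second-level" or "normal".
    Assumptions: exactly 40 um falls under the percentage rule, and
    percentages are computed relative to the predicted value.
    """
    excess = measured_um - predicted_um
    if measured_um < 40.0:
        # Absolute thresholds in micrometres.
        if excess > 20.0:
            return "first-level"
        if excess >= 10.0:
            return "second-level"
    elif predicted_um > 0:
        # Relative thresholds as a fraction of the predicted value.
        ratio = excess / predicted_um
        if ratio > 0.25:
            return "first-level"
        if ratio >= 0.10:
            return "second-level"
    return "normal"
```

For example, a measured swing of 35 μm against a predicted 20 μm exceeds the prediction by 15 μm and would be graded as a second-level anomaly.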
Compared with the prior art, the invention has the following beneficial effects:
1. The multi-model fusion method for analyzing hydraulic turbine runout takes multidimensional influencing factors and the operating condition of the turbine into account, yielding more accurate predictions and more reliable alarm output.
2. The method trains three models separately and determines their weights automatically, which safeguards the accuracy of the prediction model and improves the accuracy and rigor of the prediction.
3. The method can be integrated into an online monitoring system to predict data in real time, overcoming the limitation that current online monitoring devices issue early warnings only when a runout value exceeds its limit.
Drawings
FIG. 1 is a flow chart of one embodiment of the present invention.
FIG. 2 is a schematic diagram of SVR model support vector regression in accordance with one embodiment of the present invention.
FIG. 3 shows the model prediction results for the water guide swing in the X direction in one embodiment of the present invention.
FIG. 4 shows the model prediction results for the water guide swing in the Y direction in one embodiment of the present invention.
FIG. 5 shows the model prediction results for the top cover vibration in the X direction in one embodiment of the present invention.
FIG. 6 shows the model prediction results for the top cover vibration in the Y direction in one embodiment of the present invention.
FIG. 7 shows the model prediction results for the top cover vibration in the Z direction in one embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
As a preferred embodiment of the present invention, this embodiment provides a multi-model fusion method for analyzing hydraulic turbine runout which, referring to FIG. 1, specifically comprises the following steps:
S1, acquiring historical runout-related data of the unit, including potentially relevant quantities such as water head, power, excitation and guide vane opening, and preprocessing it;
S11, collecting historical online and offline monitoring data, removing erroneous data and data unrelated to runout, and assembling a preliminary data set for training;
S12, cleaning duplicate, zero-valued and missing data from the data set, and resampling;
S2, performing feature engineering on the preprocessed runout-related data to obtain a training set;
S21, performing correlation analysis between the feature attributes and the target attribute using the Pearson correlation coefficient, and selecting the required features according to the correlation ranking;
S22, for each feature column, finding its maximum value max and minimum value min;
S23, if min ≥ 0, each column of data is normalized as follows:
where $x_{i,j}$ is the feature value in column i and row j, $x'_{i,j}$ is its normalized value, $\min_i$ is the minimum of column i, and $\max_i$ is the maximum of column i;
S24, if min < 0, each column of data is normalized as follows:
where $x_{i,j}$ is the feature value in column i and row j, $x'_{i,j}$ is its normalized value, $\min_i$ is the minimum of column i, and $\max_i$ is the maximum of column i;
S25, splitting the normalized data into a training set and a test set at a ratio of 8:2, and separating the feature values from the target values to obtain the training set;
S3, extracting the runout-related data in the training set and using it to train an SVR model, a LightGBM model and an XGBoost model separately;
S4, to increase robustness, performing a least-squares fit on the outputs of the three models to obtain their weights and form a fusion model; weighting the three models in this way brings the predictions closer to the actual behavior and yields a model that predicts runout accurately;
S5, inputting online-monitored runout-related data into the trained fusion model to obtain a predicted runout value, which serves as the standard runout value under the current operating condition;
S6, comparing the collected runout data with the standard value and marking abnormal data and abnormality grades.
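The feature engineering of steps S21 to S25 can be sketched in plain Python as follows. The normalization formulas of S23 and S24 are not reproduced in this text, so the sketch assumes ordinary min-max scaling to [0, 1] for every column and does not attempt to reproduce the separate rule for columns with a negative minimum. It also assumes an ordered, unshuffled 8:2 split. All function names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(columns, target, k):
    """Keep the k feature names with the largest |r| against the target."""
    ranked = sorted(columns,
                    key=lambda name: abs(pearson(columns[name], target)),
                    reverse=True)
    return ranked[:k]

def min_max(values):
    """Assumed normalization: standard min-max scaling to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def split_80_20(rows):
    """Split rows 8:2 in their existing order (no shuffling assumed)."""
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]
```

Ranking by the absolute value of the correlation keeps strongly negatively correlated features as well, which is a design choice of this sketch rather than something the text states.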
In the above embodiment, SVR stands for support vector regression, which applies the support vector machine (SVM) to regression problems. It fits the samples with a linear function in a vector space. The model takes the total distance from the actual positions of all samples to the linear function as the loss and obtains the parameters of the linear function by minimizing this loss.
LightGBM (Light Gradient Boosting Machine) is a distributed gradient boosting framework based on decision tree algorithms. To meet the industrial need for shorter model computation times, LightGBM follows two main design ideas: reduce memory usage so that a single machine can handle as much data as possible without sacrificing speed; and reduce communication cost so that multi-machine parallel training is more efficient and achieves linear speedup.
XGBoost stands for eXtreme Gradient Boosting, an optimized distributed gradient boosting library designed to be efficient, flexible and portable. Building on gradient-boosted trees, XGBoost redefines the loss function and the weak learners and improves how the boosting algorithm combines them, balancing computation speed against model quality.
The training set produced by feature engineering is input into the SVR, LightGBM and XGBoost models for training, so that each model learns the relationship between the runout-related data (water head, power, excitation, guide vane opening and so on) and the runout data, and the resulting model can predict the runout value to a certain extent.
In the above embodiment, let the sample data be (x, y), the model output be f(x) and the true value be y. A conventional regression model takes the difference between f(x) and y as the loss, and the loss is 0 only when f(x) equals y. SVR instead sets a tolerance ε and computes the absolute difference between f(x) and y; a loss is counted only when this absolute difference exceeds ε. Graphically (see FIG. 2), a margin of width ε is placed on each side of f(x), and samples falling within the band between the two margins are regarded as correctly predicted.
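The tolerance band described above corresponds to the ε-insensitive loss commonly used in support vector regression. A minimal sketch, with the hypothetical function name `epsilon_insensitive_loss`:

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss is zero inside the band |y - f(x)| <= epsilon and grows
    linearly with the distance outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

With ε = 0.5, a prediction of 10.3 for a true value of 10.0 incurs no loss, while a prediction of 12.0 incurs a loss of 1.5.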
In its derivation, SVR takes into account the slack variables ξ, the penalty coefficient C and the ε-insensitive loss function. The derivation finally yields the functional form of SVR:
where w is the weight vector in the high-dimensional space, b is the threshold, $\varphi(x_i)$ is a nonlinear mapping, and ε is the parameter of the insensitive loss function. The regularization term flattens the fitted function and thereby improves the generalization ability of the model. The penalty parameter C controls how strongly samples whose error exceeds the given value ε are penalized, balancing fitting accuracy against model complexity; it is normally a positive number. ε expresses the error requirement placed on the regression model. The slack variables $\xi$ and $\xi^*$ control the upper and lower bounds of the output value.
When the problem cannot be solved in the original sample space, a kernel function $K(x_i, x_j)$ is introduced, and the solution becomes:
where $\alpha_i$ and $\alpha_i^*$ are the coefficients to be determined, and $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$ is a symmetric positive real function.
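The two SVR formulas referred to above are not reproduced in this text. For orientation, the textbook ε-SVR formulation uses the same symbols; it is given here as an assumption about what the omitted formulas contain, not as a quotation from the patent. The sample count N is introduced for illustration.

```latex
% Primal problem (textbook epsilon-SVR)
\min_{w,\,b,\,\xi,\,\xi^*} \quad \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^* \right)
\quad \text{s.t.} \quad
\begin{cases}
y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i \\
w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\
\xi_i,\ \xi_i^* \ge 0, \quad i = 1, \dots, N
\end{cases}

% Regression function after introducing the kernel
f(x) = \sum_{i=1}^{N} \left( \alpha_i - \alpha_i^* \right) K(x_i, x) + b
```

In this form the term $\frac{1}{2}\lVert w \rVert^2$ is the regularizer that keeps the fitted function flat, and only samples with $\alpha_i - \alpha_i^* \neq 0$ (the support vectors) contribute to the prediction.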
LightGBM is a boosting ensemble learning method and an efficient implementation of the gradient boosting decision tree (GBDT) algorithm framework. The GBDT algorithm proceeds as follows: given the training set $\{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, initialize the classifier with $h_0(x)$, the first base learner chosen by the user, and set the training target to T base learners. Each base learner is computed as follows:
1) Compute the negative gradient $\tilde{y}_i$ of the current loss function:
2) Fit the negative gradient to obtain the parameters $w_t$ of the current base learner $h_t$:
3) Minimize the loss function to obtain the weight $\alpha_t$ of the current base learner:
The final classifier $F_t(x)$ is the weighted sum of the base learners:
$F_t(x) = F_{t-1}(x) + \alpha_t h_t(x; w_t)$ (4)
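Formulas (1) to (3) of the three steps are not reproduced in this text. The standard GBDT expressions that are consistent with formula (4) are given below as an assumption about their content; L denotes the loss function, which the text does not name.

```latex
% (1) negative gradient of the loss at the current model
\tilde{y}_i = -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{t-1}}, \quad i = 1, \dots, N

% (2) fit the base learner to the negative gradient
w_t = \arg\min_{w} \sum_{i=1}^{N} \bigl( \tilde{y}_i - h_t(x_i; w) \bigr)^2

% (3) line search for the weight of the base learner
\alpha_t = \arg\min_{\alpha} \sum_{i=1}^{N} L\bigl( y_i,\ F_{t-1}(x_i) + \alpha\, h_t(x_i; w_t) \bigr)
```

Each of the three steps sums over all N training samples, which is why every boosting iteration requires passes over the full training data.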
As formulas (1)-(4) show, the GBDT algorithm must traverse the entire training data several times in each iteration. Loading all training data into memory limits the size of the training data, while not loading it requires repeated reads and writes, which consumes a great deal of computation time.
To solve this problem, LightGBM extends the traditional GBDT algorithm with histogram-based feature binning, gradient-based one-side sampling (GOSS), exclusive feature bundling (EFB) and a leaf-wise growth strategy. These optimizations give the algorithm faster training and lower memory consumption, which makes LightGBM well suited to massive data sets such as runout data.
XGBoost is an ensemble learning method that improves model performance by iteratively adding weak learners. In each iteration, XGBoost adds a new model to the existing one and uses it to fit the residual between the previous model's prediction and the true value, thereby improving the prediction. For runout prediction, XGBoost uses regression trees as base learners, and the tree ensemble can be expressed as follows:
where $x_i$ is the feature vector of the i-th input; $\hat{y}_i$ is the predicted runout value of the i-th sample; K is the number of regression trees; R is the space of regression trees; and $f_k$, a function in R, is the output of the k-th base learner.
By accumulating the results of the iterative process, the objective function of XGBoost can be written as:
where $l(y_i, \hat{y}_i)$ is the error between the predicted and true results, $\sum_k \Omega(f_k)$ is the regularization term of the objective function, and $\Omega(f_k)$ is given by:
where T is the number of leaf nodes; γ is the penalty coefficient that controls the number of leaf nodes; $\omega_j$ is the weight of leaf node j; and λ is the coefficient of the regularization penalty term. Finally, combining the iteration results of XGBoost and performing a second-order Taylor expansion at $f_k = 0$ yields the optimal value of the objective function.
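The three XGBoost formulas referred to above are not reproduced in this text. The standard expressions that match the symbols described are given below as an assumption about their content; n, the number of samples, is introduced for illustration.

```latex
% Tree ensemble
\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in R

% Objective function
\mathrm{Obj} = \sum_{i=1}^{n} l\bigl(y_i, \hat{y}_i\bigr) + \sum_{k=1}^{K} \Omega(f_k)

% Regularization term of a single tree
\Omega(f_k) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} \omega_j^2
```

The first term of Ω penalizes trees with many leaves and the second penalizes large leaf weights, so both γ and λ act against overfitting.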
When the outputs of the three models are fitted by least squares, let the SVR prediction be $y_1$, the LightGBM prediction be $y_2$ and the XGBoost prediction be $y_3$. The predicted value $y'$ of the fusion model is then defined as:
$y' = \beta y_1 + \gamma y_2 + \lambda y_3$
where β, γ and λ are the shares of the SVR, LightGBM and XGBoost models in the combined weight, respectively (these γ and λ are unrelated to the XGBoost regularization coefficients above), and satisfy:
$\beta + \gamma + \lambda = 1$
The combined weights are obtained by solving the following optimization problem:
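The optimization formula itself is not reproduced in this text. One plausible reading is an ordinary least-squares fit of the fused prediction to the true values under the constraint that the weights sum to 1. The sketch below implements that reading; the function name `fit_fusion_weights` is hypothetical. Substituting λ = 1 − β − γ reduces the problem to two unknowns, which are solved from the 2×2 normal equations.

```python
def fit_fusion_weights(y1, y2, y3, y):
    """Least-squares weights (beta, gamma, lam) with beta + gamma + lam = 1.

    Assumption: the weights minimise the sum of squared differences
    between the fused prediction and the true values y.
    """
    # With lam = 1 - beta - gamma the model becomes
    #   y - y3 = beta * (y1 - y3) + gamma * (y2 - y3)
    a = [p - r for p, r in zip(y1, y3)]
    b = [q - r for q, r in zip(y2, y3)]
    t = [v - r for v, r in zip(y, y3)]
    # Entries of the 2x2 normal equations.
    aa = sum(u * u for u in a)
    bb = sum(v * v for v in b)
    ab = sum(u * v for u, v in zip(a, b))
    at = sum(u * w for u, w in zip(a, t))
    bt = sum(v * w for v, w in zip(b, t))
    det = aa * bb - ab * ab
    if abs(det) < 1e-12:
        raise ValueError("model outputs are collinear; weights are not unique")
    # Cramer's rule for the two free weights.
    beta = (at * bb - ab * bt) / det
    gamma = (aa * bt - ab * at) / det
    return beta, gamma, 1.0 - beta - gamma
```

This sketch does not force the weights to be non-negative; if that is required, a constrained solver would be needed instead.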
In a verification test, the water guide swing and the top cover vibration of the turbine were selected as prediction targets. Forty-nine relevant features were screened using expert experience, and 25 highly correlated features were then selected as the feature set by Pearson correlation analysis. The historical data were split into training and test data, and the SVR, LightGBM and XGBoost models were each trained and tested. To increase robustness, the three models were then fused by least-squares weighting; the resulting fusion weights are shown in the following table:
One hundred points were selected from the predictions of all models for plotting; the experimental results are shown in FIGS. 3-7.
FIGS. 3-7 show that the predictions of the four models (the three individual models and the fusion model) on the test set follow the trend of the true values, indicating that all four methods predict runout well. To evaluate the models further, the coefficient of determination R2 was used. The R2 score reflects the proportion of the total variation of the dependent variable that the independent variables explain through the regression relationship, and is expressed as:
$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} = 1 - \frac{\mathrm{MSE}(y, \hat{y})}{\mathrm{Var}(y)}$$
where $y_i$ is the actual observed value, $\bar{y}$ is the mean of the actual observations, $\hat{y}_i$ is the predicted value, MSE is the mean squared error and Var is the variance.
The R2 score normally lies in the range [0, 1]. When R2 is 1, the predicted and true values in the sample are exactly equal and there is no error. The larger R2 is, the better the independent variables explain the dependent variable in the regression analysis and the better the model fits [12]. The results are shown in the following table:
The table shows that the R2 scores of the three individual methods for the water guide swing and the top cover vibration are all above 0.95, indicating that all three models predict the runout values effectively. By comparison, the fusion model obtained by least-squares fusion reaches an R2 score above 0.98, a clear improvement in prediction accuracy.
When least squares fitting is applied to the results of the three models, the weights assigned to the SVR model, the LightGBM model and the XGBoost model may differ slightly depending on the feature set and on the training and test data. For a given feature set and given training and test data, the method yields an accurate and unique weight distribution. This is the main reason why the prediction accuracy improves after model fusion, and it is the contribution of the method to the analysis and prediction of water turbine runout.
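A least squares weight fit of this kind can be sketched as follows. This is a hedged illustration, not the patented procedure: the weights here are unconstrained and fitted without an intercept, the three prediction columns are synthetic stand-ins for the SVR, LightGBM and XGBoost outputs, and the names `fit_fusion_weights` and `fuse` are hypothetical.

```python
import numpy as np

def fit_fusion_weights(preds, y_true):
    """Solve min_w ||preds @ w - y_true||^2 for the per-model weights.

    preds: (n_samples, n_models) matrix, one column per model's predictions.
    """
    w, *_ = np.linalg.lstsq(preds, y_true, rcond=None)
    return w

def fuse(preds, w):
    """Weighted combination of the individual model predictions."""
    return preds @ w

# Synthetic "true" runout values in micrometres and three noisy predictors
# standing in for the three trained models (assumption).
rng = np.random.default_rng(0)
y = rng.uniform(20.0, 60.0, size=200)
preds = np.column_stack(
    [y + rng.normal(0.0, s, size=200) for s in (3.0, 2.0, 1.5)]
)

w = fit_fusion_weights(preds, y)
fused = fuse(preds, w)

mse_single = ((preds - y[:, None]) ** 2).mean(axis=0)  # one MSE per model
mse_fused = ((fused - y) ** 2).mean()
```

On the data used for fitting, the fused error cannot exceed the error of any single model, because giving one model a weight of 1 and the others 0 is itself a candidate solution. The gain on unseen data depends on how the models' errors correlate.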
In this embodiment, multiple models are trained separately and then fused, and the fusion model for analyzing and predicting water turbine runout is obtained by automatic weighting, which ensures the accuracy of the prediction model. Influencing factors of multiple dimensions are taken into account, in particular the operating conditions of the water turbine, including but not limited to runout-related data such as water head, power, excitation and guide vane opening. This yields more accurate prediction results, provides more reliable alarm output, and realizes real-time prediction of the data, overcoming the limitation that current online monitoring devices issue early warnings only when the runout value exceeds a fixed limit.
In some embodiments, on the basis of the above embodiments, in step S6, when the vibration value and the swing value are smaller than 40 μm: if the amplitude exceeds the predicted value by more than 10 μm, a second-level amplitude abnormality is determined; if the amplitude exceeds the predicted value by more than 20 μm, a first-level amplitude abnormality is determined.
In other embodiments, on the basis of the above embodiments, in step S6, when the vibration value and the swing value are greater than 40 μm: if the amplitude exceeds the predicted value by 10% to 25%, a second-level amplitude abnormality is determined; if the amplitude exceeds the predicted value by more than 25%, a first-level amplitude abnormality is determined.
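The two grading rules of step S6 can be sketched as one function. Several points are assumptions made only for illustration: the 40 μm test is applied to the measured value, a measured value of exactly 40 μm is handled by the percentage rule (the text does not cover this case), the percentage is taken relative to the predicted value, and the function name `grade_runout` is hypothetical.

```python
def grade_runout(measured_um, predicted_um):
    """Return 1 (first-level), 2 (second-level) or 0 (normal).

    measured_um: collected vibration or swing amplitude in micrometres.
    predicted_um: fusion-model prediction used as the standard value.
    """
    excess = measured_um - predicted_um
    if measured_um < 40.0:
        # Small amplitudes: absolute thresholds of 10 and 20 micrometres.
        if excess > 20.0:
            return 1
        if excess > 10.0:
            return 2
        return 0
    # Larger amplitudes: thresholds relative to the predicted value.
    if predicted_um <= 0.0:
        raise ValueError("predicted value must be positive for the percentage rule")
    ratio = excess / predicted_um
    if ratio > 0.25:
        return 1
    if ratio > 0.10:
        return 2
    return 0

levels = [
    grade_runout(30.0, 25.0),  # 5 um above prediction
    grade_runout(35.0, 20.0),  # 15 um above prediction
    grade_runout(39.0, 10.0),  # 29 um above prediction
    grade_runout(52.0, 50.0),  # 4 % above prediction
    grade_runout(60.0, 50.0),  # 20 % above prediction
    grade_runout(70.0, 50.0),  # 40 % above prediction
]
```

Checking the stricter first-level condition before the second-level one keeps the two levels mutually exclusive.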
The protection scope of the present invention is not limited to the examples given herein. All prior art that does not contradict the solution of the present invention, including but not limited to prior patent documents and prior publications, may be included in the protection scope of the present invention.
In addition, it should be noted that the combinations of technical features in the present invention are not limited to those described in the claims or in the specific embodiments. All technical features described in the present invention may be freely combined in any manner unless they contradict one another.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them. Any other change, modification, substitution, combination or simplification that does not depart from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included in the protection scope of the present invention.
Claims (3)
1. A multi-model fusion water turbine runout analysis method, characterized by comprising the following steps:
S1, acquiring historical unit runout related data and preprocessing the data;
S11, collecting historical online and offline monitoring data, removing data irrelevant to runout and erroneous data, and preliminarily assembling a data set for training;
S12, cleaning duplicate data, zero-value data and missing data in the data set, and resampling;
S2, performing feature engineering on the preprocessed unit runout related data to obtain a training set;
S21, performing correlation analysis between the feature attributes and the target attribute in the data set using the Pearson correlation coefficient, and selecting the required features according to the correlation ranking;
S22, for each feature column, finding its maximum value max and minimum value min;
S23, if min ≥ 0, normalizing each column of data as follows:
where $x_{i,j}$ is the feature value in column $i$, row $j$; $x'_{i,j}$ is its normalized value; $\min_i$ is the minimum value of column $i$; and $\max_i$ is the maximum value of column $i$;
S24, if min < 0, normalizing each column of data as follows:
where $x_{i,j}$ is the feature value in column $i$, row $j$; $x'_{i,j}$ is its normalized value; $\min_i$ is the minimum value of column $i$; and $\max_i$ is the maximum value of column $i$;
S25, splitting the normalized data into a training set and a test set in a ratio of 8:2, and separating the feature values from the target values to obtain the training set;
S3, extracting the runout-related data in the training set and inputting the data into an SVR model, a LightGBM model and an XGBoost model respectively for training;
S4, performing least squares fitting on the results of the three models to obtain the weight distribution of the three models and form a fusion model;
S5, inputting online monitoring runout related data into the trained fusion model to obtain a predicted runout value, and taking the predicted runout value as the standard runout value under the current working condition;
S6, comparing the collected runout data with the standard value, and marking abnormal data and abnormality levels.
2. The multi-model fusion water turbine runout analysis method according to claim 1, characterized in that, in step S6, when the vibration value and the swing value are smaller than 40 μm: if the amplitude exceeds the predicted value by more than 10 μm, a second-level amplitude abnormality is determined; if the amplitude exceeds the predicted value by more than 20 μm, a first-level amplitude abnormality is determined.
3. The multi-model fusion water turbine runout analysis method according to claim 1, characterized in that, in step S6, when the vibration value and the swing value are greater than 40 μm: if the amplitude exceeds the predicted value by 10% to 25%, a second-level amplitude abnormality is determined; if the amplitude exceeds the predicted value by more than 25%, a first-level amplitude abnormality is determined.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310322168.3A CN116401545A (en) | 2023-03-29 | 2023-03-29 | Multimode fusion type turbine runout analysis method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116401545A (en) | 2023-07-07 |
Family
ID=87015410
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310322168.3A Pending CN116401545A (en) | 2023-03-29 | 2023-03-29 | Multimode fusion type turbine runout analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116401545A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117932499A (en) * | 2024-03-21 | 2024-04-26 | 四川交通职业技术学院 | Method for monitoring abnormity of toothed rail |
CN117932499B (en) * | 2024-03-21 | 2024-05-31 | 四川交通职业技术学院 | Method for monitoring abnormity of toothed rail |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||