CN111105074A - Fault prediction method based on improved deep belief learning - Google Patents
Fault prediction method based on improved deep belief learning
- Publication number
- CN111105074A (application CN201911156833.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- learning
- error
- prediction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
Abstract
The invention provides a fault prediction method based on improved deep belief learning, comprising the steps of data acquisition, data preprocessing, test model acquisition, fault prediction, and result analysis. To address problems such as hard-to-discover features in the predicted object's data and an inconspicuous failure trend, the strong feature-extraction capability of deep learning is used to deeply mine the predicted object's data, learn the change information preceding failure, and take indexes such as remaining life or health degree as the model's learning target, so that dangerous parts are discovered before failure occurs and the failure is avoided.
Description
Technical Field
The invention relates to a fault prediction method based on improved deep belief learning, belongs to the technical field related to fault prediction, and particularly relates to a fault prediction method based on a neural network.
Background
Fault prediction is a hot research topic in universities and industry. With the development of science and technology, it is gradually replacing fault diagnosis, shifting maintenance from reactive to predictive. Moreover, the volume of acquired state data keeps growing, and traditional fault prediction methods struggle to fully exploit the feature information in such big data, so the information is expressed incompletely and the prediction results are not highly reliable. Among fault prediction techniques based on machine learning and deep learning, most perform classification prediction and few perform regression prediction; of the regression approaches, simple regression models have difficulty handling large amounts of data, while deep-learning regression limits the output range of the prediction result because of the range of the nonlinear activation function.
Disclosure of Invention
To solve these technical problems, the invention provides a fault prediction method based on improved deep belief learning. It improves on a DBN model: the output layer of the model adopts an extreme learning mode without a nonlinear activation function, so the output range of the model is not limited and credible output can be obtained during prediction.
The invention is realized by the following technical scheme.
The invention provides a fault prediction method based on improved deep belief learning, which comprises the following steps of:
① data acquisition: obtaining structured data reflecting state changes of the predicted object, and a learning model;
② data preprocessing: defining labels for the structured data to obtain label data, and preprocessing the structured data through feature engineering to obtain feature data;
③ test model acquisition: setting a prediction error value and an error function, initializing the learning model and training it on the input feature data, obtaining a fine error model from the prediction error value and the trained learning model, obtaining an error function value from the error function and the trained learning model, and obtaining the test model from the fine error model and the error function value;
④ failure prediction: obtaining a failure threshold through the test model, and deriving prediction information from the failure threshold;
⑤ result analysis: counting the prediction accuracy according to the test information.
The step ② is divided into the following steps:
(2.1) defining a label for the structured data, and acquiring label data;
(2.2) preprocessing the structured data;
(2.3) extracting characteristic parameters from the structured data;
(2.4) dividing the characteristic parameters to obtain a data set;
and (2.5) vectorizing and normalizing the data set to obtain feature data.
In step (2.4), if the data set exceeds five thousand samples, it is divided into a training set, a verification set and a test set; otherwise it is divided by k-fold cross-validation.
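The size-dependent split of step (2.4) can be sketched as follows; the 70/15/15 ratios, fold count and random seed are illustrative assumptions, not fixed by the method:

```python
import numpy as np

def split_dataset(X, y, threshold=5000, ratios=(0.7, 0.15, 0.15), k=5, seed=0):
    """Step (2.4): plain train/val/test split above the size threshold,
    otherwise k roughly equal index folds for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    if len(X) > threshold:
        n_tr = int(ratios[0] * len(X))
        n_va = int(ratios[1] * len(X))
        return {"train": idx[:n_tr],
                "val": idx[n_tr:n_tr + n_va],
                "test": idx[n_tr + n_va:]}
    # small data set: divide into k folds for k-fold cross-validation
    return {"folds": np.array_split(idx, k)}

X = np.arange(100).reshape(100, 1)
y = np.zeros(100)
parts = split_dataset(X, y)
print(len(parts["folds"]))  # 5
```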
The step ③ is divided into the following steps:
(3.1) setting a prediction error value and initializing a learning model;
(3.2) carrying out greedy unsupervised pre-learning layer by layer on the learning model according to the characteristic data to obtain a coarse error model;
(3.3) obtaining a fine error model through the coarse error model and the label data;
(3.4) acquiring the output of the learning model through the feature data and the fine error model;
(3.5) acquiring an error function value from the output of the learning model and the label data; if the error function value is smaller than the set prediction error value, learning is finished, otherwise the learning model is corrected by error back propagation until the error function value is minimal;
and (3.6) obtaining a test model according to the fine error model and the minimum error function value.
In step ①, the structured data is data with each attribute having the same format.
In step ②, the label represents the state of the predicted object reflected by the structured data.
In step (3.3), a matrix equation is solved jointly from the coarse error model and the label data, and the solution of the matrix equation serves as the hyper-parameters of the fine error model.
The matrix equation is:
Hβ=Y;
H represents the output matrix of the coarse error model, Y represents the label data matrix, and β is the minimum-norm least-squares solution.
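The minimum-norm least-squares solution β of Hβ = Y can be computed with the Moore-Penrose pseudoinverse; a sketch with illustrative matrix sizes (50 samples, 10 hidden nodes, 1 output):

```python
import numpy as np

# H: output matrix of the coarse error model (n samples x h hidden nodes)
# Y: label data matrix (n samples x m outputs)
rng = np.random.default_rng(1)
H = rng.standard_normal((50, 10))
Y = rng.standard_normal((50, 1))

# minimum-norm least-squares solution of H beta = Y via the pseudoinverse
beta = np.linalg.pinv(H) @ Y

# for comparison: lstsq returns the same least-squares solution
lstsq_beta, *_ = np.linalg.lstsq(H, Y, rcond=None)
assert np.allclose(beta, lstsq_beta)
```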
The invention has the beneficial effects that: aiming at the problems that the characteristics of the predicted object data are difficult to discover and the failure trend is not obvious, the predicted object data are deeply mined by utilizing the strong characteristic extraction capability of deep learning, the change information of the failure of the predicted object is learned, indexes such as the residual life or the health degree of the predicted object are used as the learning target of the model, and dangerous parts are discovered before the failure so as to avoid the failure.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the present invention for obtaining a test model.
Detailed Description
The technical solution of the present invention is further described below, but the scope of the claimed invention is not limited to the described.
As shown in fig. 1 and fig. 2, the fault prediction method based on improved deep belief learning comprises the steps ① to ⑤ and the sub-steps set forth above.
Furthermore, the invention can effectively solve the problems of feature extraction of a large amount of data and limited range of prediction results by utilizing a prediction method combining a deep neural network and linear regression.
Example 1
As mentioned above, the fault prediction method based on the improved deep belief learning comprises the following implementation steps:
step 1, data acquisition: acquiring structured data capable of reflecting state changes of a predicted object by any method;
step 2, data preprocessing: after the structured data of step 1 is obtained, labels are defined for the data and the data is preprocessed with feature engineering; characteristic parameters are extracted from the processed data, and the data is divided: with ample data it is split into a training set, a verification set and a test set, with little data it is divided by k-fold cross-validation; the data is then vectorized and normalized;
step 3, obtaining a test model: first a prediction error value is defined and the learning model is initialized, its input and output layers determined by the training data and label data respectively; then greedy layer-by-layer unsupervised pre-learning is carried out on part of the model's layers with the training data to obtain a learning model with a coarse error; a matrix equation is solved jointly from the output of the coarse error model and the label data, the obtained solution serving as the hyper-parameters of the other part of the learning model; finally the learning model's output is a linear expression of the coarse error model's output and those hyper-parameters.
Next, an error function of the learning model output and the labels is defined; if the error function value is below the set prediction error value, learning is finished, otherwise the learning model is corrected by error back propagation until the error function is minimal.
Finally, the generalization ability of the learning model is verified on the verification set; if it is good, the test-set data can be used to test the model's performance and to learn the labels corresponding to the test set;
step 4, failure prediction: a mapping is established between the remaining life or health degree and the data labels, and fault thresholds are divided over the label range; once the trained learning model's labels on the test set are obtained, the prediction information of each label is determined from the divided fault thresholds, a value beyond the fault threshold representing a fault;
step 5, result analysis: after the prediction information of step 4 is obtained, it is compared one by one with the original labels of the test set (or of the actual application) to evaluate the model's prediction accuracy; results falling within a given confidence interval are accepted as credible predictions, the rest are rejected, and the model's prediction accuracy is counted accordingly.
The structured data in step 1 is data in which each attribute (feature) has the same format and length, organized into rows and columns; it is also called quantitative data.
The label in step 2 represents the state of the predicted object reflected by the feature data, the state being a remaining-life label, a feature-degradation data label, or a health label of the predicted object.
The characteristic engineering in the step 2 is a general term of a series of preprocessing methods for data, and includes but is not limited to data cleaning, missing value identification and filling, and principal component analysis dimension reduction.
The characteristic parameter extraction in step 2 selects the data that most influences the prediction result, according to indexes such as the sensitivity and relevance of the data, so as to avoid model overfitting caused by data redundancy.
The overfitting indicates that the model performed well on the training set samples and poorly on the validation set and test set.
The k-fold cross-validation in step 2 divides the original data set into k subsets, uses k−2 subsets as the training set and the remaining two as the validation set and test set respectively, repeats this k times, computes the k model errors (training, validation and test errors), and takes these as the real error of the learning model.
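The rotation scheme described here (k subsets; k−2 for training, one for validation, one for testing, rotated k times) can be sketched as:

```python
import numpy as np

def k_fold_rotation(n_samples, k=5):
    """Yield (train, val, test) index groups: k subsets, with one
    validation subset and one test subset rotating through the folds
    while the remaining k-2 subsets form the training set."""
    folds = np.array_split(np.arange(n_samples), k)
    for i in range(k):
        val, test = folds[i], folds[(i + 1) % k]
        train = np.concatenate(
            [f for j, f in enumerate(folds) if j not in (i, (i + 1) % k)])
        yield train, val, test

splits = list(k_fold_rotation(100, k=5))
print(len(splits))  # 5
```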
The initializing of the learning model in the step 3 includes initial setting of a model topology structure and initial setting of relevant parameters in the model, including learning times, learning rate, interlayer connection weight matrix, bias of each node, and the like of the learning model, and the weight matrix and the bias are also called as hyper-parameters.
The topological structure is a model structure and comprises an input layer, a plurality of middle hidden layers, an output layer and the number of nodes in each layer, wherein the number of model layers is also called depth.
The partial layers of the model in said step 3 represent the part of the learning model from the input layer to the last hidden layer.
The step 3 of the non-supervision pre-learning of the greedy type layer by layer is to start from the input layer, regard the input layer and the first hidden layer as a small model and learn by using the training data without labels, which is also called encoding, and after the first hidden layer is well learned, regard the first hidden layer and the second hidden layer as a small model and learn in the same way until learning the last hidden layer.
The small model is a Restricted Boltzmann Machine (RBM), and the visible layer and the hidden layer of the small model are connected in a bidirectional mode.
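A minimal sketch of one 1-step contrastive divergence (CD-1) update for such an RBM, assuming binary units; layer sizes, batch size and learning rate are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b_h, b_v, lr=0.1, rng=np.random.default_rng(0)):
    """One CD-1 step on a batch v0 (batch x visible units)."""
    ph0 = sigmoid(v0 @ W + b_h)                       # hidden probabilities
    h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
    v1 = sigmoid(h0 @ W.T + b_v)                      # reconstruction
    ph1 = sigmoid(v1 @ W + b_h)
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)     # positive - negative phase
    b_h += lr * (ph0 - ph1).mean(axis=0)
    b_v += lr * (v0 - v1).mean(axis=0)
    return W, b_h, b_v

rng = np.random.default_rng(0)
v = (rng.random((32, 6)) < 0.5).astype(float)  # 32 binary samples, 6 visible units
W = np.zeros((6, 4)); b_h = np.zeros(4); b_v = np.zeros(6)
W, b_h, b_v = cd1_update(v, W, b_h, b_v)
```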
In step 3, the matrix equation is Hβ = Y, where H denotes the output matrix of the coarse error model, Y denotes the label matrix, and β is required to be the minimum-norm least-squares solution of the matrix equation.
Another part of the model in said step 3 represents the part between the last hidden layer to the output layer.
The hyper-parameters in the step 3 are the general names of the connection weight matrix between layers and the bias of each node in the learning model;
the linear expression in the step 3 isWhere H denotes the output matrix of the coarse error model,the matrix equation H β is expressed as a minimum norm least squares solution of Y, and O represents the expression of the label data by the learning model output layer, also referred to as the actual output of the learning model.
The error function in step 3 measures the distance between the actual output O of the learning model and the labels Y: L = (1/(mn)) Σᵢ₌₁ᵐ Σⱼ₌₁ⁿ (Oᵢⱼ − Yᵢⱼ)², where m denotes the feature dimension of the label matrix and n denotes the number of samples of the label matrix.
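A minimal sketch, assuming the error function is the mean squared distance between O and Y averaged over the m feature dimensions and n samples (the normalization constant is an assumption, as the original formula is not fully recoverable):

```python
import numpy as np

def error_function(O, Y):
    """Mean squared distance between actual output O and labels Y,
    averaged over m feature dimensions and n samples."""
    n, m = Y.shape  # n samples x m feature dimensions
    return float(np.sum((O - Y) ** 2) / (m * n))

O = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([[1.0, 2.0], [3.0, 2.0]])
print(error_function(O, Y))  # 1.0
```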
The error back propagation (BP) method in step 3 computes the influence of the error function L on each hyper-parameter of the learning model, i.e., the partial derivative of L with respect to each hyper-parameter via the chain rule for composite functions. The hyper-parameters are adjusted according to this influence together with parameters such as the learning rate, learning is repeated, and the BP method is applied again until the model error meets the requirement or the learning iterations are exhausted.
The generalization ability in the step 3 refers to the expression ability of the learning model on new data, and the better the expression ability is, which indicates that the stronger the generalization ability is, the weaker the generalization ability of the over-fitted learning model is.
The fault threshold division in step 4 defines the variation or usable range of the parameter according to the specific situation of the predicted object and related knowledge. Two thresholds (δ1, δ2) divide three intervals representing no fault, general fault and fault respectively: a label value y ≤ δ1 indicates no fault, δ1 < y < δ2 indicates a general fault, and y ≥ δ2 indicates a fault.
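The threshold division above can be sketched as follows; the concrete threshold values δ1 = 0.3 and δ2 = 0.7 are illustrative assumptions:

```python
def fault_state(y, delta1=0.3, delta2=0.7):
    """Map a label/output value to one of the three fault intervals:
    y <= delta1 -> no fault, delta1 < y < delta2 -> general fault,
    y >= delta2 -> fault."""
    if y <= delta1:
        return "no fault"
    if y < delta2:
        return "general fault"
    return "fault"

print(fault_state(0.1), fault_state(0.5), fault_state(0.9))
# no fault general fault fault
```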
The confidence interval in step 5 is a range used to decide which standard label value the actual output value of the learning model belongs to.
The prediction accuracy in step 5 refers to the degree of agreement, within the confidence interval, between the labels actually output by the learning model on the test set (or in actual application) and the original labels; when they match, the output is accepted as a credible prediction result.
Example 2
As mentioned above, the fault prediction method based on the improved deep belief learning comprises the following implementation steps:
step 1, data acquisition: structured data capable of reflecting state changes of the predicted object is acquired by any method; if the data is unstructured, it should be processed into structured data. Data reflecting state changes includes, but is not limited to, vibration, force, electromagnetism, temperature, current, voltage, level, flow, and displacement;
step 2, data preprocessing: since a supervised learning mode is adopted for training, the labels of the data, i.e., the target output of the model, are defined once the data is obtained. With multi-dimensional feature data, one or several feature dimensions can serve as the data label, or domain experts define indexes such as the remaining service life or health degree of the predicted object that the feature data can reflect; the quality of the training data greatly influences the final prediction accuracy of the learning model;
further, once the label data exist, the feature data is analyzed and processed: abnormal values are removed first, and the data is checked for missing values; when missing values exist and the data is plentiful they can be deleted, and when the data is scarce the missing parts can be filled with 0 or the mean value;
furthermore, the correlation coefficients, multicollinearity, variance and similar measures among the attributes are checked to select good attributes as features and improve data quality. Further, when the amount of data is small, k-fold cross-validation divides the original data set into k subsets, k−2 subsets serving as the training set and the remaining two as the validation set and test set respectively, repeated k times. Finally, the data features are vectorized and normalized so that the model correctly recognizes the feature data and the importance of each feature;
step 3, obtaining a test model: referring to fig. 2, after the training data (feature data and label data) are prepared, learning begins. First, the learning-error requirement of the model is defined to evaluate model quality; then the numbers of learning iterations for the two stages, pre-training and fine-tuning, are set, along with the model structure. To reduce the time spent on model adjustment as much as possible, the intermediate hidden layers and node counts are determined from empirical formulas. Further, the relevant parameters of the model are set, specifically: batch size, hyper-parameter update momentum parameter, hyper-parameter expansion rate, initial learning rates for pre-training and fine-tuning, learning-rate attenuation coefficient, and the random zeroing threshold of layer nodes;
further, the invention sets the learning rate to decrease as the number of iterations increases, i.e., learning rate = initial learning rate × learning-rate attenuation coefficient. Once ready, the feature data is used to pre-train the hyper-parameters of all layers except the output layer: the hyper-parameters in each RBM are initialized to zero, and batch-wise, layer-by-layer pre-training is carried out with the 1-step contrastive divergence method. The trained hyper-parameters are then assigned to the learning model, and forward propagation proceeds in the ordinary neural-network manner up to the last hidden layer, the nonlinear activation function of each layer being the sigmoid function. The data matrix of that layer is recorded as H; the label matrix Y is introduced, and the Moore-Penrose generalized inverse of H is used to solve the matrix equation Hβ = Y for the minimum-norm least-squares solution β, which is assigned as the connection weight between the last hidden layer and the output layer; no nonlinear activation or bias is needed in the output layer, so the output matrix of the learning model is O = Hβ. The error L between the model output O and Y is then calculated with the label data matrix Y. Because the learning model adopts batch-wise training, the gradient of the error with respect to the hyper-parameters is calculated after the error of one batch is obtained, the BP algorithm is executed, and the hyper-parameters are updated according to the gradient. The update process is: update change rate = hyper-parameter update momentum parameter × previous change rate + fine-tuning initial learning rate × hyper-parameter gradient, and hyper-parameter = hyper-parameter expansion rate × hyper-parameter − update change rate.
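The batch-wise momentum update mentioned above is partly garbled in the source, so the following is one hedged reading; the names, default values, and the exact combination of expansion rate, momentum and learning rate are assumptions:

```python
import numpy as np

def momentum_update(theta, grad, velocity, lr=0.01, momentum=0.9, expansion=1.0):
    """Assumed reading of the update: the change rate accumulates
    momentum x previous change + lr x gradient, and the parameter
    (optionally scaled by an expansion rate) moves against it."""
    velocity = momentum * velocity + lr * grad
    theta = expansion * theta - velocity
    return theta, velocity

theta = np.ones(3)
vel = np.zeros(3)
grad = np.array([1.0, -1.0, 0.0])
theta, vel = momentum_update(theta, grad, vel)
print(theta)  # [0.99 1.01 1.  ]
```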
Specifically, forward propagation of the remaining batches is carried out with the updated model; the error after one iteration is the average of the errors over the batches. After one batch iteration finishes, iteration continues; if the error does not meet the requirement, the learning model keeps being adjusted iteratively until the iterations end. If the unmet error shows a descending trend, the number of iterations is increased so that the learning model can continue adjusting from its ready state, modifying some parameters if necessary to reduce the iterations needed; if the unmet error has flattened, the structure and related parameters of the model must be changed, including the model's depth, node counts and parameters.
Further, if the error meets the requirement, the iteration loop is exited and learning stops. In the verification stage, the verification set is fed to the trained model to verify its generalization ability; if the verification error does not meet the requirement, the training data is reused to retrain the model, and if it does, the test-set data is used to test the model, the actual output on the test set serving as the input of fault prediction;
step 4, failure prediction: the essence of labeling the data is to establish a mapping relation between the data and fault information capable of reflecting the data, and when the fault information is not obvious, future data can be used as a label of historical data;
Further, when the fault state is judged, a fault-threshold judgment method is adopted: the variation or permissible range of the parameters is defined according to the specific conditions of, and relevant knowledge about, the predicted object, and the chosen thresholds delimit the data states reflected by the fault information;
Preferably, the present invention sets two thresholds δ1 and δ2: δ1 marks the boundary between no fault and general fault, and δ2 the boundary between general fault and fault. A label value y ≤ δ1 indicates no fault, δ1 < y < δ2 indicates a general fault, and y ≥ δ2 indicates a fault. Each output value of the learning model on the test set is taken as o, the interval into which o falls is judged according to this principle, and the prediction result is thereby obtained;
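A minimal sketch of this two-threshold judgment follows; the δ values here are placeholders, since in practice they come from domain knowledge of the predicted object:

```python
def classify(o, delta1=0.3, delta2=0.7):
    """Map a model output o to a fault state using the two thresholds:
    o <= delta1 -> no fault; delta1 < o < delta2 -> general fault; o >= delta2 -> fault."""
    if o <= delta1:
        return "no fault"
    elif o < delta2:
        return "general fault"
    return "fault"
```

Note the boundary convention matches the text: values equal to δ1 count as no fault, and values equal to δ2 count as fault.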
Step 5, result analysis: the output value of the learning model is generally quantitative; although the accuracy of fault prediction can be judged under the threshold judgment method, the quantitative values make it difficult to compare the model's predictions directly with the original labels when evaluating prediction accuracy;
Furthermore, a small number ε is given: if y falls into the interval [o − ε, o + ε], the predicted value o is considered consistent with y and the prediction is judged accurate and recorded as 1; otherwise it is recorded as 0. The actual prediction accuracy of the model is then the number of predictions recorded as 1 divided by the total number of predictions on the test set.
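The ε-tolerance accuracy just described can be sketched as follows (the function name and example values are illustrative):

```python
def prediction_accuracy(o_values, y_values, eps=0.05):
    """Count a prediction o as accurate when the label y falls in
    [o - eps, o + eps]; accuracy is the fraction of accurate predictions."""
    hits = [1 if (o - eps) <= y <= (o + eps) else 0
            for o, y in zip(o_values, y_values)]
    return sum(hits) / len(hits)
```

For example, with ε = 0.05, outputs [0.5, 0.2, 0.9] against labels [0.52, 0.4, 0.88] yield two hits out of three.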
In conclusion, the invention extracts features with a deep neural network, which suppresses data noise to a certain extent and is easy to implement in engineering, and it can be widely applied to fault prediction for mechanical, electronic, electromechanical, hydraulic and similar objects. In a further scheme, because the output layer does not use a nonlinear activation function, the range of the model's predicted values is not constrained, thereby improving the performance of the model in fault prediction.
Claims (8)
1. A fault prediction method based on improved deep belief learning, characterized by comprising the following steps:
① obtaining data: obtaining structured data that reflect changes in the state of the predicted object, and a learning model;
② preprocessing data: defining a label for the structured data and acquiring label data; preprocessing the structured data through feature engineering and acquiring characteristic data;
③ obtaining a test model: setting a prediction error value and an error function, initializing a learning model and inputting characteristic data to train it; obtaining a fine error model according to the prediction error value and the trained learning model; obtaining an error function value according to the error function and the trained learning model; and obtaining the test model according to the fine error model and the error function value;
④ fault prediction: acquiring a fault threshold value through the test model, and acquiring prediction information according to the fault threshold value;
⑤ analyzing the result: counting the prediction accuracy according to the prediction information.
2. The method of claim 1, wherein the step ② is divided into the following steps:
(2.1) defining a label for the structured data, and acquiring label data;
(2.2) preprocessing the structured data;
(2.3) extracting characteristic parameters from the structured data;
(2.4) dividing the characteristic parameters to obtain a data set;
and (2.5) vectorizing and normalizing the data set to obtain characteristic data.
3. The fault prediction method based on improved deep belief learning as claimed in claim 2, characterized in that: in the step (2.4), if the data set exceeds five thousand samples, it is divided into a training set, a verification set and a test set; otherwise it is divided through k-fold cross validation.
4. The method of claim 3, wherein the step ③ is divided into the following steps:
(3.1) setting a prediction error value and initializing a learning model;
(3.2) carrying out greedy unsupervised pre-learning layer by layer on the learning model according to the characteristic data to obtain a coarse error model;
(3.3) obtaining a fine error model through the coarse error model and the label data;
(3.4) acquiring the output of the learning model through the characteristic data and the fine error model; (3.5) acquiring an error function value through the output of the learning model and the label data, and finishing learning if the error function value is smaller than the set prediction error value; otherwise, the learning model corrects the learning model based on an error back propagation method until the error function value is minimum;
and (3.6) obtaining a test model according to the fine error model and the minimum error function value.
5. The fault prediction method based on improved deep belief learning as claimed in claim 1, wherein in step ①, the structured data are data in which every attribute has the same format.
6. The fault prediction method based on improved deep belief learning as claimed in claim 1, wherein in step ②, the label represents an expression of the state of the predicted object reflected by the structured data.
7. The fault prediction method based on improved deep belief learning as claimed in claim 4, characterized by: in the step (3.3), a matrix equation is solved jointly through the coarse error model and the label data, and the solution of the matrix equation becomes the hyper-parameters of the fine error model.
8. The improved deep belief learning-based failure prediction method of claim 7, characterized by: the matrix equation is:
Hβ=Y;
H represents the output matrix of the coarse error model, Y represents the label data matrix, and β is the minimum-norm least-squares solution.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911156833.6A CN111105074A (en) | 2019-11-22 | 2019-11-22 | Fault prediction method based on improved deep belief learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111105074A true CN111105074A (en) | 2020-05-05 |
Family
ID=70420668
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911156833.6A Pending CN111105074A (en) | 2019-11-22 | 2019-11-22 | Fault prediction method based on improved deep belief learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111105074A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113568305A (en) * | 2021-06-10 | 2021-10-29 | 贵州恰到科技有限公司 | Control method of deep reinforcement learning model robot |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101126929A (en) * | 2007-09-05 | 2008-02-20 | 东北大学 | Continuous miner remote real-time failure forecast and diagnosis method and device |
CN101872165A (en) * | 2010-06-13 | 2010-10-27 | 西安交通大学 | Method for fault diagnosis of wind turbines on basis of genetic neural network |
US20140324747A1 (en) * | 2013-04-30 | 2014-10-30 | Raytheon Company | Artificial continuously recombinant neural fiber network |
CN104835103A (en) * | 2015-05-11 | 2015-08-12 | 大连理工大学 | Mobile network health evaluation method based on neural network and fuzzy comprehensive evaluation |
CN106650919A (en) * | 2016-12-23 | 2017-05-10 | 国家电网公司信息通信分公司 | Information system fault diagnosis method and device based on convolutional neural network |
CN106952028A (en) * | 2017-03-13 | 2017-07-14 | 杭州安脉盛智能技术有限公司 | Dynamoelectric equipment failure is examined and health control method and system in advance |
CN108256556A (en) * | 2017-12-22 | 2018-07-06 | 上海电机学院 | Wind-driven generator group wheel box method for diagnosing faults based on depth belief network |
CN108519768A (en) * | 2018-03-26 | 2018-09-11 | 华中科技大学 | A kind of method for diagnosing faults analyzed based on deep learning and signal |
US20180373233A1 (en) * | 2017-06-27 | 2018-12-27 | Fanuc Corporation | Failure predicting apparatus and machine learning device |
WO2019054949A1 (en) * | 2017-09-15 | 2019-03-21 | Smartclean Technologies, Pte. Ltd. | System and method for predictive cleaning |
Non-Patent Citations (2)
Title |
---|
Zhang Guohui: "Time Series Prediction Method Based on Deep Belief Network and Its Application Research" *
Wang Shuangyuan: "Research on Key Technologies of Wind Turbine Health State Monitoring and Evaluation" *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112149316B (en) | Aero-engine residual life prediction method based on improved CNN model | |
CN110610035B (en) | Rolling bearing residual life prediction method based on GRU neural network | |
CN110175386B (en) | Method for predicting temperature of electrical equipment of transformer substation | |
CN108875771B (en) | Fault classification model and method based on sparse Gaussian Bernoulli limited Boltzmann machine and recurrent neural network | |
CN108960303B (en) | Unmanned aerial vehicle flight data anomaly detection method based on LSTM | |
CN113884290B (en) | Voltage regulator fault diagnosis method based on self-training semi-supervised generation countermeasure network | |
CN111144548B (en) | Method and device for identifying working condition of oil pumping well | |
CN113743016B (en) | Engine residual life prediction method based on self-encoder and echo state network | |
CN108875772B (en) | Fault classification model and method based on stacked sparse Gaussian Bernoulli limited Boltzmann machine and reinforcement learning | |
CN115659797B (en) | Self-learning method for generating anti-multi-head attention neural network aiming at aeroengine data reconstruction | |
CN114004346B (en) | Soft measurement modeling method based on gating stacking isomorphic self-encoder and storage medium | |
CN112633493A (en) | Fault diagnosis method and system for industrial equipment data | |
CN105606914A (en) | IWO-ELM-based Aviation power converter fault diagnosis method | |
CN110956309A (en) | Flow activity prediction method based on CRF and LSTM | |
CN114399032A (en) | Method and system for predicting metering error of electric energy meter | |
CN115290326A (en) | Rolling bearing fault intelligent diagnosis method | |
CN116303786B (en) | Block chain financial big data management system based on multidimensional data fusion algorithm | |
CN113988177A (en) | Water quality sensor abnormal data detection and fault diagnosis method | |
CN115345222A (en) | Fault classification method based on TimeGAN model | |
CN111105074A (en) | Fault prediction method based on improved deep belief learning | |
CN108665001B (en) | Cross-tested idle state detection method based on deep belief network | |
CN112232570A (en) | Forward active total electric quantity prediction method and device and readable storage medium | |
CN114898164B (en) | Neural network image classifier confidence calibration method and system | |
CN117371321A (en) | Internal plasticity depth echo state network soft measurement modeling method based on Bayesian optimization | |
CN117785522A (en) | Method and system for performing root cause analysis using a trained machine learning model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200505 |