CN113033079B - Chemical fault diagnosis method based on unbalance correction convolutional neural network
- Publication number
- CN113033079B (application CN202110248735.6A)
- Authority
- CN
- China
- Prior art keywords
- sample
- fault
- new
- data
- samples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F30/27 — Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
- G06F18/213 — Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/22 — Matching criteria, e.g. proximity measures
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06F2119/08 — Thermal analysis or thermal optimisation
Abstract
The invention provides a chemical fault diagnosis method based on an imbalance-corrected convolutional neural network, which comprises the following steps. S1: preprocess the TE process data; S2: synthesize samples; S3: reduce the dimension of the data; S4: construct a CNN incremental learning network. The advantage of the invention is that the II-CNN framework can synthesize samples for imbalanced data while considering the importance of boundary samples, so that the synthesized samples are more representative; on this basis, the data dimension is reduced, simplifying the complex learning process; finally, when new fault types arrive, incremental learning is adopted to update the structure and parameters of the CNN network. The method is superior to existing static-model methods and shows remarkable robustness and reliability in chemical fault diagnosis.
Description
Technical Field
The invention belongs to the field of chemical industry, and particularly relates to an unbalance correction convolutional neural network incremental learning method for chemical industry fault diagnosis.
Background
Fault diagnosis of chemical processes is one of the most important procedures in a process control system and is essential for ensuring the successful operation of the chemical process and improving its safety. A fault diagnosis model aims to detect abnormal states of the production process, find the root causes of faults, help make reliable decisions, and eliminate system faults. Using data acquired from many sensors, the fault diagnosis model converts historical data into process information to judge whether a fault has occurred, thereby ensuring the safe, efficient, and economical operation of complex chemical processes.
Intelligent fault diagnosis methods based on machine learning and deep learning have been studied extensively. However, most of these methods suffer from the following drawbacks: 1) they assume that the data samples in different failure modes are balanced, but this assumption does not always hold in actual chemical processes; data imbalance causes the classifier to pay less attention to minority faults, so it cannot learn complete class knowledge and its classification accuracy drops; 2) one or more new fault types may appear as production proceeds in an actual industrial process, and these models all require a complete retraining process when new fault categories arrive.
Therefore, a new and effective fault diagnosis framework is needed to address the problems of data sample imbalance and model updating in complex chemical processes.
Disclosure of Invention
The invention aims to provide a fault diagnosis framework for chemical engineering based on an imbalance-corrected convolutional neural network, which makes full use of multiple methods, reduces the influence of data sample imbalance, automatically updates the network structure and parameters, and improves the robustness of the fault diagnosis model.
In order to achieve the above object, the present invention provides a chemical fault diagnosis method based on an imbalance-corrected convolutional neural network, comprising the following steps:
S1: TE process data preprocessing, i.e., handling discrete (outlier) values and normalizing the data;
S2: generating samples for, and extracting information from, the imbalanced data;
S3: performing data dimension reduction and extracting the key feature variables of faults;
S4: constructing a CNN incremental learning network.
Further, the step S1 includes,
the normalization of the TE process data sample set X is calculated with the following formula:

$$\hat{x}_{ik} = \frac{x_{ik} - x_{i,\min}}{x_{i,\max} - x_{i,\min}}$$

where x_ik is the kth sample value of the ith input variable before normalization, M represents the number of input variables, and N represents the number of training samples;
x̂_ik is the kth sample value of the ith input variable after normalization;
x_i,min = min{x_ik | 1 ≤ k ≤ N};
x_i,max = max{x_ik | 1 ≤ k ≤ N}.
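A minimal NumPy sketch of this per-variable min–max normalization (the function and variable names are illustrative, not from the patent):

```python
import numpy as np

def normalize_te(X: np.ndarray) -> np.ndarray:
    """Min-max normalize an (N samples x M variables) TE data matrix
    so that every input variable lies in [0, 1]."""
    x_min = X.min(axis=0)   # x_{i,min} = minimum over the N training samples
    x_max = X.max(axis=0)   # x_{i,max} = maximum over the N training samples
    return (X - x_min) / (x_max - x_min + 1e-12)  # epsilon guards constant variables
```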
Further, the step S2 includes,
Input: D, the original sample set; k, the number of nearest-neighbor samples; n, the number of samples in D;
Output: T, the minority fault mode dataset;
S21: create a minority dataset T_i for each minority fault type i;
S22: calculate the Euclidean distance d(x_i, y_j) = ||x_i − y_j||_2 between each minority sample x_i in D and each sample y_j, where i and j are sample indices;
S23: obtain the k-nearest-neighbor set of x_i;
S24: suppose k′_i (0 ≤ k′_i ≤ k) of these neighbors belong to majority fault modes;
S25: if k/2 ≤ k′_i ≤ k, then x_i is a borderline sample;
S26: let {x′_1, x′_2, ..., x′_m} be the borderline samples, where m is the number of borderline samples;
S27: assign a weight w_i to each borderline sample; the weight determines the application frequency of the borderline sample in the data generation process and is calculated from the distances to its nearest-neighbor samples z_j in the majority fault modes;
S28: generate a synthetic sample according to the formula x_new = x′ + α × (x′ − x), where α is a random number in the range [0, 1];
S29: combine the synthetic samples with the original samples to form a new minority fault mode dataset T′;
S210: complete undersampling with Tomek links, deleting the majority fault samples in each Tomek link pair;
S211: obtain the new minority fault mode dataset T. A code sketch of this synthesis procedure follows the list.
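The following is a minimal NumPy sketch of steps S22–S210. It assumes Euclidean distance, binary minority/majority labels, and an inverse-distance weight for the borderline samples, since the patent's exact weight formula is carried by a figure and not reproduced in the text; all names are illustrative:

```python
import numpy as np

def synthesize_minority(D_min, D_maj, k=5, n_new=100, rng=None):
    """Steps S22-S29: find borderline minority samples and generate
    synthetic samples x_new = x' + alpha * (x' - x)."""
    rng = np.random.default_rng(rng)
    X = np.vstack([D_min, D_maj])
    labels = np.array([0] * len(D_min) + [1] * len(D_maj))  # 0: minority, 1: majority

    borderline, weights = [], []
    for x in D_min:
        d = np.linalg.norm(X - x, axis=1)          # S22: Euclidean distances
        nn = np.argsort(d)[1:k + 1]                # S23: k nearest neighbors (skip self)
        k_prime = int(np.sum(labels[nn] == 1))     # S24: majority-class neighbors
        if k / 2 <= k_prime <= k:                  # S25: borderline condition
            borderline.append(x)                   # S26
            maj_d = d[nn][labels[nn] == 1]
            weights.append(1.0 / (maj_d.mean() + 1e-12))  # S27: assumed weight

    borderline = np.asarray(borderline)
    p = np.asarray(weights) / np.sum(weights)      # weights set application frequency
    synth = []
    for _ in range(n_new):
        xp = borderline[rng.choice(len(borderline), p=p)]  # weighted borderline pick
        x = D_min[rng.integers(len(D_min))]        # a minority sample
        alpha = rng.random()                       # alpha in [0, 1]
        synth.append(xp + alpha * (xp - x))        # S28
    return np.vstack([D_min, synth])               # S29: merged minority set

def remove_tomek_majority(X, y):
    """S210: Tomek links are mutual nearest neighbors with different
    labels; delete the majority member (label 1) of each pair."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = dist.argmin(axis=1)
    keep = np.ones(len(X), dtype=bool)
    for i in range(len(X)):
        j = nn[i]
        if nn[j] == i and y[i] != y[j]:            # mutual NN, different classes
            keep[i if y[i] == 1 else j] = False
    return X[keep], y[keep]
```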
Further, the step S3 includes,
Input: {(x_i, y_i)}_{i=1}^N, the training dataset; Ite, the number of iterations; θ, the tolerance; δ(0), the learning rate; N, the number of training samples.
Output: w, the feature weight vector.
S31: initialize the weight vector w(0), where I represents the dimension of the samples;
S32: randomly select one sample x;
S33: set e(t−1) = 0 and δ(t) = δ(t−1)/t, where t represents the iteration number;
S34: calculate α_i and β_i by solving the local-hyperplane reconstruction problem given in the detailed description, where H is the sample matrix and λ is the regularization factor;
S35: update e(t−1) with the following formula:
e(t−1) = e(t−1) + (|x − x_LH(NH)| − |x − x_LH(NM)|);
S36: repeat steps S34 and S35 for i = 1, ..., N;
S37: update e(t−1) with e(t−1) = e(t−1)/N;
S38: calculate z(t−1) = w(t−1) ⊙ e(t−1);
S39: update w(t) by a gradient-ascent step;
S310: judge whether the condition ||w(t) − w(t−1)|| ≤ θ is satisfied: if so, proceed to the next step; if not, loop S32 to S39;
S312: repeat steps S32 to S310 for i = 1, ..., N;
S313: obtain the weight vector w. A simplified code sketch of this weighting loop follows the list.
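A simplified sketch of steps S31–S313, assuming single nearest-hit/nearest-miss samples in place of the patent's local-hyperplane reconstruction points and the conventional Relief margin order (miss minus hit); names are illustrative:

```python
import numpy as np

def feature_weights(X, y, n_iter=50, delta0=0.5, theta=1e-4, seed=None):
    """Iterative margin-based feature weighting (simplified S31-S313)."""
    rng = np.random.default_rng(seed)
    N, I = X.shape
    w = np.full(I, 1.0 / np.sqrt(I))          # S31: uniform init with ||w||_2 = 1
    for t in range(1, n_iter + 1):            # t = 1, ..., Ite
        delta = delta0 / t                    # S33: decaying learning rate
        e = np.zeros(I)
        for _ in range(N):                    # S36: N random draws
            i = rng.integers(N)               # S32: random sample
            d = np.abs(X - X[i]) @ w          # weighted L1 distances
            d[i] = np.inf
            hit = int(np.argmin(np.where(y == y[i], d, np.inf)))   # nearest hit
            miss = int(np.argmin(np.where(y != y[i], d, np.inf)))  # nearest miss
            e += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])    # S35: margin
        e /= N                                # S37: average over samples
        z = w * e                             # S38: z(t-1) = w(t-1) (.) e(t-1)
        w_new = np.clip(w + delta * z, 0.0, None)   # S39: gradient ascent, w >= 0
        w_new /= np.linalg.norm(w_new) + 1e-12      # re-impose ||w||_2 = 1
        if np.linalg.norm(w_new - w) <= theta:      # S310: convergence test
            return w_new
        w = w_new
    return w                                  # S313: final feature weights
```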
Further, the step S4 includes,
Input: x, the new sample; N, the number of training samples; T, a threshold.
Output: W_1 and W_2, the first and second matching degrees;
S41: calculate the matching degree between x and each training sample s_i;
S42: repeat step S41 for i = 1, ..., N;
S43: obtain W_1 and W_2;
S44: if ((W_1 > W_2 > T) || (W_1 > T > W_2)), then
x and s_1 belong to the same category, and x is added to the training dataset;
S45: if (T > W_1 > W_2), then
x is a new sample belonging to a new class: x is added to the training dataset of the new class, a new layer is added to the trained CNN, the new parameters are randomly initialized, and the new layer is trained gradually. A code sketch of this matching-based decision follows the list.
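A minimal sketch of the S41–S45 decision, assuming nonnegative (min–max normalized) features and the mean feature-wise min/max ratio as the matching degree; names are illustrative:

```python
import numpy as np

def matching_degree(x, s):
    """Per-sample matching degree: mean feature-wise min/max ratio, a
    value in [0, 1] for nonnegative (e.g. min-max normalized) features."""
    lo, hi = np.minimum(x, s), np.maximum(x, s)
    return float(np.mean(lo / (hi + 1e-12)))

def diagnose_or_grow(x, X_train, y_train, T=0.8):
    """Compare the two best matching degrees with threshold T to decide
    between an existing fault class and a new class (S41-S45)."""
    W = np.array([matching_degree(x, s) for s in X_train])  # S41-S42
    best = np.argsort(W)[::-1]
    W1, W2 = W[best[0]], W[best[1]]                         # S43
    if W1 > W2 > T or W1 > T > W2:                          # S44: existing class
        return y_train[best[0]]
    if T > W1 > W2:                                         # S45: open a new class and
        return "new_class"                                  # grow the CNN by a new layer
    return y_train[best[0]]                                 # tie cases default to best match
```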
Further, in the step S1,
the TE process has 5 main units, including a chemical reactor, a recycle compressor, a condenser, a stripper, and a vapor/liquid separator; the TE simulator generates 22 different types of state data, including 21 standard fault types and normal state data;
the 21 fault state types of the TE process are as follows:
Fault 1: A/C feed ratio, B composition constant;
Fault 2: B composition, A/C ratio constant;
Fault 3: D feed temperature;
Fault 4: reactor cooling water inlet temperature;
Fault 5: condenser cooling water inlet temperature;
Fault 6: A feed loss;
Fault 7: C header pressure loss;
Fault 8: A, B, C feed composition;
Fault 9: D feed temperature;
Fault 10: C feed temperature;
Fault 11: reactor cooling water inlet temperature;
Fault 12: condenser cooling water inlet temperature;
Fault 13: reaction kinetics;
Fault 14: reactor cooling water valve;
Fault 15: condenser cooling water valve;
Faults 16-20: unknown type;
Fault 21: valve in stream 4.
Here A, C, and D denote three different gaseous reactants, B denotes an inert component, and during the TE process the reactants and the inert component are fed into the reactor; stream 4 refers to the corresponding valve position.
Further, the step S3 includes,
Given a training sample set {(x_i, y_i)}_{i=1}^N, where x_i ∈ R^I and y_i ∈ {−1, +1}, I and N are the dimension and the number of the training samples, respectively, and R^I represents the sample space; on a local hyperplane, x_i is represented by its local-hyperplane point LH(x_i).
Let the local-hyperplane representation of x_i be WHα, where H ∈ R^{I×k} is the sample matrix whose columns are the k nearest-neighbor samples of x_i, W is a diagonal matrix whose diagonal element w_ii represents the weight of the ith feature, and each element of α ∈ R^k is a reconstruction coefficient of a nearest-neighbor sample. The optimization problem is to maximize the expected margin
s.t. ||w||_2 = 1, w_j ≥ 0, j = 1, ..., I,
where H_i ∈ R^{I×k} is the matrix of the k homogeneous (same-class) neighbors of x_i, M_i ∈ R^{I×k} is the matrix of the k heterogeneous (different-class) neighbors of x_i, α_i and β_i are the reconstruction coefficients of the nearest samples, from the same class and from the opposite class respectively, and w represents the weight margin vector.
w(t) represents the weights of the weighted feature space at the tth iteration, z(t) represents the expected boundary vector of the tth iteration, and the objective function is maximized subject to
z(t−1) = w(t−1) ⊙ e(t−1),
||w(t)||_2 = 1, w_j(t) ≥ 0, j = 1, ..., I, t = 1, ..., Ite,
where e(t) is the expected margin vector of the original space at the tth iteration, Ite is the maximum number of iterations, and ⊙ denotes the element-wise product.
The nearest neighbors of a given sample x_i are represented by points on the local hyperplane, and the final weight vector is obtained by maximizing the margin between the given sample and the local hyperplane. Thus, the hit point x_LH(NH) and the miss point x_LH(NM) can be expressed through α_i ∈ R^k, the reconstruction coefficient vector of the homogeneous neighbors, and β_i ∈ R^k, the reconstruction coefficient vector of the heterogeneous neighbors; α_i and β_i are obtained by solving the regularized optimization problem, where ||·||_2 is the 2-norm and λ is a regularization factor.
If t = 0, the feature weights are initialized uniformly (w_j(0) = 1/√I, j = 1, ..., I, so that ||w(0)||_2 = 1). At the (t−1)th iteration, α_i and β_i are obtained for each sample (i = 1, ..., N); the feature weight factor w(t) is then updated by a gradient-ascent step with learning rate δ, where δ(t) = δ/t, t = 1, 2, ..., Ite, and 0 < δ(t) ≤ 1.
Given a training sample set {(x_i, y_i)}_{i=1}^N, where y_i ∈ {1, 2, ..., C}, I is the dimension of the training samples, N is the number of training samples, and C is the number of categories, e(t−1) of the (t−1)th iteration is defined as a class-prior-weighted combination, where P(c) is the prior probability of class c, and the hit and miss reconstruction points of x_i in class c are its within-class and between-class local-hyperplane points.
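Written out under the assumption that the misses of the other classes are weighted by their prior probabilities, a standard multi-class Relief-style form of this definition is (an assumed reconstruction, since the patent's formula is carried by a figure):

```latex
e(t-1) \;=\; \frac{1}{N}\sum_{i=1}^{N}\Bigl(
    \sum_{c \neq y_i}\frac{P(c)}{1-P(y_i)}\,
    \bigl\lvert x_i - x^{c}_{LH(NM)}\bigr\rvert
    \;-\; \bigl\lvert x_i - x_{LH(NH)}\bigr\rvert \Bigr)
```

where $x^{c}_{LH(NM)}$ is the miss reconstruction point of $x_i$ in class $c$ and $x_{LH(NH)}$ is its within-class hit reconstruction point.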
Further, the step S4 includes,
using the matching degree to measure the similarity between two samples. Let the best matching degree between a new sample x and its best-matching sample s_1 be W_s1, and the matching degree between x and the second-best matching sample s_2 be W_s2. The matching degree is defined as

W(x, s) = (1/m) Σ_{i=1}^{m} min(f(x_i), f(s_i)) / max(f(x_i), f(s_i)),

where m is the feature dimension; f(x_i) and f(s_i) are the ith features of samples x and s, and min(f(x_i), f(s_i)) and max(f(x_i), f(s_i)) are the smaller and the larger of f(x_i) and f(s_i). Each ratio represents the similarity of x and s in one feature, and W(x, s) takes a value between 0 and 1; the closer the matching degree is to 1, the higher the similarity between the two samples.
After W_s1 and W_s2 are obtained, they are compared with the matching degree threshold T. If W_s1 > W_s2 > T or W_s1 > T > W_s2, the new sample x and the best-matching sample s_1 belong to the same category; if T > W_s1, x belongs to a new class, becomes the initial sample of the new class, and inter-class incremental learning is realized.
The invention has the beneficial effects that the importance of the boundary sample is considered, so that the synthesized sample is more representative; on the basis, the dimension of the data is reduced, and the complex learning process is simplified; finally, for the arrival of new fault types, adopting incremental learning to update the structure and parameters of the CNN network. The method is superior to the existing static model method, and has remarkable robustness and reliability in the diagnosis of chemical faults.
Drawings
FIG. 1 shows a TE process block diagram;
FIG. 2 illustrates a block diagram of a chemical industry fault diagnosis-oriented imbalance correction convolutional neural network in accordance with one embodiment of the present invention;
FIG. 3 shows a II-CNN framework proposed by the present invention;
FIG. 4 shows a frame diagram of a data dimension reduction algorithm proposed by the present invention;
FIG. 5 illustrates a framework of an incremental hierarchical model in accordance with the present invention;
FIG. 6 shows the results on two classes of few faults using the method of the present invention, with both graphs (a) and (b) being sensitivity index curves;
FIG. 7 shows the results of class 8 faults at each iteration using the method of the present invention, with (a) being a sensitivity index curve and (b) being a g-mean curve;
FIG. 8 shows the results of class 13 faults at each iteration of the method of the present invention, with (a) being a sensitivity index curve and (b) being a g-mean curve;
FIG. 9 shows accuracy curves for experiments comparing 7 different methods: graph (a) compares the results for different numbers of samples per fault; graph (b) compares the results for different numbers of fault types.
Detailed Description
As shown in fig. 1, the present invention provides a chemical fault diagnosis method based on an unbalance correction convolutional neural network, comprising the steps of,
S1: TE process data preprocessing, i.e., handling discrete (outlier) values and normalizing the data;
S2: after data preprocessing, generating samples for, and extracting valuable information from, the imbalanced data;
S3: after the synthetic samples are obtained, performing data dimension reduction and extracting the key feature variables of faults;
S4: constructing a CNN incremental learning network.
Further, the step S1 includes,
the normalization of the TE process data sample set X is calculated with the following formula:

$$\hat{x}_{ik} = \frac{x_{ik} - x_{i,\min}}{x_{i,\max} - x_{i,\min}}$$

where x_ik is the kth sample value of the ith input variable before normalization;
x̂_ik is the kth sample value of the ith input variable after normalization;
x_i,min = min{x_ik | 1 ≤ k ≤ N};
x_i,max = max{x_ik | 1 ≤ k ≤ N}.
Further, the step S2 includes,
Input: D, the original sample set; k, the number of nearest-neighbor samples; n, the number of samples in D;
Output: T, the minority fault mode dataset;
S21: create a minority dataset T_i for each minority fault type i;
S22: calculate the Euclidean distance d(x_i, y_j) = ||x_i − y_j||_2 between each minority sample x_i in D and each sample y_j;
S23: obtain the k-nearest-neighbor set of x_i;
S24: suppose k′_i (0 ≤ k′_i ≤ k) of these neighbors belong to majority fault modes;
S25: if k/2 ≤ k′_i ≤ k, then x_i is a borderline sample;
S26: let {x′_1, x′_2, ..., x′_m} be the borderline samples, where m is the number of borderline samples;
S27: assign a weight w_i to each borderline sample; the weights determine the application frequency of the borderline samples in the data generation process and are calculated from the distances to the nearest-neighbor samples z_j of the majority fault modes;
S28: generate a synthetic sample through SMOTE according to the formula x_new = x′ + α × (x′ − x), where α is a random number in the range [0, 1];
S29: combine the synthetic samples with the original samples to form a new minority fault mode dataset T′;
S210: complete undersampling with Tomek links, deleting the majority fault samples in each Tomek link pair;
S211: obtain the new minority fault mode dataset T;
Further, the step S3 includes,
Input: {(x_i, y_i)}_{i=1}^N, the training dataset; Ite, the number of iterations; θ, the tolerance; δ(0), the learning rate.
Output: w, the feature weight vector.
S31: initialize the weight vector w(0);
S32: randomly select one sample x;
S33: set e(t−1) = 0 and δ(t) = δ(t−1)/t;
S34: calculate α_i and β_i by solving the local-hyperplane reconstruction problem;
S35: calculate e(t−1) = e(t−1) + (|x − x_LH(NH)| − |x − x_LH(NM)|);
S36: repeat steps S34 and S35 for i = 1, ..., N;
S37: calculate the average value e(t−1) = e(t−1)/N;
S38: calculate z(t−1) = w(t−1) ⊙ e(t−1);
S39: update w(t) by a gradient-ascent step;
S310: judge whether the condition ||w(t) − w(t−1)|| ≤ θ is satisfied: if so, proceed to the next step; if not, loop S32 to S39;
S312: repeat steps S32 to S310 for i = 1, ..., N;
S313: obtain the weight vector w.
Further, the step S4 includes,
Input: x, the new sample; N, the number of training samples; T, a threshold.
Output: W_1 and W_2, the first and second matching degrees;
S41: calculate the matching degree between x and each sample s_i;
S42: repeat step S41 for i = 1, ..., N;
S43: obtain W_1 and W_2;
S44: if ((W_1 > W_2 > T) || (W_1 > T > W_2)), then
x and s_1 belong to the same category, and x is added to the training dataset;
S45: if (T > W_1 > W_2), then
x is a new sample belonging to a new class: x is added to the training dataset of the new class, a new layer is added to the trained CNN, the new parameters are randomly initialized, and the new layer is trained gradually.
Further, the step S1 includes,
the TE process has 5 main units, including a chemical reactor, a recycle compressor, a condenser, a stripper, and a vapor/liquid separator; the variables of the TE process include 12 inputs and 41 outputs; the TE simulator generates 22 different types of status data, including 21 standard fault types and normal status data;
the 21 fault state types of the TE process are as follows:
Fault 1: A/C feed ratio, B composition constant;
Fault 2: B composition, A/C ratio constant;
Fault 3: D feed temperature;
Fault 4: reactor cooling water inlet temperature;
Fault 5: condenser cooling water inlet temperature;
Fault 6: A feed loss;
Fault 7: C header pressure loss;
Fault 8: A, B, C feed composition;
Fault 9: D feed temperature;
Fault 10: C feed temperature;
Fault 11: reactor cooling water inlet temperature;
Fault 12: condenser cooling water inlet temperature;
Fault 13: reaction kinetics;
Fault 14: reactor cooling water valve;
Fault 15: condenser cooling water valve;
Faults 16-20: unknown type;
Fault 21: valve in stream 4.
Here A, C, and D denote three different gaseous reactants, B denotes an inert component, and during the TE process the reactants and the inert component are fed into the reactor; stream 4 refers to the corresponding valve position.
Further, the step S3 includes,
Given a training sample set {(x_i, y_i)}_{i=1}^N, where x_i ∈ R^I and y_i ∈ Y = {−1, +1}, I and N are the dimension and the number of the training samples, respectively. On a local hyperplane, x_i can be represented by its local-hyperplane point. Each feature is assigned an appropriate weight; the greater the weight, the more important the feature.
A weight is assigned to each feature by maximizing the expected margin. Let the local-hyperplane representation of x_i be WHα, where H ∈ R^{I×k} is the sample matrix whose columns are the k nearest-neighbor samples of x_i, W is a diagonal matrix whose diagonal element w_ii represents the weight of the ith feature, and each element of α ∈ R^k is a reconstruction coefficient of a nearest-neighbor sample. The optimization problem is expressed as maximizing the expected margin
s.t. ||w||_2 = 1, w_j ≥ 0, j = 1, ..., I,
where H_i ∈ R^{I×k} is the matrix of the k homogeneous (same-class) neighbors of x_i, M_i ∈ R^{I×k} is the matrix of the k heterogeneous (different-class) neighbors of x_i, α_i and β_i are the reconstruction coefficients of the nearest samples, from the same class and from the opposite class respectively, and w represents the weight margin vector.
w(t) and z(t) represent the weights of the weighted feature space and the expected boundary vector of the tth iteration, respectively. The objective function is maximized subject to
z(t−1) = w(t−1) ⊙ e(t−1),
||w(t)||_2 = 1, w_j(t) ≥ 0, j = 1, ..., I, t = 1, ..., Ite,
where e(t) is the expected margin vector of the original space at the tth iteration, Ite is the maximum number of iterations, and ⊙ denotes the element-wise product.
The nearest neighbors of a given sample x_i are represented by points on the local hyperplane. The final weight vector may be obtained by maximizing the margin between the given sample and the local hyperplane. Thus, x_LH(NH) and x_LH(NM) can be expressed through α_i ∈ R^k and β_i ∈ R^k, the reconstruction coefficient vectors of the homogeneous and heterogeneous neighbors, respectively. α_i and β_i are obtained by solving the regularized optimization problem, where ||·||_2 is the 2-norm and λ is the regularization factor.
If t = 0, LHD-Relief initializes the feature weights. At the (t−1)th iteration, α_i and β_i are obtained for each sample (i = 1, ..., N). The feature weight factor w(t) is then updated by a gradient-ascent step with learning rate δ (δ(t) = δ/t, t = 1, 2, ..., Ite, 0 < δ(t) ≤ 1).
Given a training sample set {(x_i, y_i)}_{i=1}^N, where y_i ∈ Y = {1, 2, ..., C}, the samples have dimension I and number N. Thus, e(t−1) of the (t−1)th iteration is defined as a class-prior-weighted combination, where P(c) is the prior probability of class c, and the hit and miss reconstruction points of x_i in class c are its within-class and between-class local-hyperplane points.
Further, the step S4 includes,
The matching degree is used to measure the similarity between two samples. Let the best matching degree between a new sample x and the best-matching sample s_1 be W_s1, and the matching degree between x and the second-best matching sample s_2 be W_s2. The matching degree is defined as
W(x, s) = (1/m) Σ_{i=1}^{m} min(f(x_i), f(s_i)) / max(f(x_i), f(s_i)),
where m is the feature dimension and f(x_i) and f(s_i) are the ith features of x and s; min(f(x_i), f(s_i)) and max(f(x_i), f(s_i)) are the smaller and the larger of f(x_i) and f(s_i). W(x, s) represents the similarity of x and s and takes a value between 0 and 1; the closer the matching degree is to 1, the higher the similarity between the two samples.
After W_s1 and W_s2 are obtained, their values are compared with the matching degree threshold T. If W_s1 > W_s2 > T or W_s1 > T > W_s2, the new sample x and the best-matching sample s_1 belong to the same category. If T > W_s1, x belongs to a new class, becomes the initial sample of the new class, and inter-class incremental learning is realized.
To diagnose new faults, new classifications are automatically added to the existing network. The new layers inherit the topology and learned knowledge of the trained CNN, so the network can update itself to include new fault classes without a complete retraining process. These layers are not trained from scratch; they are initialized by copying the parameters of the old layers and then trained step by step. Samples belonging to the new class are applied to the modified CNN, and the corresponding new layer is trained incrementally.
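A minimal PyTorch-style sketch of this update, assuming the trained CNN ends in a single nn.Linear classification head; the incremental hierarchy of FIG. 5 is not reproduced here, and all names are illustrative:

```python
import torch
import torch.nn as nn

def add_fault_class(model: nn.Sequential, n_new: int = 1) -> nn.Sequential:
    """Widen the classification head for n_new extra fault classes,
    inheriting the trained weights for the old classes and randomly
    initializing only the rows for the new class(es)."""
    old_fc = model[-1]                            # assumed final classification layer
    n_old, d = old_fc.out_features, old_fc.in_features
    new_fc = nn.Linear(d, n_old + n_new)          # new rows start randomly initialized
    with torch.no_grad():
        new_fc.weight[:n_old] = old_fc.weight     # inherit old-class parameters
        new_fc.bias[:n_old] = old_fc.bias
    model[-1] = new_fc
    for p in model.parameters():                  # freeze the inherited feature layers
        p.requires_grad = False
    for p in new_fc.parameters():                 # train only the widened head gradually
        p.requires_grad = True
    return model
```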
The meanings of the abbreviations used in the present invention are described below.
II-CNN denotes the proposed incremental imbalance-corrected convolutional neural network.
The TE process data is used as an experimental basis, a TE process structure diagram is shown in figure 1, and the fault diagnosis is carried out by adopting the method of the invention.
(1) The TE simulator can generate 22 different types of state data, including 21 standard fault types and normal state data. All data sets are sampled in the basic mode of the TE process, with corresponding training and testing data for each of the faults described above. To test the performance of the proposed method, the experiments are divided into two cases. The first case simulates an imbalanced data stream in the chemical process, with 6 fault types selected, in order to test the diagnostic performance of the proposed method on imbalanced fault data. The second case tests the incremental learning performance of the method: 10 fault types are selected initially and then extended to 15 fault types.
(2) Comparison with other methods
The fault types are preprocessed, their outputs are used as inputs to the CNN, and all methods share the same CNN structure, so the fault diagnosis performance can be compared. The DBN performs very well among deep learning methods, so a DBN is used for comparison with the present invention. Some typical shallow fault diagnosis models are also compared, including the widely used back-propagation neural network (BPNN) and support vector machine (SVM). These comparisons demonstrate the fault diagnosis performance of deep learning methods, because the shallow methods omit the feature learning process. Here, the SVM is the scikit-learn implementation with an RBF kernel and the parameter γ = 1/df, where df is the number of features of the original data. The BPNN has 5 layers (52, 42, 32, 22, and 10 neurons per layer, respectively). To obtain the best BPNN diagnostic performance, the learning rate was set to 0.5.
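A sketch of these two baseline configurations, with scikit-learn's MLPClassifier standing in for the BPNN (an assumption) and df = 52 assumed as the feature count of the raw TE data:

```python
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

df = 52                                   # feature count of the raw TE data (assumed)
svm = SVC(kernel="rbf", gamma=1.0 / df)   # RBF kernel with gamma = 1/df, as stated

# 52-42-32-22-10 BPNN baseline: three hidden layers, output sized by the
# class count; the stated learning rate of 0.5 is passed directly.
bpnn = MLPClassifier(hidden_layer_sizes=(42, 32, 22), learning_rate_init=0.5)
```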
Example 1: diagnosis model experiment of unbalanced fault data
To evaluate the performance of the present invention, 6 faults with a specific imbalance ratio were selected for training, with faults 8 and 13 as the minority-sample fault types. As shown in fig. 6, the diagnostic advantage in the minority failure modes is evident: the present invention identifies minority faults markedly better than the other methods, with performance higher by 6.7% and 2.9%, respectively, which demonstrates the benefit of generating minority fault samples. Fig. 6 also shows that the invention is superior to the shallow models because it can efficiently extract features from the raw data and process imbalanced data in complex chemical processes. Owing to its deep architecture, the invention can effectively handle imbalanced chemical data with many variables and highly nonlinear relationships.
Figs. 7 and 8 show the performance of the invention in diagnosing minority faults. In this case, the imbalance ratio is increased to test the models above. Initially, the number of samples for both the type-8 and type-13 faults is 50, and 30 samples are added per iteration. The invention greatly improves the diagnostic performance on minority fault types: its sensitivity index and g-mean are 3.7% and 1.9% higher than those of the existing methods, respectively. The invention preserves the original features of minority faults as far as possible and provides the most meaningful diagnosis.
The experimental results show that as the number of minority-fault samples increases, after about 10 iterations of the proposed method the original knowledge of the minority faults is sufficient, and each model can effectively extract features from the raw data. As can be seen from figs. 7 and 8, the invention can effectively solve the class imbalance problem.
The sensitivity index for all fault types for the different diagnostic methods is shown in table 1. The results show that the method provided by the invention can remarkably improve the diagnosis performance of few fault types. The present invention can well solve the problem of chemical data imbalance because it attempts to learn rare types of faults from the imbalance data.
TABLE 1
Example 2: diagnostic model experiments with increased fault types
The incremental learning capabilities of the present invention for new samples and fault categories are described herein. The invention can adaptively update new faults. The number of faults is gradually increased from 10 to 15. The experimental results of the first 10 failures are shown in fig. 9 (a). It illustrates the incremental learning capabilities of the new samples. In fig. 9 (a), the x-axis represents the number of training samples for each failure category, and the y-axis represents the accuracy of the diagnostic model test samples. Here 200 samples of each fault class are used to initialize each diagnostic model. Then, for each step, 50 samples will be added for each failure category to test the incremental learning capabilities of the proposed method. In this case, SVM, BPNN, DBN, CNN would be fully trained based on the corresponding data set for comparison.
When a fault class newly occurs, the incremental learning capability of the present invention is shown in fig. 9 (b). In fig. 9 (b), the x-axis represents the number of fault categories and the y-axis represents the accuracy of the different diagnostic methods. The initial diagnostic model will be trained here to diagnose 10 faults in the TE process samples. A new fault class will then be added for each step to test the incremental learning capabilities of each diagnostic method until all 15 fault classes are imported into the diagnostic model. From fig. 9 (b), it can be seen that the present invention performs better than other methods. The diagnostic performance of the invention is superior to other methods. This is because convolution operations can effectively extract the fault trend and nonlinear characteristics of the fault process.
The comparison experiments show that the proposed II-CNN framework is more effective in chemical process fault diagnosis than both conventional methods and deep learning baselines.
While the invention has been disclosed in connection with the preferred embodiments shown and described in detail, it will be understood by those skilled in the art that various modifications may be made to the above-described method for diagnosing a chemical failure based on an imbalance correction convolutional neural network, without departing from the scope of the invention. Accordingly, the scope of the invention should be determined from the following claims.
Claims (1)
1. A chemical fault diagnosis method based on unbalance correction convolutional neural network is characterized by comprising the following steps,
S1: TE process data preprocessing, i.e., handling discrete (outlier) values and normalizing the data;
S2: generating samples for, and extracting information from, the imbalanced data;
S3: performing data dimension reduction and extracting the key feature variables of faults;
S4: constructing a CNN incremental learning network;
the step S3 of this method comprises the steps of,
Input: {(x_u, y_u)}_{u=1}^U, the training dataset; Ite, the iteration number; θ, the tolerance error; δ(0), the learning rate; U, the number of training samples;
Output: w, the feature weight vector;
S31: initialize the weight vector w(0), where M represents the dimension of the samples;
S32: randomly select one sample x_u;
S33: set e(t−1) = 0 and δ(t) = δ(t−1)/t, where t represents the iteration number;
S34: calculate α_u and β_u by solving the local-hyperplane reconstruction problem, where H is the sample matrix and λ is the regularization factor;
S35: update e(t−1) with the following formula:
e(t−1) = e(t−1) + (|x_u − x_LH(NH)| − |x_u − x_LH(NM)|);
S36: within the range u = 1:U, cyclically execute steps S34 and S35 until they have been executed U times;
S37: update e(t−1) with the following formula:
e(t−1) = e(t−1)/U;
S38: calculate z(t−1) = w(t−1) ⊙ e(t−1);
S39: update w(t) by a gradient-ascent step;
S310: judge whether the condition ||w(t) − w(t−1)|| ≤ θ is satisfied: if so, proceed to the next step; if not, loop S32 to S39;
S312: within the range u = 1:U, cyclically execute steps S32 to S310 until they have been executed U times;
S313: obtain the weight vector w;
the step S4 of said step comprises,
Input: s, the new sample; U, the number of training samples; T, a threshold;
Output: W_1 and W_2, the first and second matching degrees;
S41: calculate the matching degree between the new sample s and each sample x_u;
S42: within the range u = 1:U, cyclically execute step S41 until it has been executed U times;
S43: obtain W_1 and W_2;
S44: if ((W_1 > W_2 > T) || (W_1 > T > W_2)), then
s and x_u belong to the same category, and s is added to the training dataset;
S45: if (T > W_1 > W_2), then
s is a new sample belonging to a new class: s is added to the training dataset of the new class, a new layer is added to the trained CNN, the new parameters are randomly initialized, and the new layer is trained gradually;
in the step S1,
the TE process has 5 units, including a chemical reactor, a recycle compressor, a condenser, a stripper, and a vapor/liquid separator; the TE simulator generates 22 different types of status data, including 21 standard fault types and normal status data;
the 21 fault state types of the TE process are as follows:
Fault 1: A/C feed ratio, B composition constant;
Fault 2: B composition, A/C ratio constant;
Fault 3: D feed temperature;
Fault 4: reactor cooling water inlet temperature;
Fault 5: condenser cooling water inlet temperature;
Fault 6: A feed loss;
Fault 7: C header pressure loss;
Fault 8: A, B, C feed composition;
Fault 9: D feed temperature;
Fault 10: C feed temperature;
Fault 11: reactor cooling water inlet temperature;
Fault 12: condenser cooling water inlet temperature;
Fault 13: reaction kinetics;
Fault 14: reactor cooling water valve;
Fault 15: condenser cooling water valve;
Faults 16-20: unknown type;
Fault 21: valve in stream 4;
wherein A, C, and D denote three different gaseous reactants, B denotes an inert component, and during the TE process the reactants and the inert component are fed into the reactor; stream 4 refers to the corresponding valve position;
the step S3 of this method comprises,
given a training sample set {(x_u, y_u)}_{u=1}^U, where x_u ∈ R^M and y_u ∈ {−1, +1}, M and U are the dimension and the number of the training samples, respectively, R^M represents the sample space, and on a local hyperplane x_u is represented by its local-hyperplane point;
let the local-hyperplane representation of x_u be WHα, where H ∈ R^{M×k} is the sample matrix whose columns are the k nearest-neighbor samples of x_u, W is a diagonal matrix whose diagonal element w_mm represents the weight of the mth feature, and each element of α ∈ R^k is a reconstruction coefficient of a nearest-neighbor sample; the optimization problem is to maximize the expected margin
s.t. ||w||_2 = 1, w_mm ≥ 0, m = 1, ..., M,
where H_u ∈ R^{M×k} is the matrix of the k homogeneous (same-class) neighbors of x_u, M_u ∈ R^{M×k} is the matrix of the k heterogeneous (different-class) neighbors of x_u, α_u and β_u are the reconstruction coefficients of the nearest samples, from the same class and from the opposite class respectively, and w represents the weight margin vector;
w(t) represents the weights of the weighted feature space at the tth iteration, z(t) represents the expected boundary vector of the tth iteration, and the objective function is maximized subject to
z(t−1) = w(t−1) ⊙ e(t−1),
||w(t)||_2 = 1, w_m(t) ≥ 0, m = 1, ..., M, t = 1, ..., Ite,
where e(t) is the expected margin vector of the original space at the tth iteration, Ite is the maximum number of iterations, and ⊙ denotes the element-wise product;
the nearest neighbors of a given sample x_u are represented by points on the local hyperplane, and the final weight vector is obtained by maximizing the margin between the given sample and the local hyperplane; thus the hit point x_LH(NH) and the miss point x_LH(NM) can be expressed through α_u, the reconstruction coefficient vector of the homogeneous neighbors, and β_u, the reconstruction coefficient vector of the heterogeneous neighbors, and α_u and β_u are obtained by solving the regularized optimization problem, where ||·||_2 is the 2-norm and λ is a regularization factor;
if t = 0, the feature weights are initialized; at the (t−1)th iteration, α_u and β_u are obtained for each sample, u = 1, ..., U, and the feature weight factor w(t) is then updated by a gradient-ascent step with learning rate δ, where δ(t) = δ/t, t = 1, 2, ..., Ite, and 0 < δ(t) ≤ 1;
given a training sample set {(x_u, y_u)}_{u=1}^U, where y_u ∈ {1, 2, ..., C}, M is the dimension of the training samples, U is the number of training samples, and C represents the number of categories, e(t−1) of the (t−1)th iteration is defined as a class-prior-weighted combination, where P(c) is the prior probability of class c, and the intra-class and extra-class reconstruction points of x_u in class c are its hit and miss local-hyperplane points;
the step S4 of said step comprises,
using the matching degree to measure the similarity between two samples; let the best matching degree between a new sample s and the best-matching sample be W_1, and the matching degree between s and the second-best matching sample be W_2; the matching degree is defined as
W_g(s, x) = (1/M) Σ_{m=1}^{M} min(f(s_m), f(x_m)) / max(f(s_m), f(x_m)),
where M is the feature dimension; f(s_m) and f(x_m) are the mth features of sample s and sample x; min(f(s_m), f(x_m)) and max(f(s_m), f(x_m)) are the smaller and the larger of f(s_m) and f(x_m); and W_g represents the similarity of x and s, taking a value between 0 and 1;
after W_1 and W_2 are obtained, their values are compared with the matching degree threshold T: if W_1 > W_2 > T or W_1 > T > W_2, the new sample s and the best-matching sample belong to the same category; if T > W_1, s belongs to a new class and becomes the initial sample of the new class.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110248735.6A CN113033079B (en) | 2021-03-08 | 2021-03-08 | Chemical fault diagnosis method based on unbalance correction convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110248735.6A CN113033079B (en) | 2021-03-08 | 2021-03-08 | Chemical fault diagnosis method based on unbalance correction convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113033079A CN113033079A (en) | 2021-06-25 |
CN113033079B true CN113033079B (en) | 2023-07-18 |
Family
ID=76466690
Family Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---|
CN202110248735.6A Active CN113033079B (en) | 2021-03-08 | 2021-03-08 | Chemical fault diagnosis method based on unbalance correction convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113033079B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Title
---|---|---|---
CN114038169A (en) * | 2021-11-10 | 2022-02-11 | 英业达(重庆)有限公司 | Method, device, equipment and medium for monitoring faults of production equipment |
CN117407824B (en) * | 2023-12-14 | 2024-02-27 | 四川蜀能电科能源技术有限公司 | Health detection method, equipment and medium of power time synchronization device |
Citations (7)
Publication number | Priority date | Publication date | Title
---|---|---|---
CN107784325A (en) * | 2017-10-20 | 2018-03-09 | 河北工业大学 | Spiral fault diagnosis model based on the fusion of data-driven increment |
CN109816044A (en) * | 2019-02-11 | 2019-05-28 | 中南大学 | A kind of uneven learning method based on WGAN-GP and over-sampling |
CN110070060A (en) * | 2019-04-26 | 2019-07-30 | 天津开发区精诺瀚海数据科技有限公司 | A kind of method for diagnosing faults of bearing apparatus |
CN110244689A (en) * | 2019-06-11 | 2019-09-17 | 哈尔滨工程大学 | A kind of AUV adaptive failure diagnostic method based on identification feature learning method |
CN110334580A (en) * | 2019-05-04 | 2019-10-15 | 天津开发区精诺瀚海数据科技有限公司 | The equipment fault classification method of changeable weight combination based on integrated increment |
CN111580506A (en) * | 2020-06-03 | 2020-08-25 | 南京理工大学 | Industrial process fault diagnosis method based on information fusion |
CN112200104A (en) * | 2020-10-15 | 2021-01-08 | 重庆科技学院 | Chemical engineering fault diagnosis method based on novel Bayesian framework for enhanced principal component analysis |
Family Cites Families (1)
Publication number | Priority date | Publication date | Title
---|---|---|---
EP4446941A2 (en) * | 2017-05-23 | 2024-10-16 | INTEL Corporation | Methods and apparatus for discriminative semantic transfer and physics-inspired optimization of features in deep learning |
Non-Patent Citations (4)
Title
---
"Broad Convolutional Neural Network Based Industrial Process Fault Diagnosis With Incremental Learning Capability"; Wanke Yu et al.; IEEE Transactions on Industrial Electronics; vol. 67, no. 6; pp. 5081-5091 *
"Incremental Learning of Random Forests for Large-Scale Image Classification"; Marko Ristin et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; pp. 490-503 *
"Research on chemical fault diagnosis methods based on deep learning" (基于深度学习的化工故障诊断方法研究); Hu Zhixin; China Masters' Theses Full-text Database, no. 1; pp. 17-43, sections 2, 3, and 4 *
"A review of mechanical fault diagnosis methods based on convolutional neural networks" (基于卷积神经网络的机械故障诊断方法综述); Wu Dinghai et al.; Mechanical Strength (机械强度), no. 5; pp. 12-20 *
Also Published As
Publication number | Publication date |
---|---|
CN113033079A (en) | 2021-06-25 |
Similar Documents
Publication | Publication Date | Title
---|---|---
CN113469219B | | Rotary machine fault diagnosis method under complex working conditions based on meta transfer learning
CN113033079B | 2023-07-18 | Chemical fault diagnosis method based on unbalance correction convolutional neural network
CN106843195B | | Fault classification method based on adaptive ensemble semi-supervised Fisher discrimination
CN113052218A | | Multi-scale residual convolution and LSTM fusion performance evaluation method for industrial processes
CN112085252B | | Counterfactual prediction method for set-type decision effects
CN110297480B | | TE process fault diagnosis method of a deep belief network model based on parameter optimization
CN112784920A | | Cloud-edge-end coordinated dual adversarial domain-adaptation fault diagnosis method for rotating parts
CN111985825A | | Crystal face quality evaluation method for a roller mill orientation instrument
Sitawarin et al. | | Minimum-norm adversarial examples on KNN and KNN based models
CN113541985A | | Internet of things fault diagnosis method, model training method, and related devices
CN111709577B | | RUL prediction method based on long-range correlation GAN-LSTM
CN117349595A | | Industrial-scene-oriented fault diagnosis method for extremely few samples
Lee et al. | | Data-driven fault detection for chemical processes using autoencoder with data augmentation
Yang et al. | | A new chiller fault diagnosis method under the imbalanced data environment via combining an improved generative adversarial network with an enhanced deep extreme learning machine
CN110674893B | | Self-adaptive correction method for diagnosis experience in a rotary machine fault diagnosis knowledge base
CN111723857A | | Intelligent monitoring method and system for the running state of process production equipment
CN113688875B | | Industrial system fault identification method and device
CN115578325A | | Image anomaly detection method based on a channel attention registration network
CN109547248A | | Artificial-intelligence-based fault diagnosis method and device for on-orbit spacecraft ad hoc networks
Huang et al. | | Label propagation dictionary learning based process monitoring method for industrial process with between-mode similarity
CN115564021A | | Fault root cause ranking method in the polyester fiber polymerization process
CN113610148A | | Fault diagnosis method based on bias-weighted AdaBoost
Sun et al. | | A multi-stage semi-supervised improved deep embedded clustering (MS-SSIDEC) method for bearing fault diagnosis under the situation of insufficient labeled samples
Lu et al. | | Three-layer deep learning network random trees for fault diagnosis in chemical production process
CN114707098B | | Aeroengine performance degradation state evaluation method based on multi-source sensor state distance
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |