CN113033079B - Chemical fault diagnosis method based on unbalance correction convolutional neural network
- Publication number
- CN113033079B (application CN202110248735.6A)
- Authority
- CN
- China
- Prior art keywords
- sample
- fault
- new
- data
- samples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F30/27 — Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
- G06F18/213 — Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/22 — Matching criteria, e.g. proximity measures
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06F2119/08 — Thermal analysis or thermal optimisation
Abstract
The invention provides a chemical fault diagnosis method based on an imbalance-corrected convolutional neural network, which comprises the following steps. S1: preprocess the TE process data; S2: synthesize samples; S3: reduce the dimension of the data; S4: construct a CNN incremental learning network. The advantage of the invention is that the II-CNN framework can synthesize samples for imbalanced data while considering the importance of boundary samples, so that the synthesized samples are more representative; on this basis, the data dimension is reduced, simplifying the complex learning process; finally, when new fault types arrive, incremental learning is adopted to update the structure and parameters of the CNN network. The method is superior to existing static-model methods and shows remarkable robustness and reliability in chemical fault diagnosis.
Description
Technical Field
The invention belongs to the field of chemical industry, and particularly relates to an unbalance correction convolutional neural network incremental learning method for chemical industry fault diagnosis.
Background
Fault diagnosis of chemical processes is one of the most important procedures in a process control system and is essential for ensuring the successful operation of the chemical process and improving its safety. A fault diagnosis model aims to detect abnormal states of the production process, find the root causes of faults, help make reliable decisions, and eliminate system faults. Using data acquired from many sensors, the fault diagnosis model converts historical data into process information to judge whether a fault has occurred, thereby ensuring the safe, efficient, and economical operation of complex chemical processes.
Intelligent fault diagnosis methods based on machine learning and deep learning have been studied extensively. However, most of these methods suffer from the following drawbacks: 1) they assume that the data samples in different failure modes are balanced, but this assumption does not always hold in actual chemical processes; data imbalance causes the classifier to pay less attention to minority faults, so it cannot learn complete class knowledge and its classification accuracy drops; 2) one or more new fault types may appear as production proceeds in an actual industrial process, and these models all require a complete retraining process when new fault categories arrive.
Therefore, a new and effective fault diagnosis framework is needed to address the problems of data sample imbalance and model updating in complex chemical processes.
Disclosure of Invention
The invention aims to provide a fault diagnosis framework for chemical engineering based on an imbalance-corrected convolutional neural network, which makes full use of multiple methods, reduces the influence of data sample imbalance, automatically updates the network structure and parameters, and improves the robustness of the fault diagnosis model.
In order to achieve the above object, the present invention provides a chemical fault diagnosis method based on an imbalance-corrected convolutional neural network, comprising the following steps:
S1: TE process data preprocessing, i.e., handling discrete (outlier) values and normalizing the data;
S2: generating samples for, and extracting information from, the imbalanced data;
S3: performing data dimension reduction and extracting the key feature variables of faults;
S4: constructing a CNN incremental learning network.
Further, the step S1 includes,
the normalization of the TE process data sample set X is calculated with the following formula:

$$\hat{x}_{ik} = \frac{x_{ik} - x_{i,\min}}{x_{i,\max} - x_{i,\min}}$$

where x_ik is the kth sample value of the ith input variable before normalization, M represents the number of input variables, and N represents the number of training samples;
x̂_ik is the kth sample value of the ith input variable after normalization;
x_i,min = min{x_ik | 1 ≤ k ≤ N};
x_i,max = max{x_ik | 1 ≤ k ≤ N}.
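A minimal NumPy sketch of this per-variable min–max normalization (the function and variable names are illustrative, not from the patent):

```python
import numpy as np

def normalize_te(X: np.ndarray) -> np.ndarray:
    """Min-max normalize an (N samples x M variables) TE data matrix
    so that every input variable lies in [0, 1]."""
    x_min = X.min(axis=0)   # x_{i,min} = minimum over the N training samples
    x_max = X.max(axis=0)   # x_{i,max} = maximum over the N training samples
    return (X - x_min) / (x_max - x_min + 1e-12)  # epsilon guards constant variables
```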
Further, the step S2 includes,
Input: D, the original sample set; k, the number of nearest-neighbor samples; n, the number of samples in D;
Output: T, the minority fault mode dataset;
S21: create a minority dataset T_i for each minority fault type i;
S22: calculate the Euclidean distance d(x_i, y_j) = ||x_i − y_j||_2 between each minority sample x_i in D and each sample y_j, where i and j are sample indices;
S23: obtain the k-nearest-neighbor set of x_i;
S24: suppose k′_i (0 ≤ k′_i ≤ k) of these neighbors belong to majority fault modes;
S25: if k/2 ≤ k′_i ≤ k, then x_i is a borderline sample;
S26: let {x′_1, x′_2, ..., x′_m} be the borderline samples, where m is the number of borderline samples;
S27: assign a weight w_i to each borderline sample; the weight determines the application frequency of the borderline sample in the data generation process and is calculated from the distances to its nearest-neighbor samples z_j in the majority fault modes;
S28: generate a synthetic sample according to the formula x_new = x′ + α × (x′ − x), where α is a random number in the range [0, 1];
S29: combine the synthetic samples with the original samples to form a new minority fault mode dataset T′;
S210: complete undersampling with Tomek links, deleting the majority fault samples in each Tomek link pair;
S211: obtain the new minority fault mode dataset T. A code sketch of this synthesis procedure follows the list.
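The following is a minimal NumPy sketch of steps S22–S210. It assumes Euclidean distance, binary minority/majority labels, and an inverse-distance weight for the borderline samples, since the patent's exact weight formula is carried by a figure and not reproduced in the text; all names are illustrative:

```python
import numpy as np

def synthesize_minority(D_min, D_maj, k=5, n_new=100, rng=None):
    """Steps S22-S29: find borderline minority samples and generate
    synthetic samples x_new = x' + alpha * (x' - x)."""
    rng = np.random.default_rng(rng)
    X = np.vstack([D_min, D_maj])
    labels = np.array([0] * len(D_min) + [1] * len(D_maj))  # 0: minority, 1: majority

    borderline, weights = [], []
    for x in D_min:
        d = np.linalg.norm(X - x, axis=1)          # S22: Euclidean distances
        nn = np.argsort(d)[1:k + 1]                # S23: k nearest neighbors (skip self)
        k_prime = int(np.sum(labels[nn] == 1))     # S24: majority-class neighbors
        if k / 2 <= k_prime <= k:                  # S25: borderline condition
            borderline.append(x)                   # S26
            maj_d = d[nn][labels[nn] == 1]
            weights.append(1.0 / (maj_d.mean() + 1e-12))  # S27: assumed weight

    borderline = np.asarray(borderline)
    p = np.asarray(weights) / np.sum(weights)      # weights set application frequency
    synth = []
    for _ in range(n_new):
        xp = borderline[rng.choice(len(borderline), p=p)]  # weighted borderline pick
        x = D_min[rng.integers(len(D_min))]        # a minority sample
        alpha = rng.random()                       # alpha in [0, 1]
        synth.append(xp + alpha * (xp - x))        # S28
    return np.vstack([D_min, synth])               # S29: merged minority set

def remove_tomek_majority(X, y):
    """S210: Tomek links are mutual nearest neighbors with different
    labels; delete the majority member (label 1) of each pair."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = dist.argmin(axis=1)
    keep = np.ones(len(X), dtype=bool)
    for i in range(len(X)):
        j = nn[i]
        if nn[j] == i and y[i] != y[j]:            # mutual NN, different classes
            keep[i if y[i] == 1 else j] = False
    return X[keep], y[keep]
```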
Further, the step S3 includes,
Input: {(x_i, y_i)}_{i=1}^N, the training dataset; Ite, the number of iterations; θ, the tolerance; δ(0), the learning rate; N, the number of training samples.
Output: w, the feature weight vector.
S31: initialize the weight vector w(0), where I represents the dimension of the samples;
S32: randomly select one sample x;
S33: set e(t−1) = 0 and δ(t) = δ(t−1)/t, where t represents the iteration number;
S34: calculate α_i and β_i by solving the local-hyperplane reconstruction problem given in the detailed description, where H is the sample matrix and λ is the regularization factor;
S35: update e(t−1) with the following formula:
e(t−1) = e(t−1) + (|x − x_LH(NH)| − |x − x_LH(NM)|);
S36: repeat steps S34 and S35 for i = 1, ..., N;
S37: update e(t−1) with e(t−1) = e(t−1)/N;
S38: calculate z(t−1) = w(t−1) ⊙ e(t−1);
S39: update w(t) by a gradient-ascent step;
S310: judge whether the condition ||w(t) − w(t−1)|| ≤ θ is satisfied: if so, proceed to the next step; if not, loop S32 to S39;
S312: repeat steps S32 to S310 for i = 1, ..., N;
S313: obtain the weight vector w. A simplified code sketch of this weighting loop follows the list.
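A simplified sketch of steps S31–S313, assuming single nearest-hit/nearest-miss samples in place of the patent's local-hyperplane reconstruction points and the conventional Relief margin order (miss minus hit); names are illustrative:

```python
import numpy as np

def feature_weights(X, y, n_iter=50, delta0=0.5, theta=1e-4, seed=None):
    """Iterative margin-based feature weighting (simplified S31-S313)."""
    rng = np.random.default_rng(seed)
    N, I = X.shape
    w = np.full(I, 1.0 / np.sqrt(I))          # S31: uniform init with ||w||_2 = 1
    for t in range(1, n_iter + 1):            # t = 1, ..., Ite
        delta = delta0 / t                    # S33: decaying learning rate
        e = np.zeros(I)
        for _ in range(N):                    # S36: N random draws
            i = rng.integers(N)               # S32: random sample
            d = np.abs(X - X[i]) @ w          # weighted L1 distances
            d[i] = np.inf
            hit = int(np.argmin(np.where(y == y[i], d, np.inf)))   # nearest hit
            miss = int(np.argmin(np.where(y != y[i], d, np.inf)))  # nearest miss
            e += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])    # S35: margin
        e /= N                                # S37: average over samples
        z = w * e                             # S38: z(t-1) = w(t-1) (.) e(t-1)
        w_new = np.clip(w + delta * z, 0.0, None)   # S39: gradient ascent, w >= 0
        w_new /= np.linalg.norm(w_new) + 1e-12      # re-impose ||w||_2 = 1
        if np.linalg.norm(w_new - w) <= theta:      # S310: convergence test
            return w_new
        w = w_new
    return w                                  # S313: final feature weights
```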
Further, the step S4 includes,
Input: x, the new sample; N, the number of training samples; T, a threshold.
Output: W_1 and W_2, the first and second matching degrees;
S41: calculate the matching degree between x and each training sample s_i;
S42: repeat step S41 for i = 1, ..., N;
S43: obtain W_1 and W_2;
S44: if ((W_1 > W_2 > T) || (W_1 > T > W_2)), then
x and s_1 belong to the same category, and x is added to the training dataset;
S45: if (T > W_1 > W_2), then
x is a new sample belonging to a new class: x is added to the training dataset of the new class, a new layer is added to the trained CNN, the new parameters are randomly initialized, and the new layer is trained gradually. A code sketch of this matching-based decision follows the list.
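A minimal sketch of the S41–S45 decision, assuming nonnegative (min–max normalized) features and the mean feature-wise min/max ratio as the matching degree; names are illustrative:

```python
import numpy as np

def matching_degree(x, s):
    """Per-sample matching degree: mean feature-wise min/max ratio, a
    value in [0, 1] for nonnegative (e.g. min-max normalized) features."""
    lo, hi = np.minimum(x, s), np.maximum(x, s)
    return float(np.mean(lo / (hi + 1e-12)))

def diagnose_or_grow(x, X_train, y_train, T=0.8):
    """Compare the two best matching degrees with threshold T to decide
    between an existing fault class and a new class (S41-S45)."""
    W = np.array([matching_degree(x, s) for s in X_train])  # S41-S42
    best = np.argsort(W)[::-1]
    W1, W2 = W[best[0]], W[best[1]]                         # S43
    if W1 > W2 > T or W1 > T > W2:                          # S44: existing class
        return y_train[best[0]]
    if T > W1 > W2:                                         # S45: open a new class and
        return "new_class"                                  # grow the CNN by a new layer
    return y_train[best[0]]                                 # tie cases default to best match
```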
Further, in the step S1,
the TE process has 5 main units, including a chemical reactor, a recycle compressor, a condenser, a stripper, and a vapor/liquid separator; the TE simulator generates 22 different types of state data, including 21 standard fault types and normal state data;
the 21 fault state types of the TE process are as follows:
Fault 1: A/C feed ratio, B composition constant;
Fault 2: B composition, A/C ratio constant;
Fault 3: D feed temperature;
Fault 4: reactor cooling water inlet temperature;
Fault 5: condenser cooling water inlet temperature;
Fault 6: A feed loss;
Fault 7: C header pressure loss;
Fault 8: A, B, C feed composition;
Fault 9: D feed temperature;
Fault 10: C feed temperature;
Fault 11: reactor cooling water inlet temperature;
Fault 12: condenser cooling water inlet temperature;
Fault 13: reaction kinetics;
Fault 14: reactor cooling water valve;
Fault 15: condenser cooling water valve;
Faults 16-20: unknown type;
Fault 21: valve in stream 4.
Here A, C, and D denote three different gaseous reactants, B denotes an inert component, and during the TE process the reactants and the inert component are fed into the reactor; stream 4 refers to the corresponding valve position.
Further, the step S3 includes,
Given a training sample set {(x_i, y_i)}_{i=1}^N, where x_i ∈ R^I and y_i ∈ {−1, +1}, I and N are the dimension and the number of the training samples, respectively, and R^I represents the sample space; on a local hyperplane, x_i is represented by its local-hyperplane point LH(x_i).
Let the local-hyperplane representation of x_i be WHα, where H ∈ R^{I×k} is the sample matrix whose columns are the k nearest-neighbor samples of x_i, W is a diagonal matrix whose diagonal element w_ii represents the weight of the ith feature, and each element of α ∈ R^k is a reconstruction coefficient of a nearest-neighbor sample. The optimization problem is to maximize the expected margin
s.t. ||w||_2 = 1, w_j ≥ 0, j = 1, ..., I,
where H_i ∈ R^{I×k} is the matrix of the k homogeneous (same-class) neighbors of x_i, M_i ∈ R^{I×k} is the matrix of the k heterogeneous (different-class) neighbors of x_i, α_i and β_i are the reconstruction coefficients of the nearest samples, from the same class and from the opposite class respectively, and w represents the weight margin vector.
w(t) represents the weights of the weighted feature space at the tth iteration, z(t) represents the expected boundary vector of the tth iteration, and the objective function is maximized subject to
z(t−1) = w(t−1) ⊙ e(t−1),
||w(t)||_2 = 1, w_j(t) ≥ 0, j = 1, ..., I, t = 1, ..., Ite,
where e(t) is the expected margin vector of the original space at the tth iteration, Ite is the maximum number of iterations, and ⊙ denotes the element-wise product.
The nearest neighbors of a given sample x_i are represented by points on the local hyperplane, and the final weight vector is obtained by maximizing the margin between the given sample and the local hyperplane. Thus, the hit point x_LH(NH) and the miss point x_LH(NM) can be expressed through α_i ∈ R^k, the reconstruction coefficient vector of the homogeneous neighbors, and β_i ∈ R^k, the reconstruction coefficient vector of the heterogeneous neighbors; α_i and β_i are obtained by solving the regularized optimization problem, where ||·||_2 is the 2-norm and λ is a regularization factor.
If t = 0, the feature weights are initialized uniformly (w_j(0) = 1/√I, j = 1, ..., I, so that ||w(0)||_2 = 1). At the (t−1)th iteration, α_i and β_i are obtained for each sample (i = 1, ..., N); the feature weight factor w(t) is then updated by a gradient-ascent step with learning rate δ, where δ(t) = δ/t, t = 1, 2, ..., Ite, and 0 < δ(t) ≤ 1.
Given a training sample set {(x_i, y_i)}_{i=1}^N, where y_i ∈ {1, 2, ..., C}, I is the dimension of the training samples, N is the number of training samples, and C is the number of categories, e(t−1) of the (t−1)th iteration is defined as a class-prior-weighted combination, where P(c) is the prior probability of class c, and the hit and miss reconstruction points of x_i in class c are its within-class and between-class local-hyperplane points.
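Written out under the assumption that the misses of the other classes are weighted by their prior probabilities, a standard multi-class Relief-style form of this definition is (an assumed reconstruction, since the patent's formula is carried by a figure):

```latex
e(t-1) \;=\; \frac{1}{N}\sum_{i=1}^{N}\Bigl(
    \sum_{c \neq y_i}\frac{P(c)}{1-P(y_i)}\,
    \bigl\lvert x_i - x^{c}_{LH(NM)}\bigr\rvert
    \;-\; \bigl\lvert x_i - x_{LH(NH)}\bigr\rvert \Bigr)
```

where $x^{c}_{LH(NM)}$ is the miss reconstruction point of $x_i$ in class $c$ and $x_{LH(NH)}$ is its within-class hit reconstruction point.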
Further, the step S4 includes,
using the matching degree to measure the similarity between two samples. Let the best matching degree between a new sample x and its best-matching sample s_1 be W_s1, and the matching degree between x and the second-best matching sample s_2 be W_s2. The matching degree is defined as

W(x, s) = (1/m) Σ_{i=1}^{m} min(f(x_i), f(s_i)) / max(f(x_i), f(s_i)),

where m is the feature dimension; f(x_i) and f(s_i) are the ith features of samples x and s, and min(f(x_i), f(s_i)) and max(f(x_i), f(s_i)) are the smaller and the larger of f(x_i) and f(s_i). Each ratio represents the similarity of x and s in one feature, and W(x, s) takes a value between 0 and 1; the closer the matching degree is to 1, the higher the similarity between the two samples.
After W_s1 and W_s2 are obtained, they are compared with the matching degree threshold T. If W_s1 > W_s2 > T or W_s1 > T > W_s2, the new sample x and the best-matching sample s_1 belong to the same category; if T > W_s1, x belongs to a new class, becomes the initial sample of the new class, and inter-class incremental learning is realized.
The invention has the beneficial effects that the importance of the boundary sample is considered, so that the synthesized sample is more representative; on the basis, the dimension of the data is reduced, and the complex learning process is simplified; finally, for the arrival of new fault types, adopting incremental learning to update the structure and parameters of the CNN network. The method is superior to the existing static model method, and has remarkable robustness and reliability in the diagnosis of chemical faults.
Drawings
FIG. 1 shows a TE process block diagram;
FIG. 2 illustrates a block diagram of a chemical industry fault diagnosis-oriented imbalance correction convolutional neural network in accordance with one embodiment of the present invention;
FIG. 3 shows a II-CNN framework proposed by the present invention;
FIG. 4 shows a frame diagram of a data dimension reduction algorithm proposed by the present invention;
FIG. 5 illustrates a framework of an incremental hierarchical model in accordance with the present invention;
FIG. 6 shows the results on two classes of few faults using the method of the present invention, with both graphs (a) and (b) being sensitivity index curves;
FIG. 7 shows the results of class 8 faults at each iteration using the method of the present invention, with (a) being a sensitivity index curve and (b) being a g-mean curve;
FIG. 8 shows the results of class 13 faults at each iteration of the method of the present invention, with (a) being a sensitivity index curve and (b) being a g-mean curve;
FIG. 9 shows accuracy curves for experiments comparing 7 different methods: graph (a) compares the results for different numbers of samples per fault; graph (b) compares the results for different numbers of fault types.
Detailed Description
As shown in fig. 1, the present invention provides a chemical fault diagnosis method based on an unbalance correction convolutional neural network, comprising the steps of,
S1: TE process data preprocessing, i.e., handling discrete (outlier) values and normalizing the data;
S2: after data preprocessing, generating samples for, and extracting valuable information from, the imbalanced data;
S3: after the synthetic samples are obtained, performing data dimension reduction and extracting the key feature variables of faults;
S4: constructing a CNN incremental learning network.
Further, the step S1 includes,
the normalization of the TE process data sample set X is calculated with the following formula:

$$\hat{x}_{ik} = \frac{x_{ik} - x_{i,\min}}{x_{i,\max} - x_{i,\min}}$$

where x_ik is the kth sample value of the ith input variable before normalization;
x̂_ik is the kth sample value of the ith input variable after normalization;
x_i,min = min{x_ik | 1 ≤ k ≤ N};
x_i,max = max{x_ik | 1 ≤ k ≤ N}.
Further, the step S2 includes,
Input: D, the original sample set; k, the number of nearest-neighbor samples; n, the number of samples in D;
Output: T, the minority fault mode dataset;
S21: create a minority dataset T_i for each minority fault type i;
S22: calculate the Euclidean distance d(x_i, y_j) = ||x_i − y_j||_2 between each minority sample x_i in D and each sample y_j;
S23: obtain the k-nearest-neighbor set of x_i;
S24: suppose k′_i (0 ≤ k′_i ≤ k) of these neighbors belong to majority fault modes;
S25: if k/2 ≤ k′_i ≤ k, then x_i is a borderline sample;
S26: let {x′_1, x′_2, ..., x′_m} be the borderline samples, where m is the number of borderline samples;
S27: assign a weight w_i to each borderline sample; the weights determine the application frequency of the borderline samples in the data generation process and are calculated from the distances to the nearest-neighbor samples z_j of the majority fault modes;
S28: generate a synthetic sample through SMOTE according to the formula x_new = x′ + α × (x′ − x), where α is a random number in the range [0, 1];
S29: combine the synthetic samples with the original samples to form a new minority fault mode dataset T′;
S210: complete undersampling with Tomek links, deleting the majority fault samples in each Tomek link pair;
S211: obtain the new minority fault mode dataset T;
Further, the step S3 includes,
Input: {(x_i, y_i)}_{i=1}^N, the training dataset; Ite, the number of iterations; θ, the tolerance; δ(0), the learning rate.
Output: w, the feature weight vector.
S31: initialize the weight vector w(0);
S32: randomly select one sample x;
S33: set e(t−1) = 0 and δ(t) = δ(t−1)/t;
S34: calculate α_i and β_i by solving the local-hyperplane reconstruction problem;
S35: calculate e(t−1) = e(t−1) + (|x − x_LH(NH)| − |x − x_LH(NM)|);
S36: repeat steps S34 and S35 for i = 1, ..., N;
S37: calculate the average value e(t−1) = e(t−1)/N;
S38: calculate z(t−1) = w(t−1) ⊙ e(t−1);
S39: update w(t) by a gradient-ascent step;
S310: judge whether the condition ||w(t) − w(t−1)|| ≤ θ is satisfied: if so, proceed to the next step; if not, loop S32 to S39;
S312: repeat steps S32 to S310 for i = 1, ..., N;
S313: obtain the weight vector w.
Further, the step S4 includes,
Input: x, the new sample; N, the number of training samples; T, a threshold.
Output: W_1 and W_2, the first and second matching degrees;
S41: calculate the matching degree between x and each sample s_i;
S42: repeat step S41 for i = 1, ..., N;
S43: obtain W_1 and W_2;
S44: if ((W_1 > W_2 > T) || (W_1 > T > W_2)), then
x and s_1 belong to the same category, and x is added to the training dataset;
S45: if (T > W_1 > W_2), then
x is a new sample belonging to a new class: x is added to the training dataset of the new class, a new layer is added to the trained CNN, the new parameters are randomly initialized, and the new layer is trained gradually.
Further, the step S1 includes,
the TE process has 5 main units, including a chemical reactor, a recycle compressor, a condenser, a stripper, and a vapor/liquid separator; the variables of the TE process include 12 inputs and 41 outputs; the TE simulator generates 22 different types of status data, including 21 standard fault types and normal status data;
the 21 fault state types of the TE process are as follows:
Fault 1: A/C feed ratio, B composition constant;
Fault 2: B composition, A/C ratio constant;
Fault 3: D feed temperature;
Fault 4: reactor cooling water inlet temperature;
Fault 5: condenser cooling water inlet temperature;
Fault 6: A feed loss;
Fault 7: C header pressure loss;
Fault 8: A, B, C feed composition;
Fault 9: D feed temperature;
Fault 10: C feed temperature;
Fault 11: reactor cooling water inlet temperature;
Fault 12: condenser cooling water inlet temperature;
Fault 13: reaction kinetics;
Fault 14: reactor cooling water valve;
Fault 15: condenser cooling water valve;
Faults 16-20: unknown type;
Fault 21: valve in stream 4.
Here A, C, and D denote three different gaseous reactants, B denotes an inert component, and during the TE process the reactants and the inert component are fed into the reactor; stream 4 refers to the corresponding valve position.
Further, the step S3 includes,
Given a training sample set {(x_i, y_i)}_{i=1}^N, where x_i ∈ R^I and y_i ∈ Y = {−1, +1}, I and N are the dimension and the number of the training samples, respectively. On a local hyperplane, x_i can be represented by its local-hyperplane point. Each feature is assigned an appropriate weight; the greater the weight, the more important the feature.
A weight is assigned to each feature by maximizing the expected margin. Let the local-hyperplane representation of x_i be WHα, where H ∈ R^{I×k} is the sample matrix whose columns are the k nearest-neighbor samples of x_i, W is a diagonal matrix whose diagonal element w_ii represents the weight of the ith feature, and each element of α ∈ R^k is a reconstruction coefficient of a nearest-neighbor sample. The optimization problem is expressed as maximizing the expected margin
s.t. ||w||_2 = 1, w_j ≥ 0, j = 1, ..., I,
where H_i ∈ R^{I×k} is the matrix of the k homogeneous (same-class) neighbors of x_i, M_i ∈ R^{I×k} is the matrix of the k heterogeneous (different-class) neighbors of x_i, α_i and β_i are the reconstruction coefficients of the nearest samples, from the same class and from the opposite class respectively, and w represents the weight margin vector.
w(t) and z(t) represent the weights of the weighted feature space and the expected boundary vector of the tth iteration, respectively. The objective function is maximized subject to
z(t−1) = w(t−1) ⊙ e(t−1),
||w(t)||_2 = 1, w_j(t) ≥ 0, j = 1, ..., I, t = 1, ..., Ite,
where e(t) is the expected margin vector of the original space at the tth iteration, Ite is the maximum number of iterations, and ⊙ denotes the element-wise product.
The nearest neighbors of a given sample x_i are represented by points on the local hyperplane. The final weight vector may be obtained by maximizing the margin between the given sample and the local hyperplane. Thus, x_LH(NH) and x_LH(NM) can be expressed through α_i ∈ R^k and β_i ∈ R^k, the reconstruction coefficient vectors of the homogeneous and heterogeneous neighbors, respectively. α_i and β_i are obtained by solving the regularized optimization problem, where ||·||_2 is the 2-norm and λ is the regularization factor.
If t = 0, LHD-Relief initializes the feature weights. At the (t−1)th iteration, α_i and β_i are obtained for each sample (i = 1, ..., N). The feature weight factor w(t) is then updated by a gradient-ascent step with learning rate δ (δ(t) = δ/t, t = 1, 2, ..., Ite, 0 < δ(t) ≤ 1).
Given a training sample set {(x_i, y_i)}_{i=1}^N, where y_i ∈ Y = {1, 2, ..., C}, the samples have dimension I and number N. Thus, e(t−1) of the (t−1)th iteration is defined as a class-prior-weighted combination, where P(c) is the prior probability of class c, and the hit and miss reconstruction points of x_i in class c are its within-class and between-class local-hyperplane points.
Further, the step S4 includes,
The matching degree is used to measure the similarity between two samples. Let the best matching degree between a new sample x and the best-matching sample s_1 be W_s1, and the matching degree between x and the second-best matching sample s_2 be W_s2. The matching degree is defined as
W(x, s) = (1/m) Σ_{i=1}^{m} min(f(x_i), f(s_i)) / max(f(x_i), f(s_i)),
where m is the feature dimension and f(x_i) and f(s_i) are the ith features of x and s; min(f(x_i), f(s_i)) and max(f(x_i), f(s_i)) are the smaller and the larger of f(x_i) and f(s_i). W(x, s) represents the similarity of x and s and takes a value between 0 and 1; the closer the matching degree is to 1, the higher the similarity between the two samples.
After W_s1 and W_s2 are obtained, their values are compared with the matching degree threshold T. If W_s1 > W_s2 > T or W_s1 > T > W_s2, the new sample x and the best-matching sample s_1 belong to the same category. If T > W_s1, x belongs to a new class, becomes the initial sample of the new class, and inter-class incremental learning is realized.
To diagnose new faults, new classifications are automatically added to the existing network. The new layers inherit the topology and learned knowledge of the trained CNN, so the network can update itself to include new fault classes without a complete retraining process. These layers are not trained from scratch; they are initialized by copying the parameters of the old layers and then trained step by step. Samples belonging to the new class are applied to the modified CNN, and the corresponding new layer is trained incrementally.
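A minimal PyTorch-style sketch of this update, assuming the trained CNN ends in a single nn.Linear classification head; the incremental hierarchy of FIG. 5 is not reproduced here, and all names are illustrative:

```python
import torch
import torch.nn as nn

def add_fault_class(model: nn.Sequential, n_new: int = 1) -> nn.Sequential:
    """Widen the classification head for n_new extra fault classes,
    inheriting the trained weights for the old classes and randomly
    initializing only the rows for the new class(es)."""
    old_fc = model[-1]                            # assumed final classification layer
    n_old, d = old_fc.out_features, old_fc.in_features
    new_fc = nn.Linear(d, n_old + n_new)          # new rows start randomly initialized
    with torch.no_grad():
        new_fc.weight[:n_old] = old_fc.weight     # inherit old-class parameters
        new_fc.bias[:n_old] = old_fc.bias
    model[-1] = new_fc
    for p in model.parameters():                  # freeze the inherited feature layers
        p.requires_grad = False
    for p in new_fc.parameters():                 # train only the widened head gradually
        p.requires_grad = True
    return model
```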
The meanings of the abbreviations used in the present invention are described below.
II-CNN denotes the proposed incremental imbalance-corrected convolutional neural network.
The TE process data is used as an experimental basis, a TE process structure diagram is shown in figure 1, and the fault diagnosis is carried out by adopting the method of the invention.
(1) The TE simulator can generate 22 different types of state data, including 21 standard fault types and normal state data. All data sets are sampled in the basic mode of the TE process, with corresponding training and testing data for each of the faults described above. To test the performance of the proposed method, the experiments are divided into two cases. The first case simulates an imbalanced data stream in the chemical process, with 6 fault types selected, in order to test the diagnostic performance of the proposed method on imbalanced fault data. The second case tests the incremental learning performance of the method: 10 fault types are selected initially and then extended to 15 fault types.
(2) Comparison with other methods
The fault types are preprocessed, their outputs are used as inputs to the CNN, and all methods share the same CNN structure, so the fault diagnosis performance can be compared. The DBN performs very well among deep learning methods, so a DBN is used for comparison with the present invention. Some typical shallow fault diagnosis models are also compared, including the widely used back-propagation neural network (BPNN) and support vector machine (SVM). These comparisons demonstrate the fault diagnosis performance of deep learning methods, because the shallow methods omit the feature learning process. Here, the SVM is the scikit-learn implementation with an RBF kernel and the parameter γ = 1/df, where df is the number of features of the original data. The BPNN has 5 layers (52, 42, 32, 22, and 10 neurons per layer, respectively). To obtain the best BPNN diagnostic performance, the learning rate was set to 0.5.
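A sketch of these two baseline configurations, with scikit-learn's MLPClassifier standing in for the BPNN (an assumption) and df = 52 assumed as the feature count of the raw TE data:

```python
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

df = 52                                   # feature count of the raw TE data (assumed)
svm = SVC(kernel="rbf", gamma=1.0 / df)   # RBF kernel with gamma = 1/df, as stated

# 52-42-32-22-10 BPNN baseline: three hidden layers, output sized by the
# class count; the stated learning rate of 0.5 is passed directly.
bpnn = MLPClassifier(hidden_layer_sizes=(42, 32, 22), learning_rate_init=0.5)
```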
Example 1: diagnosis model experiment of unbalanced fault data
To evaluate the performance of the present invention, 6 faults with a specific imbalance ratio were selected for training, with faults 8 and 13 as the minority-sample fault types. As shown in fig. 6, the diagnostic advantage in the minority failure modes is evident: the present invention identifies minority faults markedly better than the other methods, with performance higher by 6.7% and 2.9%, respectively, which demonstrates the benefit of generating minority fault samples. Fig. 6 also shows that the invention is superior to the shallow models because it can efficiently extract features from the raw data and process imbalanced data in complex chemical processes. Owing to its deep architecture, the invention can effectively handle imbalanced chemical data with many variables and highly nonlinear relationships.
Figs. 7 and 8 show the performance of the invention in diagnosing minority faults. In this case, the imbalance ratio is increased to test the models above. Initially, the number of samples for both the type-8 and type-13 faults is 50, and 30 samples are added per iteration. The invention greatly improves the diagnostic performance on minority fault types: its sensitivity index and g-mean are 3.7% and 1.9% higher than those of the existing methods, respectively. The invention preserves the original features of minority faults as far as possible and provides the most meaningful diagnosis.
The experimental results show that as the number of minority-fault samples increases, after about 10 iterations of the proposed method the original knowledge of the minority faults is sufficient, and each model can effectively extract features from the raw data. As can be seen from figs. 7 and 8, the invention can effectively solve the class imbalance problem.
The sensitivity index for all fault types for the different diagnostic methods is shown in table 1. The results show that the method provided by the invention can remarkably improve the diagnosis performance of few fault types. The present invention can well solve the problem of chemical data imbalance because it attempts to learn rare types of faults from the imbalance data.
TABLE 1
Example 2: diagnostic model experiments with increased fault types
The incremental learning capabilities of the present invention for new samples and fault categories are described herein. The invention can adaptively update new faults. The number of faults is gradually increased from 10 to 15. The experimental results of the first 10 failures are shown in fig. 9 (a). It illustrates the incremental learning capabilities of the new samples. In fig. 9 (a), the x-axis represents the number of training samples for each failure category, and the y-axis represents the accuracy of the diagnostic model test samples. Here 200 samples of each fault class are used to initialize each diagnostic model. Then, for each step, 50 samples will be added for each failure category to test the incremental learning capabilities of the proposed method. In this case, SVM, BPNN, DBN, CNN would be fully trained based on the corresponding data set for comparison.
When a fault class newly occurs, the incremental learning capability of the present invention is shown in fig. 9 (b). In fig. 9 (b), the x-axis represents the number of fault categories and the y-axis represents the accuracy of the different diagnostic methods. The initial diagnostic model will be trained here to diagnose 10 faults in the TE process samples. A new fault class will then be added for each step to test the incremental learning capabilities of each diagnostic method until all 15 fault classes are imported into the diagnostic model. From fig. 9 (b), it can be seen that the present invention performs better than other methods. The diagnostic performance of the invention is superior to other methods. This is because convolution operations can effectively extract the fault trend and nonlinear characteristics of the fault process.
The comparison experiments show that the proposed II-CNN framework is more effective in chemical process fault diagnosis than both conventional methods and deep learning baselines.
While the invention has been disclosed in connection with the preferred embodiments shown and described in detail, it will be understood by those skilled in the art that various modifications may be made to the above-described method for diagnosing a chemical failure based on an imbalance correction convolutional neural network, without departing from the scope of the invention. Accordingly, the scope of the invention should be determined from the following claims.
Claims (1)
1. A chemical fault diagnosis method based on unbalance correction convolutional neural network is characterized by comprising the following steps,
S1: TE process data preprocessing, i.e., handling discrete (outlier) values and normalizing the data;
S2: generating samples for, and extracting information from, the imbalanced data;
S3: performing data dimension reduction and extracting the key feature variables of faults;
S4: constructing a CNN incremental learning network;
the step S3 of this method comprises the steps of,
Input: {(x_u, y_u)}_{u=1}^U, the training dataset; Ite, the iteration number; θ, the tolerance error; δ(0), the learning rate; U, the number of training samples;
Output: w, the feature weight vector;
S31: initialize the weight vector w(0), where M represents the dimension of the samples;
S32: randomly select one sample x_u;
S33: set e(t−1) = 0 and δ(t) = δ(t−1)/t, where t represents the iteration number;
S34: calculate α_u and β_u by solving the local-hyperplane reconstruction problem, where H is the sample matrix and λ is the regularization factor;
S35: update e(t−1) with the following formula:
e(t−1) = e(t−1) + (|x_u − x_LH(NH)| − |x_u − x_LH(NM)|);
S36: within the range u = 1:U, cyclically execute steps S34 and S35 until they have been executed U times;
S37: update e(t−1) with the following formula:
e(t−1) = e(t−1)/U;
S38: calculate z(t−1) = w(t−1) ⊙ e(t−1);
S39: update w(t) by a gradient-ascent step;
S310: judge whether the condition ||w(t) − w(t−1)|| ≤ θ is satisfied: if so, proceed to the next step; if not, loop S32 to S39;
S312: within the range u = 1:U, cyclically execute steps S32 to S310 until they have been executed U times;
S313: obtain the weight vector w;
the step S4 of said step comprises,
Input: s, the new sample; U, the number of training samples; T, a threshold;
Output: W_1 and W_2, the first and second matching degrees;
S41: calculate the matching degree between the new sample s and each sample x_u;
S42: within the range u = 1:U, cyclically execute step S41 until it has been executed U times;
S43: obtain W_1 and W_2;
S44: if ((W_1 > W_2 > T) || (W_1 > T > W_2)), then
s and x_u belong to the same category, and s is added to the training dataset;
S45: if (T > W_1 > W_2), then
s is a new sample belonging to a new class: s is added to the training dataset of the new class, a new layer is added to the trained CNN, the new parameters are randomly initialized, and the new layer is trained gradually;
in the step S1,
the TE process has 5 units, including a chemical reactor, a recycle compressor, a condenser, a stripper, and a vapor/liquid separator; the TE simulator generates 22 different types of status data, including 21 standard fault types and normal status data;
the 21 fault state types of the TE process are as follows:
Fault 1: A/C feed ratio, B composition constant;
Fault 2: B composition, A/C ratio constant;
Fault 3: D feed temperature;
Fault 4: reactor cooling water inlet temperature;
Fault 5: condenser cooling water inlet temperature;
Fault 6: A feed loss;
Fault 7: C header pressure loss;
Fault 8: A, B, C feed composition;
Fault 9: D feed temperature;
Fault 10: C feed temperature;
Fault 11: reactor cooling water inlet temperature;
Fault 12: condenser cooling water inlet temperature;
Fault 13: reaction kinetics;
Fault 14: reactor cooling water valve;
Fault 15: condenser cooling water valve;
Faults 16-20: unknown type;
Fault 21: valve in stream 4;
wherein A, C, and D denote three different gaseous reactants, B denotes an inert component, and during the TE process the reactants and the inert component are fed into the reactor; stream 4 refers to the corresponding valve position;
the step S3 of this method comprises,
given a training sample set {(x_u, y_u)}_{u=1}^U, where x_u ∈ R^M and y_u ∈ {−1, +1}, M and U are the dimension and the number of the training samples, respectively, R^M represents the sample space, and on a local hyperplane x_u is represented by its local-hyperplane point;
let the local-hyperplane representation of x_u be WHα, where H ∈ R^{M×k} is the sample matrix whose columns are the k nearest-neighbor samples of x_u, W is a diagonal matrix whose diagonal element w_mm represents the weight of the mth feature, and each element of α ∈ R^k is a reconstruction coefficient of a nearest-neighbor sample; the optimization problem is to maximize the expected margin
s.t. ||w||_2 = 1, w_mm ≥ 0, m = 1, ..., M,
where H_u ∈ R^{M×k} is the matrix of the k homogeneous (same-class) neighbors of x_u, M_u ∈ R^{M×k} is the matrix of the k heterogeneous (different-class) neighbors of x_u, α_u and β_u are the reconstruction coefficients of the nearest samples, from the same class and from the opposite class respectively, and w represents the weight margin vector;
w(t) represents the weights of the weighted feature space at the tth iteration, z(t) represents the expected boundary vector of the tth iteration, and the objective function is maximized subject to
z(t−1) = w(t−1) ⊙ e(t−1),
||w(t)||_2 = 1, w_m(t) ≥ 0, m = 1, ..., M, t = 1, ..., Ite,
where e(t) is the expected margin vector of the original space at the tth iteration, Ite is the maximum number of iterations, and ⊙ denotes the element-wise product;
the nearest neighbors of a given sample x_u are represented by points on the local hyperplane, and the final weight vector is obtained by maximizing the margin between the given sample and the local hyperplane; thus the hit point x_LH(NH) and the miss point x_LH(NM) can be expressed through α_u, the reconstruction coefficient vector of the homogeneous neighbors, and β_u, the reconstruction coefficient vector of the heterogeneous neighbors, and α_u and β_u are obtained by solving the regularized optimization problem, where ||·||_2 is the 2-norm and λ is a regularization factor;
if t = 0, the feature weights are initialized; at the (t−1)th iteration, α_u and β_u are obtained for each sample, u = 1, ..., U, and the feature weight factor w(t) is then updated by a gradient-ascent step with learning rate δ, where δ(t) = δ/t, t = 1, 2, ..., Ite, and 0 < δ(t) ≤ 1;
given a training sample set {(x_u, y_u)}_{u=1}^U, where y_u ∈ {1, 2, ..., C}, M is the dimension of the training samples, U is the number of training samples, and C represents the number of categories, e(t−1) of the (t−1)th iteration is defined as a class-prior-weighted combination, where P(c) is the prior probability of class c, and the intra-class and extra-class reconstruction points of x_u in class c are its hit and miss local-hyperplane points;
the step S4 of said step comprises,
using the matching degree to measure the similarity between two samples; let the best matching degree between a new sample s and the best-matching sample be W_1, and the matching degree between s and the second-best matching sample be W_2; the matching degree is defined as
W_g(s, x) = (1/M) Σ_{m=1}^{M} min(f(s_m), f(x_m)) / max(f(s_m), f(x_m)),
where M is the feature dimension; f(s_m) and f(x_m) are the mth features of sample s and sample x; min(f(s_m), f(x_m)) and max(f(s_m), f(x_m)) are the smaller and the larger of f(s_m) and f(x_m); and W_g represents the similarity of x and s, taking a value between 0 and 1;
after W_1 and W_2 are obtained, their values are compared with the matching degree threshold T: if W_1 > W_2 > T or W_1 > T > W_2, the new sample s and the best-matching sample belong to the same category; if T > W_1, s belongs to a new class and becomes the initial sample of the new class.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110248735.6A CN113033079B (en) | 2021-03-08 | 2021-03-08 | Chemical fault diagnosis method based on unbalance correction convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110248735.6A CN113033079B (en) | 2021-03-08 | 2021-03-08 | Chemical fault diagnosis method based on unbalance correction convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113033079A CN113033079A (en) | 2021-06-25 |
CN113033079B true CN113033079B (en) | 2023-07-18 |
Family
ID=76466690
Family Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---|
CN202110248735.6A Active CN113033079B (en) | 2021-03-08 | 2021-03-08 | Chemical fault diagnosis method based on unbalance correction convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113033079B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Title
---|---|---|---
CN114038169A (en) * | 2021-11-10 | 2022-02-11 | 英业达(重庆)有限公司 | Method, device, equipment and medium for monitoring faults of production equipment |
CN117407824B (en) * | 2023-12-14 | 2024-02-27 | 四川蜀能电科能源技术有限公司 | Health detection method, equipment and medium of power time synchronization device |
Citations (7)
Publication number | Priority date | Publication date | Title
---|---|---|---
CN107784325A (en) * | 2017-10-20 | 2018-03-09 | 河北工业大学 | Spiral fault diagnosis model based on the fusion of data-driven increment |
CN109816044A (en) * | 2019-02-11 | 2019-05-28 | 中南大学 | A kind of uneven learning method based on WGAN-GP and over-sampling |
CN110070060A (en) * | 2019-04-26 | 2019-07-30 | 天津开发区精诺瀚海数据科技有限公司 | A kind of method for diagnosing faults of bearing apparatus |
CN110244689A (en) * | 2019-06-11 | 2019-09-17 | 哈尔滨工程大学 | A kind of AUV adaptive failure diagnostic method based on identification feature learning method |
CN110334580A (en) * | 2019-05-04 | 2019-10-15 | 天津开发区精诺瀚海数据科技有限公司 | The equipment fault classification method of changeable weight combination based on integrated increment |
CN111580506A (en) * | 2020-06-03 | 2020-08-25 | 南京理工大学 | Industrial process fault diagnosis method based on information fusion |
CN112200104A (en) * | 2020-10-15 | 2021-01-08 | 重庆科技学院 | Chemical engineering fault diagnosis method based on novel Bayesian framework for enhanced principal component analysis |
Family Cites Families (1)
Publication number | Priority date | Publication date | Title
---|---|---|---
EP4446941A2 (en) * | 2017-05-23 | 2024-10-16 | INTEL Corporation | Methods and apparatus for discriminative semantic transfer and physics-inspired optimization of features in deep learning |
Non-Patent Citations (4)
Title
---
"Broad Convolutional Neural Network Based Industrial Process Fault Diagnosis With Incremental Learning Capability"; Wanke Yu et al.; IEEE Transactions on Industrial Electronics; vol. 67, no. 6; pp. 5081-5091 *
"Incremental Learning of Random Forests for Large-Scale Image Classification"; Marko Ristin et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; pp. 490-503 *
"Research on chemical fault diagnosis methods based on deep learning" (基于深度学习的化工故障诊断方法研究); Hu Zhixin; China Masters' Theses Full-text Database, no. 1; pp. 17-43, sections 2, 3, and 4 *
"A review of mechanical fault diagnosis methods based on convolutional neural networks" (基于卷积神经网络的机械故障诊断方法综述); Wu Dinghai et al.; Mechanical Strength (机械强度), no. 5; pp. 12-20 *
Also Published As
Publication number | Publication date |
---|---|
CN113033079A (en) | 2021-06-25 |
Similar Documents
Publication | Publication Date | Title
---|---|---
CN113469219B | | Rotary machine fault diagnosis method under complex working conditions based on meta transfer learning
CN113033079B | 2023-07-18 | Chemical fault diagnosis method based on unbalance correction convolutional neural network
CN106843195B | | Fault classification method based on adaptive ensemble semi-supervised Fisher discrimination
CN113052218A | | Multi-scale residual convolution and LSTM fusion performance evaluation method for industrial processes
CN112085252B | | Counterfactual prediction method for set-type decision effects
CN110297480B | | TE process fault diagnosis method of a deep belief network model based on parameter optimization
CN112784920A | | Cloud-edge-end coordinated dual adversarial domain-adaptation fault diagnosis method for rotating parts
CN111985825A | | Crystal face quality evaluation method for a roller mill orientation instrument
Sitawarin et al. | | Minimum-norm adversarial examples on KNN and KNN based models
CN113541985A | | Internet of things fault diagnosis method, model training method, and related devices
CN111709577B | | RUL prediction method based on long-range correlation GAN-LSTM
CN117349595A | | Industrial-scene-oriented fault diagnosis method for extremely few samples
Lee et al. | | Data-driven fault detection for chemical processes using autoencoder with data augmentation
Yang et al. | | A new chiller fault diagnosis method under the imbalanced data environment via combining an improved generative adversarial network with an enhanced deep extreme learning machine
CN110674893B | | Self-adaptive correction method for diagnosis experience in a rotary machine fault diagnosis knowledge base
CN111723857A | | Intelligent monitoring method and system for the running state of process production equipment
CN113688875B | | Industrial system fault identification method and device
CN115578325A | | Image anomaly detection method based on a channel attention registration network
CN109547248A | | Artificial-intelligence-based fault diagnosis method and device for on-orbit spacecraft ad hoc networks
Huang et al. | | Label propagation dictionary learning based process monitoring method for industrial process with between-mode similarity
CN115564021A | | Fault root cause ranking method in the polyester fiber polymerization process
CN113610148A | | Fault diagnosis method based on bias-weighted AdaBoost
Sun et al. | | A multi-stage semi-supervised improved deep embedded clustering (MS-SSIDEC) method for bearing fault diagnosis under the situation of insufficient labeled samples
Lu et al. | | Three-layer deep learning network random trees for fault diagnosis in chemical production process
CN114707098B | | Aeroengine performance degradation state evaluation method based on multi-source sensor state distance
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |