Background
As modern chemical production systems grow larger and more complex, they rely increasingly on computer technology and advanced instrumentation, and on artificial intelligence for production management, monitoring, scheduling and related problems. Because advanced instruments and storage equipment are installed and used in large numbers, chemical processes can measure massive amounts of sampling data online in real time and store them offline. These data contain potentially useful information that reflects the operating state of the production process, and thus provide a rich data foundation for monitoring the operating state of the chemical process. How fully and effectively the sampling data are used to monitor fault conditions in real time therefore reflects the level of digital and intelligent management of a modern chemical process. In recent decades, both academia and industry have invested considerable manpower and material resources in researching data-driven fault detection methods. Among these, statistical process monitoring is the most widely studied, with principal component analysis (PCA) and independent component analysis (ICA) as its mainstream implementation techniques. The core of such methods is to mine the sampling data for underlying useful information or features.
Owing to improved computing capability and the wide application of advanced measuring instruments, chemical process sampling data inevitably exhibit serial autocorrelation, so dynamic process monitoring techniques are more applicable than traditional static ones. In general, both serial autocorrelation and cross-correlation are inherent features of the sampling data and must be fully considered in data modeling and feature extraction. In the existing literature and patents, dynamic process monitoring is mostly implemented by augmenting each sample with time-lagged measurements, that is, several samples that are consecutive in sampling time are treated as one sample before modeling and monitoring. Typical representatives of this approach are dynamic PCA and dynamic ICA, both of which extract serial autocorrelation and cross-correlation simultaneously, mixed together. Recently, methods have also been proposed that mine latent features of the sampling data by maximizing a covariance objective function without using augmented vectors or matrices; these are typically based on dynamic latent variable (DLV) models.
However, extraction of the serial autocorrelation features of the sampling data should take canonical correlation fully into account. Although covariance reflects correlation to some extent, a large covariance can also result merely from collinearity among the data. Therefore, to fully extract the serial autocorrelation of the sampling data, the canonical correlation coefficient should be used as the measure. Mining the serial autocorrelation features is of great significance for monitoring faults in chemical processes, because the adverse effects of many faults on the sampling data appear along the sampling time sequence. For example, a sticking pipeline valve may delay the effect of a manipulated variable, which is an adverse effect in the time series. Fully mining the latent variables that are canonically autocorrelated over time, and describing them properly, therefore benefits fault detection for the operating state of a chemical process.
Disclosure of Invention
The main technical problem to be solved by the invention is how to mine hidden autocorrelation latent variables from sampling data using the canonical correlation coefficient as the measure, so that the operating state of the chemical process can be monitored effectively on the basis of these latent variables. Specifically, the method first derives a new autocorrelation latent variable model which, with the aim of maximizing the canonical correlation of the latent variables over the time series, optimizes a transformation basis for the time-series sample data so as to obtain the canonically autocorrelated latent variables. Then, the correlation among the latent variables is described using a least squares regression algorithm. Finally, the operating state of the chemical process is monitored in real time using the regression error of the autocorrelation latent variables and the static latent variables, respectively.
The technical scheme adopted for solving the technical problems is as follows: a chemical process state monitoring method based on an autocorrelation latent variable model comprises the following steps:
Step (1): Collect n samples $x_1, x_2, \ldots, x_n$ under the normal operating state of the chemical process to form the training data matrix $X = [x_1, x_2, \ldots, x_n]^T \in R^{n \times m}$, and standardize X to obtain the matrix $\bar{X} = [\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n]^T$, where m is the number of measured variables, R is the set of real numbers, $R^{n \times m}$ denotes the set of real matrices of dimension n×m, $x_i \in R^{m \times 1}$ and $\bar{x}_i$ denote the i-th sample and its standardized data vector, respectively, i = 1, 2, …, n, and the superscript T denotes the transpose of a matrix or vector.
It should be noted that each sample of the chemical process generally consists of measurements from instruments for temperature, pressure, flow, liquid level and the like. That the number of measured variables in Step (1) is m indicates that m measuring instruments sample the chemical process object in real time.
Furthermore, since the ranges of variation of the measured variables are unlikely to be the same, differences in scale exist among them. It is therefore necessary to standardize the sampling data of each measured variable so that it has a mean of 0 and a standard deviation of 1.
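The standardization described above can be sketched as follows. This is a minimal illustration in Python with NumPy and is not part of the original disclosure: the function names `fit_standardizer` and `apply_standardizer` are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text only requires a mean of 0 and a standard deviation of 1.

```python
import numpy as np

def fit_standardizer(X):
    """Return the per-variable mean and standard deviation of training data X (n x m)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # assumption: sample standard deviation
    return mean, std

def apply_standardizer(X, mean, std):
    """Scale each column with the training mean and standard deviation."""
    return (X - mean) / std

# Illustrative use on synthetic data (not process data).
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=3.0, size=(50, 4))
mean, std = fit_standardizer(X_train)
X_bar = apply_standardizer(X_train, mean, std)
```

The same `mean` and `std` would be reused for new samples in the online phase, which is what "the same standardization as in Step (1)" amounts to. A variable that is constant in the training data would give a zero standard deviation and would need separate handling.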
Step (2): After setting the autocorrelation order of the time series to D (generally, D = 3 or 4 is preferable), construct the D time-series matrices $X_1, X_2, \ldots, X_D$ in turn according to the following formula:
$$X_d = [\bar{x}_d, \bar{x}_{d+1}, \ldots, \bar{x}_{d+N-1}]^T \in R^{N \times m} \qquad (1)$$
In the above formula, d = 1, 2, …, D and N = n − D + 1.
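The construction of the D time-series matrices can be sketched as follows. This is a hedged illustration only: the function name `build_lagged_matrices` is hypothetical, and the row alignment (matrix $X_d$ holds standardized samples d through d + N − 1, so that $X_D$ contains the most recent samples) is an assumption inferred from N = n − D + 1 and from the online steps.

```python
import numpy as np

def build_lagged_matrices(X_bar, D):
    """Split standardized data X_bar (n x m) into D shifted blocks of N = n - D + 1 rows.

    Returns a list [X_1, ..., X_D]; block X_d (1-based) starts at sample d.
    """
    n = X_bar.shape[0]
    N = n - D + 1
    if N < 1:
        raise ValueError("D must not exceed the number of samples")
    return [X_bar[d:d + N] for d in range(D)]  # 0-based offset d corresponds to X_{d+1}

# Illustrative use: 10 samples, 3 variables, order D = 4 gives four 7 x 3 blocks.
X_bar = np.arange(30, dtype=float).reshape(10, 3)
blocks = build_lagged_matrices(X_bar, D=4)
```

With this alignment, row i of every block refers to the same time window, so products such as $X_k^T X_\lambda$ compare samples that are a fixed number of sampling intervals apart.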
The next step is to separate the autocorrelation latent variables from the training data, which requires the autocorrelation latent variable model algorithm of the method of the present invention. The autocorrelation latent variable model is a new modeling algorithm that uses a projection transformation basis W to convert the standardized data into a latent variable matrix S such that the time-series autocorrelation of each column vector of S is maximized. The corresponding objective function is as follows:
In the above formula, k = 1, 2, …, D and λ = 1, 2, …, D; I denotes the identity matrix; the squared norm denotes the sum of squares of the elements of a matrix; s.t. abbreviates "subject to" and introduces the constraint; and $H_{k\lambda}$ is defined as follows:
If a change of variables from W to an orthogonal transformation basis U is introduced, the optimization problem defined in formula (2) can be converted into the following form:
In the above formula, $C_{k\lambda} = X_k^T X_\lambda$. In this way, the problem of solving for the projection transformation basis W in formula (2) becomes the problem of solving for the orthogonal transformation basis U in formula (4).
As can be seen from formula (2), the objective function squares and then sums the canonical correlations of the latent-variable time series. Squaring prevents negative canonical correlations from working against the maximization of the objective function. Furthermore, since formula (2) aims to maximize the canonical correlation of each latent variable over the time series, the transformed latent variables are autocorrelated, which is why the method of the present invention calls the model an autocorrelation latent variable model.
Considering that $\|A\|^2 = \mathrm{tr}(A A^T)$, where A denotes any real matrix and tr(·) denotes the trace of a matrix (the sum of its diagonal elements, or equivalently the sum of all its eigenvalues), the objective function in formula (4) can be equivalently transformed as follows:
In the above formula, $\Phi_U$ is a matrix that depends on U. Clearly, since $\Phi_U$ is symmetric, the optimal solution U of formula (4) consists of the eigenvectors of $\Phi_U$. However, $\Phi_U$ is itself coupled with the solution U, so the following iterative solution procedure is designed.
Step (I): Initialize U as an arbitrary m×m random real matrix.
Step (II): Calculate the matrix $\Phi_U$, then solve the eigenvalue problem $\Phi_U \mu = \eta \mu$ for the eigenvectors $\mu_1, \mu_2, \ldots, \mu_m$ corresponding to all eigenvalues, arrange the eigenvectors in descending order of eigenvalue, i.e. $\eta_1 \geq \eta_2 \geq \ldots \geq \eta_m$, ensure that each eigenvector has unit length, and update the matrix $U = [\mu_1, \mu_2, \ldots, \mu_m]$.
Step (III): If U has converged, go to Step (IV); otherwise, return to Step (II).
Step (IV): Calculate the projection transformation basis W from the converged U.
Step (3): Solve for the projection transformation matrix W according to Steps (I) to (IV) above.
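The iteration in Steps (I) to (III) can be sketched generically as follows. This is an illustration under stated assumptions, not the patent's implementation: the definition of $\Phi_U$ is not reproduced in the text, so it is passed in as a hypothetical callable `build_phi(U)`; the function name `solve_orthogonal_basis`, the tolerance, the iteration cap and the sign-insensitive convergence test are likewise assumptions. Step (IV), the conversion from U back to W, is omitted because its formula is not given here.

```python
import numpy as np

def solve_orthogonal_basis(build_phi, m, tol=1e-8, max_iter=500, seed=0):
    """Fixed-point iteration: U <- eigenvectors of Phi(U), sorted by descending eigenvalue."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((m, m))        # Step (I): random initialization
    for _ in range(max_iter):
        phi = build_phi(U)                 # Step (II): Phi depends on the current U
        phi = 0.5 * (phi + phi.T)          # remove round-off asymmetry
        eta, vecs = np.linalg.eigh(phi)    # eigh returns unit-length eigenvectors
        order = np.argsort(eta)[::-1]      # descending eigenvalues
        U_new = vecs[:, order]
        # Step (III): compare matching columns, ignoring the arbitrary sign of each eigenvector
        change = np.max(np.abs(np.abs(np.sum(U_new * U, axis=0)) - 1.0))
        U = U_new
        if change < tol:
            break
    return U

# Toy check with a Phi that ignores U, so the iteration settles after one update.
M = np.diag([1.0, 3.0, 2.0])
U = solve_orthogonal_basis(lambda U_cur: M, m=3)
```

In the toy case the returned U simply diagonalizes M with eigenvalues in descending order. For a $\Phi_U$ that truly depends on U, convergence is not guaranteed by this sketch, which is why an iteration cap is included.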
Step (4): Determine the number of autocorrelation latent variables, d, and divide the projection transformation matrix W accordingly into two parts, $W_1$ and $W_2$, where $W_1$ consists of the first d column vectors of W and $W_2$ consists of the last m − d column vectors of W.
In determining the number of autocorrelation latent variables, no latent variable with a significant canonical correlation over the time series may be missed. For this purpose, the method of the present invention determines the number d of autocorrelation latent variables objectively using the procedure below.
Step (I): After calculating the time-series score matrices $S_1, S_2, \ldots, S_D$ according to the formula $S_k = X_k W$, initialize j = 1 and d = 0, and let $S_1(j), S_2(j), \ldots, S_D(j)$ denote the j-th column vectors of $S_1, S_2, \ldots, S_D$, respectively.
Step (II): Calculate the canonical correlation magnitudes $J_{k\lambda} = |S_k(j)^T S_\lambda(j)| H_{k\lambda}$ among $S_1(j), S_2(j), \ldots, S_D(j)$, where k = 1, 2, …, D and λ = 1, 2, …, D, and denote the maximum of $J_{k\lambda}$ as $J_{max}$.
Step (III): After setting the cutoff parameter, if $J_{max}$ exceeds the cutoff, set d = d + 1 and j = j + 1 and return to Step (II); otherwise, the number of autocorrelation latent variables is d.
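The counting procedure above can be sketched as follows. This is a hedged illustration: the function name `count_autocorrelated` is hypothetical, the weight matrix `H` and the `cutoff` value are supplied by the caller because their definitions are not reproduced in the text, and the example uses an upper-triangular `H` (weight 1 for k < λ, 0 otherwise) purely as an assumption.

```python
import numpy as np

def count_autocorrelated(S_list, H, cutoff):
    """Count leading latent variables whose largest weighted lag correlation exceeds cutoff.

    S_list: [S_1, ..., S_D], each N x m with comparable column scaling.
    H: D x D array of weights H[k, lam]; cutoff: threshold on J_max.
    """
    D = len(S_list)
    m = S_list[0].shape[1]
    d = 0
    for j in range(m):                                   # j-th latent variable
        J_max = 0.0
        for k in range(D):
            for lam in range(D):
                J = abs(S_list[k][:, j] @ S_list[lam][:, j]) * H[k, lam]
                J_max = max(J_max, J)
        if J_max > cutoff:
            d += 1                                       # significant: keep counting
        else:
            break                                        # first non-significant column stops the search
    return d

# Toy example: column 0 repeats across lags, column 1 does not.
a, b, c = np.eye(4)[0], np.eye(4)[1], np.eye(4)[2]
S1 = np.column_stack([a, b])
S2 = np.column_stack([a, c])
H = np.triu(np.ones((2, 2)), k=1)                        # assumed weights
d = count_autocorrelated([S1, S2], H, cutoff=0.5)
```

Because the columns of W are ordered by descending eigenvalue, the search stops at the first latent variable that fails the test rather than scanning all m columns.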
Step (5): Calculate the autocorrelation latent variable matrix and the static latent variable matrix by projecting the standardized data with $W_1$ and $W_2$, respectively.
Step (6): Use the least squares regression algorithm to establish a regression model between the input matrix formed from the lagged autocorrelation latent variables and the output matrix of the current autocorrelation latent variables, where E is the regression error matrix and Θ denotes the regression coefficient matrix.
Step (7): Calculate the covariance matrix $\Lambda = E^T E/(N-1)$ of E, then calculate the monitoring index vectors $\psi = \mathrm{diag}\{E \Lambda^{-1} E^T\}$ and Q (the latter from the static latent variable matrix), and use kernel density estimation (KDE) to determine the control limits of the two monitoring index vectors at the confidence level α = 99%, denoted δ and β, respectively, where diag{·} denotes the operation of converting the diagonal elements of a matrix into a column vector.
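Steps (6) and (7) can be sketched as follows for the regression-error index ψ. This is a minimal illustration under assumptions: the function names are hypothetical, the input matrix `Z` is taken to be the lagged autocorrelation scores placed side by side, and the KDE uses a Gaussian kernel with a rule-of-thumb bandwidth, neither of which is specified in the text.

```python
import math
import numpy as np

def fit_regression(Z, S_D):
    """Least squares fit S_D ~ Z @ Theta; returns Theta and the error matrix E."""
    Theta, *_ = np.linalg.lstsq(Z, S_D, rcond=None)
    return Theta, S_D - Z @ Theta

def psi_statistic(E):
    """Return Lambda = E^T E / (N - 1) and psi = diag(E Lambda^-1 E^T) without forming N x N."""
    N = E.shape[0]
    Lambda = E.T @ E / (N - 1)
    psi = np.einsum("ij,ij->i", E @ np.linalg.inv(Lambda), E)
    return Lambda, psi

def kde_limit(values, alpha=0.99):
    """Control limit: the alpha-quantile of a Gaussian KDE fitted to values."""
    values = np.asarray(values, dtype=float)
    h = 1.06 * values.std(ddof=1) * values.size ** (-0.2)   # assumed rule-of-thumb bandwidth

    def cdf(x):
        # KDE cumulative distribution: average of Gaussian CDFs centred on the data
        return sum(0.5 * (1.0 + math.erf((x - v) / (h * math.sqrt(2.0)))) for v in values) / values.size

    lo, hi = values.min() - 10.0 * h, values.max() + 10.0 * h
    for _ in range(60):                                      # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cdf(mid) < alpha else (lo, mid)
    return 0.5 * (lo + hi)

# Synthetic illustration only; these are not process data.
rng = np.random.default_rng(1)
Z = rng.standard_normal((200, 6))
S_D = Z @ rng.standard_normal((6, 2)) + 0.1 * rng.standard_normal((200, 2))
Theta, E = fit_regression(Z, S_D)
Lambda, psi = psi_statistic(E)
delta = kde_limit(psi, alpha=0.99)
```

The limit β for Q would be obtained by calling the same `kde_limit` on the Q values. Computing ψ row by row avoids building the full N×N matrix $E \Lambda^{-1} E^T$ only to read its diagonal.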
This completes the offline modeling phase. The online dynamic process monitoring phase follows and comprises the steps below.
Step (8): Collect the sample $x_t \in R^{m \times 1}$ at the new sampling instant, and apply to $x_t$ the same standardization as in Step (1) to obtain the corresponding vector $\bar{x}_t$, where t denotes the latest sampling instant.
Step (9): Project $\bar{x}_t$ with $W_1$ and $W_2$ to calculate the autocorrelation latent variable score vector $s_D$ and the static latent variable score vector u, respectively.
Step (10): Apply the same projection transformation to the standardized data vectors from sampling instant t − 1 back to sampling instant t − D + 1 to obtain the corresponding autocorrelation latent variable score vectors $s_\gamma$, where γ = 1, 2, …, D − 1.
Step (11): Calculate the regression error vector e according to the formula $e = s_D - z\Theta$, where $z = [s_1, s_2, \ldots, s_{D-1}]$, then calculate the regression-error monitoring index $e \Lambda^{-1} e^T$ and the static monitoring index $\theta = u u^T$.
Step (12): Check whether both conditions are satisfied: the regression-error monitoring index is no greater than δ, and θ ≤ β. If so, the chemical process is operating normally at the current sampling instant; return to Step (8) to continue monitoring the sample at the next sampling instant. If not, the chemical process has entered an abnormal working state at the current sampling instant; trigger a fault alarm and return to Step (8) to continue monitoring.
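Steps (9) to (12) for a single new sample can be sketched as follows. This is an illustration under assumptions rather than the patent's implementation: the function name `monitor_sample` is hypothetical, the score vectors are treated as row vectors, the window rows are assumed to be ordered from oldest (instant t − D + 1) to newest (instant t), and the oldest sample is assumed to give $s_1$.

```python
import numpy as np

def monitor_sample(window, W1, W2, Theta, Lambda_inv, delta, beta):
    """Evaluate both monitoring indices for the newest sample in a standardized window.

    window: D x m array of standardized samples, oldest first, newest last.
    Returns (psi, theta, alarm).
    """
    scores = window @ W1                   # rows are s_1, ..., s_D (assumed ordering)
    s_D = scores[-1]
    z = scores[:-1].reshape(-1)            # z = [s_1, ..., s_{D-1}] laid end to end
    e = s_D - z @ Theta                    # regression error vector
    psi = float(e @ Lambda_inv @ e)        # e Lambda^-1 e^T
    u = window[-1] @ W2                    # static latent variable scores
    theta = float(u @ u)                   # u u^T
    alarm = not (psi <= delta and theta <= beta)
    return psi, theta, alarm

# Toy numbers chosen so the result can be checked by hand (m = 3, d = 1, D = 3).
W = np.eye(3)
W1, W2 = W[:, :1], W[:, 1:]
Theta = np.zeros((2, 1))
Lambda_inv = np.array([[1.0]])
window = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [2.0, 1.0, 2.0]])
psi, theta, alarm = monitor_sample(window, W1, W2, Theta, Lambda_inv, delta=5.0, beta=6.0)
# psi == 4.0, theta == 5.0, alarm is False
```

In a running monitor the window would be a rolling buffer of the last D standardized samples, so that Step (10) reuses projections of samples already seen.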
Compared with traditional methods, the method of the present invention has the following advantages:
First, the autocorrelation latent variable model of the method aims to mine latent variables with significant autocorrelation, and can explicitly divide the latent features of the sampling data into autocorrelated and static ones. Second, the specific embodiment below verifies the performance of the method against traditional dynamic chemical process monitoring methods. The method of the present invention is therefore a more effective chemical process monitoring method.
Detailed Description
The method of the present invention is described in detail below with reference to the drawings and a specific example.
The invention discloses a chemical process state monitoring method based on an autocorrelation latent variable model. The specific implementation of the method, and its advantages over existing methods, are described below in connection with a specific chemical process object.
Table 1: Monitored variables of the TE process.
| No. | Variable description | No. | Variable description | No. | Variable description |
|---|---|---|---|---|---|
| 1 | Material A flow | 12 | Separator liquid level | 23 | D feed valve position |
| 2 | Material D flow | 13 | Separator pressure | 24 | E feed valve position |
| 3 | Material E flow | 14 | Separator bottom flow | 25 | A feed valve position |
| 4 | Total feed flow | 15 | Stripper level | 26 | A and C feed valve position |
| 5 | Recycle flow | 16 | Stripper pressure | 27 | Compressor recycle valve position |
| 6 | Reactor feed | 17 | Stripper bottom flow | 28 | Purge valve position |
| 7 | Reactor pressure | 18 | Stripper temperature | 29 | Separator liquid phase valve position |
| 8 | Reactor level | 19 | Stripper steam flow | 30 | Stripper liquid phase valve position |
| 9 | Reactor temperature | 20 | Compressor power | 31 | Stripper steam valve position |
| 10 | Purge rate | 21 | Reactor cooling water outlet temperature | 32 | Reactor cooling water flow |
| 11 | Separator temperature | 22 | Separator cooling water outlet temperature | 33 | Condenser cooling water flow |
The application object is the Tennessee Eastman (TE) chemical production process from the United States. The TE process is an actual process flow of an Eastman chemical production plant, and its flow diagram is shown in figure 2. Owing to its complexity, the TE process is now widely used as a standard experimental platform for research on process operating state monitoring. The variables that can be measured continuously in the TE process comprise 22 measured variables and 12 manipulated variables, of which the manipulated variable for the reactor agitation speed is a fixed value. The TE chemical process object can simulate various fault types, such as a step change in material inlet temperature and cooling water faults. To monitor the process, the remaining 33 process variables listed in Table 1 are selected. Because the sampling interval is short, serial autocorrelation in the TE process sampling data is unavoidable. The implementation steps of the invention are described in detail below in connection with the TE process.
First, n = 960 samples collected under the normal working condition of the TE process are used to carry out the offline modeling of the method according to the implementation flow shown in fig. 1, which specifically comprises the following steps:
Step (1): Collect n = 960 samples $x_1, x_2, \ldots, x_{960}$ under the normal operating state of the chemical process to form the training data matrix $X = [x_1, x_2, \ldots, x_{960}]^T \in R^{960 \times 33}$, and standardize X to obtain the matrix $\bar{X}$.
Step (2): After setting D = 4, obtain the 4 time-series matrices $X_1, X_2, X_3, X_4$ in turn according to formula (1) above.
Step (3): Solve for the projection transformation matrix W according to the implementation flow shown in fig. 2; the specific implementation steps are Steps (I) to (IV) above.
Step (4): Determine the number of autocorrelation latent variables as d = 13, and divide the projection transformation matrix W into two parts, $W_1$ and $W_2$, where $W_1$ consists of the first 13 column vectors of W and $W_2$ consists of the last 20 column vectors of W.
Step (5): Calculate the autocorrelation latent variable matrix and the static latent variable matrix by projecting the standardized data with $W_1$ and $W_2$, respectively.
Step (6): Use the least squares regression algorithm to establish a regression model between the input matrix formed from the lagged autocorrelation latent variables and the output matrix of the current autocorrelation latent variables, where E is the regression error matrix and Θ denotes the regression coefficient matrix.
Step (7): Calculate the covariance matrix $\Lambda = E^T E/(N-1)$ of E, then calculate the monitoring index vectors $\psi = \mathrm{diag}\{E \Lambda^{-1} E^T\}$ and Q (the latter from the static latent variable matrix), and use kernel density estimation to determine the control limits of the two monitoring index vectors at the confidence level α = 99%, denoted δ and β, respectively.
This completes the offline modeling phase, and the online dynamic process monitoring phase follows. The fault monitoring performance of the method is tested using 960 test samples of the TE chemical process under the fault condition in which the condenser cooling water valve sticks. The first 160 of the 960 samples were collected under the normal operating state of the TE process, and the TE process enters the fault condition from the 161st sample onward.
The corresponding on-line monitoring implementation flow is shown in fig. 3, and specifically comprises the following steps.
Step (8): Collect the sample $x_t \in R^{m \times 1}$ at the new sampling instant, and apply to $x_t$ the same standardization as in Step (1) to obtain the corresponding vector $\bar{x}_t$, where t denotes the latest sampling instant.
Step (9): Project $\bar{x}_t$ with $W_1$ and $W_2$ to calculate the autocorrelation latent variable score vector $s_D$ and the static latent variable score vector u, respectively.
Step (10): Apply the same projection transformation to the standardized data vectors from sampling instant t − 1 back to sampling instant t − D + 1 to obtain the corresponding autocorrelation latent variable score vectors $s_\gamma$, where γ = 1, 2, …, D − 1.
Step (11): Calculate the regression error vector e according to the formula $e = s_D - z\Theta$, where $z = [s_1, s_2, \ldots, s_{D-1}]$, then calculate the regression-error monitoring index $e \Lambda^{-1} e^T$ and the static monitoring index $\theta = u u^T$.
Step (12): Check whether both conditions are satisfied: the regression-error monitoring index is no greater than δ, and θ ≤ β. If so, the chemical process is operating normally at the current sampling instant; return to Step (8) to continue monitoring the sample at the next sampling instant. If not, the chemical process has entered an abnormal working state at the current sampling instant; trigger a fault alarm and return to Step (8) to continue monitoring.
FIG. 4 compares the monitoring results of the method of the present invention on the fault condition data with those of the conventional dynamic PCA, DLV and dynamic ICA methods. The comparison in FIG. 4 shows that the fault detection success rate of the method of the present invention is clearly higher than that of the other dynamic process monitoring methods. The method of the present invention therefore provides more reliable process monitoring performance.
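For a test set laid out like the one above (normal samples first, fault from a known sample onward), a fault detection success rate and a false alarm rate can be computed as in the sketch below. This is a hedged illustration: the function name `detection_rates` is hypothetical, the definitions of the two rates are the common ones and are assumed rather than taken from the text, and the alarm sequence is synthetic, not a result of the patent.

```python
import numpy as np

def detection_rates(alarms, fault_start):
    """Return (false_alarm_rate, detection_rate) for a boolean alarm sequence.

    alarms[i] is True if sample i raised an alarm; samples before index
    fault_start are normal, samples from fault_start onward are faulty.
    """
    alarms = np.asarray(alarms, dtype=bool)
    false_alarm_rate = float(alarms[:fault_start].mean())
    detection_rate = float(alarms[fault_start:].mean())
    return false_alarm_rate, detection_rate

# Synthetic alarm sequence with the same layout as the test set:
# 960 samples, fault beginning at the 161st sample (index 160).
alarms = np.zeros(960, dtype=bool)
alarms[100] = True          # one spurious alarm during normal operation
alarms[200:800] = True      # alarms raised during part of the fault period
far, fdr = detection_rates(alarms, fault_start=160)
```

Applying the same function to the alarm sequences of each compared method puts them on a common footing, since all methods share the same normal and faulty segments.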
The above embodiments are merely illustrative of specific implementations of the invention and are not intended to limit the invention. Any modification made to the present invention that comes within the spirit of the present invention and the scope of the appended claims falls within the scope of the present invention.