CN111914384B

CN111914384B - Chemical process state monitoring method based on autocorrelation latent variable model

Info

Publication number: CN111914384B
Application number: CN201910873190.0A
Authority: CN
Inventors: 张赫; 葛英辉; 童楚东
Original assignee: Ningbo University
Current assignee: Hefei Jiuzhou Longteng Scientific And Technological Achievement Transformation Co ltd
Priority date: 2019-09-07
Filing date: 2019-09-07
Publication date: 2023-10-24
Anticipated expiration: 2039-09-07
Also published as: CN111914384A

Abstract

The invention discloses a chemical process state monitoring method based on an autocorrelation latent variable model. Using the canonical correlation coefficient as the measure, the method mines hidden autocorrelation latent variables from the sampled data and monitors the operating state of the chemical process on that basis. Compared with traditional methods, first, the autocorrelation latent variable model of the invention is designed to mine latent variables with significant autocorrelation, so it can explicitly divide the latent features of the sampled data into autocorrelated and static parts. Second, the specific embodiment verifies the superiority of the method over traditional dynamic chemical process monitoring methods. The method is therefore a superior chemical process monitoring method.

Description

Chemical process state monitoring method based on autocorrelation latent variable model

Technical Field

The invention relates to a data-driven process monitoring method, in particular to a chemical process state monitoring method based on an autocorrelation latent variable model.

Background

Modern chemical production systems are becoming increasingly complex and large in scale, and they rely more and more on computer technology, advanced instrumentation, and the application of artificial intelligence to production management, monitoring, scheduling, and related problems. Because advanced instruments and storage equipment are widely installed, chemical process plants can measure massive amounts of sampling data online in real time and store them offline. These data contain potentially useful information that characterizes the operating state of the production process, and thus provide a rich data foundation for monitoring the operating state of the chemical process. How fully and effectively the sampling data are used to monitor fault conditions in real time therefore reflects the level of digital and intelligent management of a modern chemical process. In recent decades, both academia and industry have invested considerable manpower and resources in researching data-driven fault detection methods and technologies. Among these, statistical process monitoring is the most widely studied, with principal component analysis (PCA) and independent component analysis (ICA) as its most mainstream implementation techniques. The core of these techniques is to mine the sampled data for underlying useful information or features.

Because of improved computing power and the wide use of advanced measuring instruments, serial autocorrelation is unavoidable in chemical process sampling data, so dynamic process monitoring techniques are more applicable than traditional static ones. In general, both serial autocorrelation and cross-correlation are common characteristics of the sampled data and must be fully considered in data modeling and feature extraction. In the existing literature and patents, dynamic process monitoring is mostly implemented by augmenting each sample with time-lagged measurements, that is, by treating several samples that are consecutive in sampling time as one sample before modeling and monitoring. Typical representatives of this approach are dynamic PCA and dynamic ICA, both of which extract the serial autocorrelation and the cross-correlation simultaneously, mixed together. More recently, researchers have proposed guiding the mining of latent features by maximizing a covariance objective function, without relying on augmented vectors or matrices; methods based on the dynamic latent variable (DLV) model are typical representatives.

However, extracting the serial autocorrelation features of sampled data should take canonical correlation fully into account. Although covariance reflects correlation to some extent, a covariance can also be maximized merely because of collinearity among the data. To fully extract the serial autocorrelation of the sampled data, the canonical correlation coefficient therefore needs to be used as the measure. Mining the serial autocorrelation features is of great significance for monitoring faults in chemical processes, because the negative effects of many faults on the sampled data are reflected along the sampling time sequence. For example, a sticking pipeline valve delays the effect of the manipulated variables, which is a negative effect on the time sequence. Fully mining the canonically autocorrelated latent variables in the time sequence and describing them properly therefore has a positive effect on detecting faults in the operating state of a chemical process.

Disclosure of Invention

The main technical problem to be solved by the invention is how to mine hidden autocorrelation latent variables from the sampled data, using the canonical correlation coefficient as the measure, so that the operating state of the chemical process can be monitored effectively on that basis. Specifically, the method first derives a new autocorrelation latent variable model that, with the goal of maximizing the canonical correlation of the latent variables over the time sequence, optimizes a transformation basis for the time-series sample data and thereby obtains the canonically autocorrelated latent variables. The correlation between the latent variables is then described with a least squares regression algorithm. Finally, the regression error of the autocorrelation latent variables and the static latent variables are each used to monitor the operating state of the chemical process in real time.

The technical scheme adopted for solving the technical problems is as follows: a chemical process state monitoring method based on an autocorrelation latent variable model comprises the following steps:

Step (1): Collect n samples x_1, x_2, …, x_n under the normal operating state of the chemical process to form the training data matrix X = [x_1, x_2, …, x_n]^T ∈ R^{n×m}, and standardize the data in X to obtain the normalized data matrix. Here m is the number of measured variables, R is the set of real numbers, R^{n×m} denotes the set of real matrices of dimension n × m, x_i ∈ R^{m×1} and its normalized counterpart denote the i-th sample and its normalized data vector, i = 1, 2, …, n, and the superscript T denotes the transpose of a matrix or vector.

It should be noted that each sample of a chemical process generally consists of readings from measuring instruments for temperature, pressure, flow, liquid level, and the like. In step (1), a number of measured variables equal to m means that m measuring instruments sample the chemical process in real time.

Furthermore, because the ranges of variation of the measured variables are unlikely to be the same, differences in scale exist among them. The sampled data of each measured variable must therefore be standardized to a mean of 0 and a standard deviation of 1.
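The standardization described above can be sketched as follows. This is a minimal illustration, not the patent's code: the function names `standardize` and `apply_standardization` are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text only requires a mean of 0 and a standard deviation of 1.

```python
import numpy as np

def standardize(X):
    """Scale each column (measured variable) of X to mean 0 and std 1.

    Returns the normalized matrix together with the training mean and
    standard deviation, which are needed again for online samples.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)          # sample standard deviation (assumed)
    std = np.where(std == 0, 1.0, std)   # guard against constant variables
    return (X - mean) / std, mean, std

def apply_standardization(x, mean, std):
    """Normalize a new sample with the statistics of the training data."""
    return (x - mean) / std
```

Keeping `mean` and `std` from the training data matters because the online phase must apply the same normalization as step (1) rather than recompute statistics from new samples.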

Step (2): After setting the autocorrelation order of the time series to D (generally D = 3 or 4 is preferable), construct the D time-series matrices X_1, X_2, …, X_D in turn according to formula (1).

In formula (1), d = 1, 2, …, D and N = n - D + 1.
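A minimal sketch of one way to build the D time-series matrices. It assumes that X_d holds rows d through d + N - 1 of the normalized training matrix, so that every block has N = n - D + 1 rows and X_D ends with the most recent sample; this layout is an assumption consistent with the surrounding definitions, and the name `build_lagged_blocks` is hypothetical.

```python
import numpy as np

def build_lagged_blocks(X_norm, D):
    """Split normalized data (n x m) into D overlapping blocks of N rows.

    Assumed layout: block d (1-based) contains samples d, d+1, ..., d+N-1,
    so consecutive blocks are shifted by one sampling instant.
    """
    n = X_norm.shape[0]
    N = n - D + 1
    return [X_norm[d:d + N] for d in range(D)]
```

With n = 960 and D = 4, for example, each block would have N = 957 rows.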

The next task is to separate the autocorrelation latent variables from the training data, which requires the autocorrelation latent variable model algorithm of the invention. The autocorrelation latent variable model is a new modeling algorithm whose goal is to find a projection transformation basis W that converts the normalized data into a latent variable matrix S such that the time-series autocorrelation of each column vector of S is maximized. The corresponding objective function is given by formula (2).

In formula (2), k = 1, 2, …, D, λ = 1, 2, …, D, I denotes the identity matrix, the operator applied to a matrix computes the sum of squares of its elements, s.t. is the abbreviation of "subject to" and introduces the constraint, and H_kλ is defined as follows: H_kλ = 1 if k ≠ λ, and H_kλ = 0 if k = λ.

Next, let W be expressed in terms of a new transformation basis U. The optimization problem defined in formula (2) can then be converted into the form of formula (4):

In formula (4), C_kλ = X_k^T X_λ. In this way, the problem of optimizing the projection transformation basis W in formula (2) becomes the problem of optimizing the orthogonal transformation basis U in formula (4).

As formula (2) shows, the objective function squares the canonical correlations of the latent-variable time series before accumulating them. This prevents negative canonical correlations from adversely affecting the maximization of the objective function. Furthermore, because formula (2) aims to maximize the canonical correlation of the time series of each latent variable, the transformed latent variables are autocorrelated, which is why the method of the invention calls it an autocorrelation latent variable model.

Taking into account an identity involving the trace of an arbitrary real matrix A, where tr() denotes the trace of a matrix (equal to the sum of its diagonal elements, or equivalently to the sum of all its eigenvalues), the objective function in formula (4) can be equivalently transformed as follows:

In the above formula, Φ_U is a matrix that depends on U. Because Φ_U is symmetric, the optimal solution U of formula (4) consists of the eigenvectors of Φ_U. However, Φ_U is itself coupled with the solution for U, so the following iterative procedure is designed.
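A standard identity of this kind, given here as an assumption rather than as the patent's exact formula, links the sum of squared elements of a real matrix A with entries a_ij to a trace:

```latex
\sum_{i}\sum_{j} a_{ij}^{2}
  \;=\; \operatorname{tr}\!\left(A^{T}A\right)
  \;=\; \operatorname{tr}\!\left(AA^{T}\right)
```

An identity of this form turns a sum-of-squares objective into a trace expression, which is the form in which a symmetric matrix and its eigenvectors naturally appear.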

Step (I): Initialize U as an arbitrary m × m random real matrix.

Step (II): Compute the matrix Φ_U and solve the eigenvalue problem Φ_U μ = ημ for the eigenvectors μ_1, μ_2, …, μ_m corresponding to all eigenvalues. Normalize each eigenvector to unit length, arrange the eigenvectors in descending order of eigenvalue, that is, η_1 ≥ η_2 ≥ … ≥ η_m, and then update the matrix U = [μ_1, μ_2, …, μ_m].

Step (III): If U has converged, go to step (IV); otherwise, return to step (II).

Step (IV): Calculate the projection transformation basis W from the converged U according to the formula relating W and U.

Step (3): Solve for the projection transformation matrix W according to steps (I) to (IV) above.
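The loop structure of steps (I) to (III) can be sketched as below. Because the expression for Φ_U is not reproduced in this text, the sketch takes it as a user-supplied callable `phi_of_U` (a hypothetical placeholder); the convergence test, tolerance, and sign alignment of eigenvectors are likewise assumptions of this illustration, not details from the patent.

```python
import numpy as np

def solve_orthogonal_basis(phi_of_U, m, tol=1e-8, max_iter=200, seed=0):
    """Fixed-point eigenvector iteration for the orthogonal basis U.

    phi_of_U: callable returning the symmetric m x m matrix Phi_U for a
              given U (placeholder for the patent's definition).
    """
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((m, m))            # Step (I): random start
    for _ in range(max_iter):
        Phi = phi_of_U(U)                      # Step (II): build Phi_U
        Phi = 0.5 * (Phi + Phi.T)              # enforce numerical symmetry
        eta, vecs = np.linalg.eigh(Phi)        # unit-length eigenvectors
        U_new = vecs[:, np.argsort(eta)[::-1]] # descending eigenvalue order
        # Eigenvectors are defined up to sign; align with the previous U
        # so that a sign flip is not mistaken for non-convergence.
        signs = np.sign(np.sum(U_new * U, axis=0))
        signs[signs == 0] = 1.0
        U_new = U_new * signs
        if np.linalg.norm(U_new - U) < tol:    # Step (III): converged?
            return U_new
        U = U_new
    return U
```

Step (IV), the mapping from the converged U back to W, is left out because it depends on the formula relating W and U.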

Step (4): Determine the number d of autocorrelation latent variables, and divide the projection transformation matrix W accordingly into two parts, W_1 and W_2, where W_1 consists of the first d columns of W and W_2 consists of the last m - d columns of W.

When determining the number of autocorrelation latent variables, no latent variable whose time series has a significant canonical correlation may be missed. The method of the invention therefore determines the number d objectively with the following procedure.

Step (I): After calculating the time-series score matrices S_1, S_2, …, S_D according to S_k = X_k W, initialize j = 1 and d = 0, and let S_1(j), S_2(j), …, S_D(j) denote the j-th column vectors of S_1, S_2, …, S_D, respectively.

Step (II): Calculate the canonical correlation magnitudes J_kλ between S_1(j), S_2(j), …, S_D(j) according to J_kλ = |S_k(j)^T S_λ(j)| H_kλ, and denote the maximum of J_kλ as J_max, where k = 1, 2, …, D and λ = 1, 2, …, D.

Step (III): Set a cut-off parameter. If J_max exceeds the cut-off parameter, set d = d + 1 and j = j + 1 and return to step (II); otherwise, the number of autocorrelation latent variables is d.

Step (5): Calculate the autocorrelation latent variable matrix and the static latent variable matrix according to their respective formulas.

Step (6): Use the least squares regression algorithm to establish the regression model between the input matrix and the output matrix, where E is the regression error matrix and Θ is the regression coefficient matrix.
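A minimal least squares sketch of such a regression. It assumes, consistently with the online formula e = s_D - zΘ where z = [s_1, s_2, …, s_{D-1}], that the input matrix stacks the autocorrelation score blocks 1 to D - 1 side by side and that the output is block D; the name `fit_latent_regression` is hypothetical.

```python
import numpy as np

def fit_latent_regression(S_list):
    """Regress the latest score block on the earlier ones.

    S_list: list of D autocorrelation score matrices, each (N, d).
    Returns the coefficient matrix Theta and the error matrix E.
    """
    Z = np.hstack(S_list[:-1])                     # input:  (N, (D-1)*d)
    Y = S_list[-1]                                 # output: (N, d)
    Theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # least squares solution
    E = Y - Z @ Theta                              # regression error matrix
    return Theta, E
```

Using `np.linalg.lstsq` instead of forming the normal equations explicitly is a numerical design choice that tolerates an ill-conditioned input matrix.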

Step (7): Calculate the covariance matrix Λ = E^T E/(N - 1) of E, then calculate the monitoring index vectors ψ and Q, with ψ = diag{E Λ^{-1} E^T}. Use kernel density estimation (KDE) to determine the value of each monitoring index vector at the confidence limit α = 99%, denoted δ and β respectively, where diag{} denotes the operation of converting the diagonal elements of a matrix into a column vector.
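The ψ index and a KDE-based limit can be sketched as below. The ψ computation follows ψ = diag{E Λ^{-1} E^T} from the text. The limit estimation is an illustrative assumption: it uses a Gaussian kernel with a normal-reference rule-of-thumb bandwidth and finds the α quantile of the estimated distribution by bisection, since the kernel and bandwidth are not specified here. The function names are hypothetical.

```python
import math
import numpy as np

def psi_statistic(E):
    """Return psi = diag{E Λ^{-1} E^T} and Λ = E^T E / (N - 1)."""
    N = E.shape[0]
    Lam = E.T @ E / (N - 1)
    # Row-wise quadratic forms give the diagonal without an N x N matrix.
    psi = np.einsum("ij,ij->i", E @ np.linalg.inv(Lam), E)
    return psi, Lam

def kde_control_limit(values, alpha=0.99):
    """Estimate the alpha quantile of a Gaussian KDE fitted to values."""
    v = np.asarray(values, dtype=float)
    h = 1.06 * v.std(ddof=1) * v.size ** (-0.2)   # assumed bandwidth rule

    def cdf(x):
        # CDF of the KDE: average of normal CDFs centred on each sample.
        return float(np.mean([0.5 * (1.0 + math.erf((x - vi) / (h * math.sqrt(2.0))))
                              for vi in v]))

    lo, hi = v.min() - 10.0 * h, v.max() + 10.0 * h
    for _ in range(60):                            # bisection on the CDF
        mid = 0.5 * (lo + hi)
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The same `kde_control_limit` function could be applied to the Q index vector to obtain β.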

This completes the offline modeling phase. The online dynamic process monitoring phase follows and comprises the steps below.

Step (8): Collect the sample x_t ∈ R^{m×1} at the new sampling instant, and apply to x_t the same normalization as in step (1) to obtain the corresponding normalized vector, where t denotes the latest sampling instant.

Step (9): Calculate the autocorrelation latent variable score vector s_D and the static latent variable score vector u according to their respective formulas.

Step (10): Apply the projection transformation to the normalized data vectors from sampling instant t - 1 back to sampling instant t - D + 1 to obtain the corresponding autocorrelation latent variable score vectors s_γ, where γ = 1, 2, …, D - 1.

Step (11): Calculate the regression error vector e according to e = s_D - zΘ, where z = [s_1, s_2, …, s_{D-1}], and then calculate the two monitoring indices: the index of the regression error e, according to its formula, and θ = uu^T.

Step (12): Judge whether both conditions are satisfied: the regression error index is no greater than δ, and θ ≤ β. If so, the chemical process is operating normally at the current sampling instant; return to step (8) to continue monitoring the sample at the next instant. If not, the chemical process has entered an abnormal working state at the current sampling instant; trigger a fault alarm and return to step (8) to continue monitoring.
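Steps (9) to (12) can be sketched for a single sampling instant as follows. Several details are assumptions of this illustration because the corresponding formulas are not reproduced here: scores are treated as row vectors computed as x^T W_1 and x^T W_2, the regression error index is taken as e Λ^{-1} e^T by analogy with ψ = diag{E Λ^{-1} E^T}, and s_1 is taken to be the oldest of the D - 1 earlier instants. The name `monitor_sample` is hypothetical.

```python
import numpy as np

def monitor_sample(x_hist, W1, W2, Theta, Lam_inv, delta, beta):
    """Evaluate both monitoring indices at the newest sampling instant.

    x_hist: (D, m) normalized samples, oldest first; last row is time t.
    Returns (psi_t, theta, alarm).
    """
    scores = x_hist @ W1              # autocorrelation scores, one row per instant
    s_D = scores[-1]                  # score at the current instant t
    z = scores[:-1].reshape(-1)       # z = [s_1, ..., s_{D-1}] as one row
    e = s_D - z @ Theta               # regression error vector
    psi_t = float(e @ Lam_inv @ e)    # assumed form of the error index
    u = x_hist[-1] @ W2               # static latent variable scores
    theta = float(u @ u)              # theta = u u^T for a row vector u
    alarm = (psi_t > delta) or (theta > beta)
    return psi_t, theta, alarm
```

In a running system the caller would keep a rolling buffer of the last D normalized samples and call this function once per new sample.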

Compared with traditional methods, the method of the invention has the following advantages:

First, the autocorrelation latent variable model of the invention is designed to mine latent variables with significant autocorrelation, so it can explicitly divide the latent features of the sampled data into autocorrelated and static parts. Second, the specific embodiment below verifies the superiority of the method over traditional dynamic chemical process monitoring methods. The method is therefore a superior chemical process monitoring method.

Drawings

FIG. 1 is a flow chart of an implementation of the off-line modeling phase of the method of the present invention.

FIG. 2 is a flow chart of an implementation of the autocorrelation latent variable model algorithm involved in the method of the present invention.

FIG. 3 is a flow chart of the method of the present invention for on-line monitoring.

FIG. 4 compares the monitoring details for the sticking fault of the TE process condenser cooling water valve.

Detailed Description

The method of the invention is described in detail below with reference to the drawings and a specific embodiment.

The invention discloses a chemical process state monitoring method based on an autocorrelation latent variable model. Its implementation and its superiority over existing methods are described below using a specific chemical process.

Table 1: Monitored variables of the TE process.

No.	Variable description	No.	Variable description	No.	Variable description
1	Material A flow	12	Separator level	23	D feed valve position
2	Material D flow	13	Separator pressure	24	E feed valve position
3	Material E flow	14	Separator bottom flow	25	A feed valve position
4	Total feed flow	15	Stripper level	26	A and C feed valve position
5	Recycle flow	16	Stripper pressure	27	Compressor recycle valve position
6	Reactor feed	17	Stripper bottom flow	28	Purge valve position
7	Reactor pressure	18	Stripper temperature	29	Separator liquid phase valve position
8	Reactor level	19	Stripper overhead steam	30	Stripper liquid phase valve position
9	Reactor temperature	20	Compressor power	31	Stripper steam valve position
10	Purge rate	21	Reactor cooling water outlet temperature	32	Reactor condensate flow
11	Separator temperature	22	Separator cooling water outlet temperature	33	Condenser cooling water flow

The application object is the Tennessee Eastman (TE) chemical production process from the United States, which is modeled on an actual process of an Eastman chemical production plant; its flow diagram is shown in figure 2. Because of its complexity, the TE process is widely used as a standard experimental platform for research on process state monitoring. The variables that can be measured continuously in the TE process comprise 22 measured variables and 12 manipulated variables, of which the reactor agitation speed is held at a fixed value. The TE process can simulate a variety of fault types, such as step changes in feed temperature and cooling water faults. To monitor the process, the 33 process variables listed in Table 1 are selected. Because the sampling interval is short, serial autocorrelation is unavoidable in TE process data. The implementation steps of the invention are described in detail below in connection with the TE process.

First, offline modeling is carried out according to the flow shown in FIG. 1, using n = 960 samples collected under the normal operating condition of the TE process. The steps are as follows:

Step (1): Collect n = 960 samples x_1, x_2, …, x_960 under the normal operating state of the chemical process to form the training data matrix X = [x_1, x_2, …, x_960]^T ∈ R^{960×33}, and standardize X to obtain the normalized data matrix.

Step (2): After setting D = 4, obtain the 4 sub-block matrices X_1, X_2, X_3, X_4 in turn according to formula (1) above.

Step (3): Solve for the projection transformation matrix W according to the flow shown in FIG. 2; the specific steps are steps (I) to (IV) above.

Step (4): Determine the number of autocorrelation latent variables as d = 13, and divide the projection transformation matrix W into two parts, W_1 and W_2, where W_1 consists of the first 13 columns of W and W_2 consists of the last 20 columns of W.

Step (6): Use the least squares regression algorithm to establish the regression model between the input matrix and the output matrix, where E is the regression error matrix and Θ is the regression coefficient matrix.

Step (7): Calculate the covariance matrix Λ = E^T E/(N - 1) of E, then calculate the monitoring index vectors ψ and Q, with ψ = diag{E Λ^{-1} E^T}. Use kernel density estimation to determine the value of each monitoring index vector at the confidence limit α = 99%, denoted δ and β respectively.

This completes the offline modeling phase; the online dynamic process monitoring phase follows. The fault monitoring performance of the method is tested with 960 test samples of the TE process under the condenser cooling water valve sticking fault. The first 160 of these samples were collected while the TE process was operating normally; the process enters the fault condition from the 161st sample onward.

The corresponding on-line monitoring implementation flow is shown in fig. 3, and specifically comprises the following steps.

Step (10): Apply the projection transformation to the normalized data vectors from sampling instant t - 1 back to sampling instant t - D + 1 to obtain the corresponding autocorrelation latent variable score vectors s_γ, where γ = 1, 2, …, D - 1;

Step (11): Calculate the regression error vector e according to e = s_D - zΘ, where z = [s_1, s_2, …, s_{D-1}], and then calculate the two monitoring indices: the index of the regression error e, according to its formula, and θ = uu^T;

FIG. 4 compares the monitoring results of the method of the invention on the fault data with those of the conventional dynamic PCA, DLV, and dynamic ICA methods. The comparison in FIG. 4 shows that the fault detection success rate of the method of the invention is clearly higher than that of the other dynamic process monitoring methods. The method of the invention therefore offers more reliable process monitoring performance.

The above embodiments are merely illustrative of specific implementations of the invention and are not intended to limit the invention. Any modification made to the present invention that comes within the spirit of the present invention and the scope of the appended claims falls within the scope of the present invention.

Claims

1. The chemical process state monitoring method based on the autocorrelation latent variable model is characterized by comprising the following steps of:

Step (1): Collect n samples x_1, x_2, …, x_n under the normal operating state of the chemical process to form the training data matrix X = [x_1, x_2, …, x_n]^T ∈ R^{n×m}, and standardize the data in X to obtain the normalized data matrix, where m is the number of measured variables, R is the set of real numbers, R^{n×m} denotes the set of real matrices of dimension n × m, x_i ∈ R^{m×1} and its normalized counterpart denote the i-th sample and its normalized data vector, i = 1, 2, …, n, and the superscript T denotes the transpose of a matrix or vector;

Step (2): After setting the autocorrelation order D, obtain the D time-series sub-block matrices X_1, X_2, …, X_D in turn according to formula (1);

In formula (1), d = 1, 2, …, D and N = n - D + 1;

Step (3): Solve for the projection transformation basis W ∈ R^{m×m} according to the following steps (I) to (IV);

Step (I): Initialize U as an arbitrary m × m random real matrix;

Step (II): Compute the matrix Φ_U and solve the eigenvalue problem Φ_U μ = ημ for the eigenvectors μ_1, μ_2, …, μ_m corresponding to all eigenvalues; ensure that each eigenvector has unit length, arrange the eigenvectors in descending order of eigenvalue, and then update the matrix U = [μ_1, μ_2, …, μ_m], where k = 1, 2, …, D, λ = 1, 2, …, D, C_kλ = X_k^T X_λ, and H_kλ takes the following values: H_kλ = 1 if k ≠ λ, and H_kλ = 0 if k = λ;

Step (III): If U has converged, go to step (IV); otherwise, return to step (II);

Step (IV): Calculate the projection transformation basis W from the converged U according to the formula relating W and U;

Step (4): Determine the number d of autocorrelation latent variables, and divide the projection transformation matrix W accordingly into two parts, W_1 and W_2, where W_1 consists of the first d columns of W and W_2 consists of the last m - d columns of W;

Step (6): Use the least squares regression algorithm to establish the regression model between the input matrix and the output matrix, where E is the regression error matrix and Θ is the regression coefficient matrix;

Step (7): Calculate the covariance matrix Λ = E^T E/(N - 1) of E, then calculate the monitoring index vectors ψ and Q, with ψ = diag{E Λ^{-1} E^T}, and use kernel density estimation to determine the value of each monitoring index vector at the confidence limit α = 99%, denoted δ and β respectively, where diag{} denotes the operation of converting the diagonal elements of a matrix into a column vector;

This completes the offline modeling stage; the online monitoring stage follows and comprises the following steps;

Step (8): Collect the sample x_t ∈ R^{m×1} at the new sampling instant, and apply to x_t the same normalization as in step (1) to obtain the corresponding normalized vector, where t denotes the latest sampling instant;

Step (9): Calculate the autocorrelation latent variable score vector s_D and the static latent variable score vector u according to their respective formulas;

Step (10): Apply the projection transformation to the normalized data vectors from sampling instant t - 1 back to sampling instant t - D + 1 to obtain the corresponding autocorrelation latent variable score vectors s_γ, where γ = 1, 2, …, D - 1;

Step (11): Calculate the regression error vector e according to e = s_D - zΘ, where z = [s_1, s_2, …, s_{D-1}], and then calculate the two monitoring indices: the index of the regression error e, according to its formula, and θ = uu^T;

2. The chemical process state monitoring method based on an autocorrelation latent variable model as set forth in claim 1, wherein the derivation for solving the projection transformation matrix W in step (3) is as follows:

First, the objective function is determined as formula (2):

In formula (2), k = 1, 2, …, D, λ = 1, 2, …, D, I denotes the identity matrix, the operator applied to a matrix computes the sum of squares of its elements, s.t. is the abbreviation of "subject to" and introduces the constraint, argmax denotes maximization of the objective function, and H_kλ is defined as follows: H_kλ = 1 if k ≠ λ, and H_kλ = 0 if k = λ;

Second, let W be expressed in terms of a new transformation basis U; the optimization problem defined in formula (2) can then be converted into the form of formula (4):

In this way, the problem of optimizing the projection transformation basis W in formula (2) becomes the problem of optimizing the orthogonal transformation basis U in formula (4);

Then, by an identity involving the trace of an arbitrary real matrix A, where tr() denotes the trace of a matrix and equals the sum of its eigenvalues, the objective function in formula (4) can be equivalently transformed as follows:

In the above formula, Φ_U is a matrix that depends on U;

Finally, because Φ_U is a symmetric matrix, the optimal solution U of formula (4) consists of the eigenvectors of Φ_U.

3. The chemical process state monitoring method based on an autocorrelation latent variable model as set forth in claim 1, wherein the number d of autocorrelation latent variables in step (4) is determined as follows:

Step (I): After calculating the time-series score matrices S_1, S_2, …, S_D according to S_k = X_k W, initialize j = 1 and d = 0, and let S_1(j), S_2(j), …, S_D(j) denote the j-th column vectors of S_1, S_2, …, S_D, respectively;

Step (II): Calculate the canonical correlation magnitudes J_kλ between S_1(j), S_2(j), …, S_D(j) according to J_kλ = |S_k(j)^T S_λ(j)| H_kλ, and denote the maximum of J_kλ as J_max, where k = 1, 2, …, D and λ = 1, 2, …, D;

Step (III): Set a cut-off parameter; if J_max exceeds the cut-off parameter, set d = d + 1 and j = j + 1 and return to step (II); otherwise, the number of autocorrelation latent variables is d.