CN112180996A - Liquid level fault-tolerant control method based on reinforcement learning - Google Patents
- Publication number
- CN112180996A (application CN202010947314.8A)
- Authority
- CN
- China
- Prior art keywords
- liquid level
- fault
- output
- weight
- network
- Prior art date
- Legal status
- Pending
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D9/00—Level control, e.g. controlling quantity of material stored in vessel
- G05D9/12—Level control, e.g. controlling quantity of material stored in vessel characterised by the use of electric means
Abstract
A liquid level fault-tolerant control system based on reinforcement learning is used for fault-tolerant control of a multi-water-tank system. The only precondition is that a fault has been detected; no further diagnosis of the fault is required. This precondition is easy to satisfy in fault detection and diagnosis, for which many mature methods such as PCA and Bayesian decision making already exist. In addition, the evaluation-action structure is realized mainly with artificial neural networks, whose good robustness effectively suppresses the influence of noise. The invention can control directly from the acquired data without any training samples, so that the container liquid level reaches the same index as in the fault-free condition. The control quantity obtained by the method is the optimal control quantity under the fault, i.e. the best performance index the system can still attain when it fails.
Description
Technical Field
The invention relates to a liquid level fault-tolerant control method. In particular to a liquid level fault-tolerant control method based on reinforcement learning.
Background
In industrial and agricultural production, the liquid level of a container often needs to be controlled, for example the automatic control of the water stored in tanks, pools, cisterns and boilers, and many mature products directly control the water level of a single container. In industrial production (e.g. crystallizer liquid level), however, one often faces several containers communicating through valves, where the set heights of the different containers are maintained by regulating the valve openings so that the liquid-phase reaction in the containers proceeds efficiently. The liquid level nevertheless often deviates from its set value because of detection-signal deviation caused by reduced sensor precision, degraded valve-controller performance, or in-tank liquid leakage caused by seal failure, which lowers the liquid-phase reaction efficiency; the commonly adopted remedy is fault-tolerant control.
The multi-container connection enables the height of each container to be kept at a set position through the opening adjustment of the connecting valve, but the liquid level is often deviated from an original set value due to the detection signal deviation caused by the reduction of the precision of the sensor, the performance reduction of a valve controller and the liquid leakage in the tank caused by the sealing failure.
Traditional data-driven artificial intelligence methods require training on sample data in advance, but owing to the uncertainty of when a fault occurs and the randomness of fault types, it is difficult to obtain enough effective fault data as training samples.
Disclosure of Invention
The invention aims to provide a liquid level fault-tolerant control method based on reinforcement learning which, by adjusting the flow, can keep the liquid level of each container at its fault-free height even under fault conditions.
The technical scheme adopted by the invention is as follows. A liquid level fault-tolerant control system based on reinforcement learning, used for a multi-tank system, comprises: an information acquisition unit that acquires the liquid level information of each water tank at different moments; a fault-free model that predicts the liquid level information of all the water tanks at time k+1 from the liquid level information of all the water tanks at time k and the frequency-converter control information output by the information acquisition unit; an evaluation network that estimates the total values V(k) and V(k+1) of the frequency-converter control variable at times k and k+1 from the liquid level information of all the water tanks at times k and k+1 output by the information acquisition unit; a stage value evaluation unit that evaluates the stage value R(k) from the measured liquid level information at time k+1 and the liquid level information at time k+1 predicted by the fault-free model; a deviation estimation unit that outputs a fitness function for weight updating from the stage value output by the stage value evaluation unit and the total values V(k) and V(k+1) output by the evaluation network; and a weight updating unit that updates the weights of the evaluation network according to this fitness function. From all the updated weights output by the weight updating unit, the evaluation network outputs the weights related to the frequency-converter control quantity u(k), and an action network iteratively updates, from these weights and the liquid level information of all the water tanks at time k output by the information acquisition unit, the optimal control variable with which to control the frequency converters of the multi-water-tank system.
The liquid level fault-tolerant control method based on reinforcement learning has the following advantages:
1. the method of the invention does not need to diagnose and position the fault type and part in advance, and directly adopts a data driving method to carry out fault-tolerant control on the liquid level of the container.
2. The method of the invention resolves the contradiction between the traditional artificial intelligence methods' need for sufficient training samples and the difficulty of obtaining sample data from a real system: it can control directly from the acquired data without training samples, so that the container liquid level reaches the same index as in the fault-free condition.
3. The control quantity obtained by the method is the optimal control quantity under the fault, i.e. the best performance index the system can still attain when it fails.
Drawings
FIG. 1 is a schematic diagram of a control structure of a liquid level fault-tolerant control method based on reinforcement learning according to the present invention;
FIG. 2 is a schematic diagram of an evaluation neural network in accordance with the present invention;
FIG. 3 is a schematic diagram of an acting neural network according to the present invention;
FIG. 4 is a schematic structural diagram of the three-tank system according to an embodiment of the present invention;
FIG. 5 is a fluid level diagram of T3 in an actuator output deviation fault scenario in accordance with an embodiment of the present invention;
FIG. 6 is a diagram illustrating the evolution of various states in an actuator output bias fault scenario in accordance with an embodiment of the present invention;
FIG. 7 is a control variable plot in an actuator output deviation fault scenario in accordance with an embodiment of the present invention;
FIG. 8 is a liquid level diagram of T3 in an actuator stuck fault scenario in accordance with an embodiment of the present invention;
FIG. 9 is a diagram illustrating the evolution of each state in a stuck-at fault scenario of an actuator according to an embodiment of the present invention;
FIG. 10 is a control variable diagram in an actuator stuck fault scenario in accordance with an embodiment of the present invention;
FIG. 11 is a liquid level diagram of T3 when, in a stuck-type fault, the opening of submersible pump 1 is reduced to 30% according to an embodiment of the present invention;
FIG. 12 is a diagram of the state evolution when, in a stuck-type fault, the opening of submersible pump 1 is reduced to 30% according to an embodiment of the present invention;
FIG. 13 is a control variable diagram when, in a stuck-type fault, the opening of submersible pump 1 is reduced to 30% according to an embodiment of the present invention;
FIG. 14 is a liquid level diagram of T3 in a leak fault scenario according to an embodiment of the present invention;
FIG. 15 is a diagram illustrating the evolution of various states in a leakage failure scenario in accordance with an embodiment of the present invention;
FIG. 16 is a graph of control variables in a leakage fault scenario in accordance with an embodiment of the present invention.
Detailed Description
The liquid level fault-tolerant control method based on reinforcement learning of the invention is described in detail below with reference to the embodiments and the accompanying drawings.
As shown in fig. 1, the liquid level fault-tolerant control system based on reinforcement learning of the present invention is a fault-tolerant control system for a multi-tank system and includes: an information acquisition unit 1 that acquires the liquid level information of each water tank at different moments; a fault-free model 3 that predicts the liquid level information of all the water tanks at time k+1 from the liquid level information of all the water tanks at time k and the frequency-converter control information output by the information acquisition unit 1; an evaluation network 2 that estimates the total values V(k) and V(k+1) of the frequency-converter control variable at times k and k+1 from the liquid level information of all the water tanks at times k and k+1 output by the information acquisition unit 1; a stage value evaluation unit 4 that evaluates the stage value R(k) from the liquid level information of all the water tanks at time k+1 output by the information acquisition unit 1 and the liquid level information at time k+1 predicted by the fault-free model 3; a deviation estimation unit 5 that outputs a fitness function for weight updating from the stage value received from the stage value evaluation unit 4 and the total values V(k) and V(k+1) output by the evaluation network 2; and a weight updating unit 6 that updates the weights of the evaluation network 2 according to the fitness function output by the deviation estimation unit 5. From all the updated weights output by the weight updating unit 6, the evaluation network 2 outputs the weights related to the frequency-converter control quantity u(k), and the action network 7 iteratively updates, from these weights and the liquid level information of all the water tanks at time k output by the information acquisition unit 1, the optimal control variable with which to control the frequency converters of the multi-water-tank system. Wherein,
1) the liquid level information of all the water tanks at time k output by the information acquisition unit 1 is represented as x(k), and the liquid level information of all the water tanks at time k+1 as x(k+1).
2) The fault-free model 3 is represented as follows:
…
in the formula, x1, x2, x3 and xn are the liquid levels of the water tanks T1, T2, T3 and Tn; S1, S2, S3 and Sn are the cross-sectional areas of the water tanks T1, T2, T3 and Tn; g is the gravitational acceleration; in the parameter formulas, R12 is the flow resistance between tank 1 and tank 2, R32 the flow resistance between tank 3 and tank 2, R43 the flow resistance between tank 4 and tank 3, and Rn-1,n the flow resistance between tank n-1 and tank n; Rn is the drainage resistance of the water tank Tn; ρ is the liquid density; Q1 and Q2 are the flow rates of the submersible pump 1 and the submersible pump 2.
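The model equations themselves appear as images in the source and are not reproduced above. Purely as an illustration of the kind of dynamics described, the following sketch simulates a Torricelli-type three-tank mass balance (levels drive inter-tank flow) with hypothetical tank and pipe parameters; it is not the patent's exact model.

```python
import math

def three_tank_step(x, u, dt=1.0, S=0.0154, Sp=5e-5, g=9.81):
    # x = [x1, x2, x3]: tank levels (m); u = [Q1, Q2]: pump flows (m^3/s)
    # S: tank cross-section, Sp: pipe cross-section (hypothetical values)
    def q(ha, hb):
        # Torricelli-type flow between two connected tanks
        d = ha - hb
        return math.copysign(Sp * math.sqrt(2 * g * abs(d)), d)
    x1, x2, x3 = x
    q13 = q(x1, x3)                             # tank 1 -> tank 3
    q32 = q(x3, x2)                             # tank 3 -> tank 2
    q20 = Sp * math.sqrt(2 * g * max(x2, 0.0))  # drain of tank 2
    dx1 = (u[0] - q13) / S
    dx2 = (u[1] + q32 - q20) / S
    dx3 = (q13 - q32) / S
    return [x1 + dt * dx1, x2 + dt * dx2, x3 + dt * dx3]
```

With equal levels and no pumping, only the drain of tank 2 changes the state, which matches the intuition that inter-tank flow is driven by level differences.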
3) The evaluation network 2 is shown in fig. 2 and includes an input layer, a hidden layer and an output layer which are all connected in sequence, wherein the input layer has n +2 neurons, the hidden layer has 2n neurons, and the output layer has 1 neuron.
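A minimal sketch of such an evaluation (critic) network with n+2 inputs, 2n hidden neurons and 1 output, matching the layer sizes stated above; the sigmoid hidden nonlinearity and the random initial weight range are illustrative assumptions, since the patent does not fix them here.

```python
import math
import random

def make_critic(n, seed=0):
    # Weights Wc1 (input->hidden, shape 2n x (n+2)) and Wc2 (hidden->output)
    rnd = random.Random(seed)
    Wc1 = [[rnd.uniform(-0.5, 0.5) for _ in range(n + 2)] for _ in range(2 * n)]
    Wc2 = [rnd.uniform(-0.5, 0.5) for _ in range(2 * n)]
    return Wc1, Wc2

def critic_value(Wc1, Wc2, X):
    # X combines the n tank levels x(k) with the control input u(k);
    # the scalar output plays the role of the total value V(k)
    sig = lambda s: 1.0 / (1.0 + math.exp(-s))
    hidden = [sig(sum(w * xi for w, xi in zip(row, X))) for row in Wc1]
    return sum(w * h for w, h in zip(Wc2, hidden))
```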
4) The stage value evaluation unit 4 is composed of the following formula:
wherein R(k) is the stage value; x(k+1) is the liquid level information of all the water tanks at time k+1; xr(k+1) is the liquid level information of all the water tanks at time k+1 predicted by the fault-free model 3.
5) The deviation estimating unit 5 is composed of the following formula:
TE=V(k)-R(k)+γV(k+1)
wherein TE is the deviation; V(k) and V(k+1) are the total values of the frequency-converter control variable at times k and k+1 respectively; R(k) is the stage value; γ is the discount factor.
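The deviation TE follows directly from this formula. The stage value formula itself appears only as an image in the source, so the squared deviation between measured and model-predicted levels used in `stage_value` below is an assumption, not the patent's exact definition:

```python
def stage_value(x_k1, xr_k1):
    # Assumed R(k): squared deviation between measured levels x(k+1)
    # and the fault-free model's prediction xr(k+1)
    return sum((a - b) ** 2 for a, b in zip(x_k1, xr_k1))

def td_error(V_k, V_k1, R_k, gamma=0.95):
    # TE = V(k) - R(k) + gamma * V(k+1), per the deviation estimation unit
    return V_k - R_k + gamma * V_k1
```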
6) The weight updating unit 6 includes:
(1) represent the weights Wc1 between the input layer and the hidden layer of the evaluation network 2 and the weights Wc2 between the hidden layer and the output layer by corresponding particle positions, and randomly select the initial particle values;
(2) the fitness function for each particle is calculated according to the following formula:
wherein FF(z(k)) is the fitness function of the i-th particle at the p-th iteration; V(k) and V(k+1) are the total values of the frequency-converter control variable at times k and k+1 respectively; R(k) is the stage value; γ is the discount factor; X(k) is the combination of the liquid level information x(k) of all the water tanks at time k and the frequency-converter control information u(k);
(3) obtain from the fitness values and the following formula the current best position pbest of each particle and the best position gbest experienced by the whole swarm, and update pbest and gbest:
wherein i is the particle index, m is the number of particles and p is the iteration number;
(4) update the particle velocity vi and the particle position zi according to the basic iterative formula of particle swarm optimization:
wherein z is the particle position, v the particle velocity, ω the inertia weight, c1 and c2 the acceleration constants, rand1 and rand2 two independently generated random numbers in [0,1], pbest the current best position of the particle, gbest the best position experienced by the whole swarm, and p the iteration number;
(5) repeat steps (2) to (4) until convergence, and record the current swarm best position gbest1;
(6) redistribute the particles with random numbers in [0,1] to obtain new fitness function values;
(7) repeat steps (2) to (4) until convergence, and record the current swarm best position gbest2;
(8) if gbest2 is better than gbest1, replace gbest1 with gbest2; otherwise keep gbest1 unchanged;
(9) repeat steps (2) to (8) until no better position can be found, yielding the final position gbest1;
(10) take the particle position gbest1 as the solution for the network weights Wc1 and Wc2.
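Steps (1) through (10) amount to particle swarm optimization with one restart-and-compare pass. The following compact sketch minimizes an arbitrary fitness function; the swarm size, iteration count and coefficients are illustrative assumptions, since the patent does not specify them:

```python
import random

def pso_minimize(fitness, dim, m=20, iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    # Basic PSO loop (steps (2)-(4)): evaluate fitness, track pbest/gbest,
    # update velocities and positions
    rnd = random.Random(seed)
    z = [[rnd.uniform(-1, 1) for _ in range(dim)] for _ in range(m)]
    v = [[0.0] * dim for _ in range(m)]
    pbest = [zi[:] for zi in z]
    pcost = [fitness(zi) for zi in z]
    gbest = pbest[min(range(m), key=pcost.__getitem__)][:]
    for _ in range(iters):
        for i in range(m):
            for d in range(dim):
                r1, r2 = rnd.random(), rnd.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - z[i][d])
                           + c2 * r2 * (gbest[d] - z[i][d]))
                z[i][d] += v[i][d]
            c = fitness(z[i])
            if c < pcost[i]:
                pbest[i], pcost[i] = z[i][:], c
                if c < fitness(gbest):
                    gbest = z[i][:]
    return gbest

def pso_with_restart(fitness, dim):
    # Steps (5)-(9): run once, restart with redistributed particles,
    # and keep the better of the two best positions found
    g1 = pso_minimize(fitness, dim, seed=0)
    g2 = pso_minimize(fitness, dim, seed=1)
    return g2 if fitness(g2) < fitness(g1) else g1
```

In the patent, `fitness` would be the TE-based fitness function from the deviation estimation unit and the particle positions would encode Wc1 and Wc2.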
7) The action network 7 is shown in fig. 3 and comprises an input layer, a hidden layer and an output layer connected in sequence, wherein the input layer has n neurons, the hidden layer has n+3 neurons and the output layer has 2 neurons; the weights between the input layer and the hidden layer are Wa1 and the weights between the hidden layer and the output layer are Wa2.
8) The weight increments of the action network 7 are
ΔWa2=l·Wc2·[sout,c(1-sout,c)]·Wc1,u·sout,a
ΔWa1=l·Wc2·[sout,c(1-sout,c)]·Wc1,u·Wa2·[sout,a(1-sout,a)]·x(k)
wherein l is the learning rate; Wc2 is the weight between the hidden layer and the output layer of the evaluation network 2; sout,c and sout,a are the outputs of the nonlinear functions in the evaluation network 2 and the action network 7 respectively; Wc1,u is the weight of the hidden layer of the evaluation network 2 related to the frequency-converter control quantity u(k); Wa2 is the weight between the hidden layer and the output layer of the action network 7; x(k) is the liquid level information of all the water tanks at time k; Wc1,u, Wc2, sout,c, sout,a and Wa2 are all obtained from the evaluation network and the action network;
updating the weights Wa1 and Wa2 of the action network according to the following formula
Wa1’=Wa1+ΔWa1
Wa2’=Wa2+ΔWa2
where Wa1' and Wa2' are the updated weights between the input layer and the hidden layer and between the hidden layer and the output layer of the action network 7.
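A scalar sketch of these update rules: the matrix shapes of the original formulas are not fully specified, so every quantity below is treated as a scalar, which keeps the structure of the increments visible without guessing dimensions.

```python
def action_weight_update(l, Wc2, Wc1_u, Wa1, Wa2, s_c, s_a, x_k):
    # Scalar form of the patent's increments:
    #   dWa2 = l * Wc2 * [s_c(1 - s_c)] * Wc1_u * s_a
    #   dWa1 = l * Wc2 * [s_c(1 - s_c)] * Wc1_u * Wa2 * [s_a(1 - s_a)] * x(k)
    common = l * Wc2 * s_c * (1.0 - s_c) * Wc1_u
    dWa2 = common * s_a
    dWa1 = common * Wa2 * s_a * (1.0 - s_a) * x_k
    # Wa1' = Wa1 + dWa1, Wa2' = Wa2 + dWa2
    return Wa1 + dWa1, Wa2 + dWa2
```

With learning rate l = 0 the weights are unchanged, which is a quick sanity check on the update.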
Experimental validation is given below
The proposed method was verified on a three-tank system as the experimental platform. The three-tank system consists of water tanks T1, T2 and T3, submersible pumps 1 and 2 whose flow rates Q1 and Q2 are controlled by a digital controller, connecting valves CV1, CV2 and CV3, leakage valves LV1, LV2 and LV3, and pipelines. The liquid level of each of the tanks T1, T2 and T3 is obtained separately by a level meter, and the three tanks are connected by pipes of the same size. The system operates with the connecting valves open and the leakage valves closed, so the liquid in the reservoir flows into the tanks through the connecting valve CV3 and re-enters the tank bodies through submersible pumps 1 and 2. The inter-tank flow resistance can be changed by manually adjusting the openings of the connecting valves CV1, CV2 and CV3 and the leakage valves LV1, LV2 and LV3. Submersible pumps 1 and 2 are each controlled by a separate frequency converter: the pump flow rates are determined by the pump rotating speeds, and the rotating speeds by the frequency converters. The controller outputs a 0-5 V frequency-converter control signal. The relation between pump flow and frequency control signal was obtained by additional experiments; thereafter, for clarity, we omit the frequency converter and use pump flow instead of rotating speed as the control variable of the controlled object. The structure is shown in fig. 4.
The following formula gives the fault-free model of the three-tank system
In the formula, each variable has the same meaning as described above.
We use a PID controller for submersible pump 1 and keep submersible pump 2 at 50% opening (2.5 V, the middle of the 0-5 V frequency-converter control signal), achieving the goal of keeping the liquid level of T3 at its fault-free value. We call this stable condition the fault-free standard state. When a fault occurs, an FTC controller with two outputs (the flows of submersible pumps 1 and 2) replaces the previous controllers (the PID for pump 1 and the fixed 50% opening for pump 2). Our goal is to maintain the reference level in T3 by controlling the flow rates of submersible pumps 1 and 2 respectively.
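The fault-free baseline for submersible pump 1 described above is an ordinary PID loop on the T3 level; a minimal sketch with hypothetical gains might be:

```python
def make_pid(kp, ki, kd, dt=1.0):
    # Textbook PID controller as the fault-free baseline for pump 1;
    # the gains kp, ki, kd are illustrative assumptions
    state = {"i": 0.0, "e_prev": 0.0}
    def step(ref, meas):
        e = ref - meas
        state["i"] += e * dt
        d = (e - state["e_prev"]) / dt
        state["e_prev"] = e
        return kp * e + ki * state["i"] + kd * d
    return step
```

Usage: `pid = make_pid(1.0, 0.1, 0.0)` then `u = pid(ref_level, measured_level)` each sampling period, with `u` mapped to the pump flow command.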
A. Actuator output deviation fault scenario
In the fault-free case, actuator faults of submersible pump 1 are simulated by changing the relation between pump flow and frequency control signal; these changes make the flow larger or smaller than the value corresponding to the controller output. In this way an actuator output-deviation fault is obtained by a soft method, protecting the real actuator from damage. After sample 100, the actuator of submersible pump 1 fails: the flow becomes 12 L/min, greater than the initial set value (converted via the relation between pump flow and frequency control signal). The liquid level of T3, the state evolution and the control variables are shown in figs. 5, 6 and 7.
The first and second curves represent the cases without and with FTC, respectively. Fig. 6 shows that the states x1, x2 and x3 remain stable until the fault occurs, and the liquid level of T3 remains at the reference level (fig. 5). When the fault occurs at sample 100, state x1 rises because more flow enters T1, and, because of the coupling, states x2 and x3 also rise when no FTC is used. After a transition, the liquid level of T3 settles into another steady state, rising from 10 cm to 15 cm. The FTC controller was designed per procedure 1, using a 3-10-2 feedforward neural network; 100 data were selected as the training set and trained with the Levenberg-Marquardt algorithm. The well-trained neural network is used as the FTC, and the algorithm clearly restores the liquid level of T3.
Fig. 7 gives more detail on the control variables. In fig. 7 the horizontal coordinate is the sampling time and the vertical coordinate is the pump flow. The zero of the vertical scale represents the pump flow in the normal state; we use this zero instead of the actual flow because, in the fault-free case, the standard state varies with the reference level of T3. Negative values mean less flow and positive values more flow than in the fault-free normal state. The first and second curves represent the flows without and with FTC, respectively. It can be seen that pump 1 reduces its flow to counteract the actuator output fault; pump 2 likewise reduces its output to keep the T3 liquid level at the reference level.
B. Scene of dead locking fault of actuator
After sample 100, pump 1 experiences a stuck fault at 60% opening (3 V of the 0-5 V frequency-converter control signal), meaning pump 1 is stuck and no longer responds to control. Figs. 8, 9 and 10 show the liquid level of T3, the state evolution and the control variables, respectively.
As can be seen from the first curve of fig. 9, if the original controller is kept, the liquid levels of T1, T2 and T3 slowly rise after the stuck fault occurs (reflecting the plant dynamics). Fig. 10 shows the control variables with FTC (second curve) and without FTC (first curve). Because pump 1 is stuck, its regulating function is lost and the first and second curves coincide. Pump 2 reacts to the fault by stopping its delivery flow for a period of time to release the accumulated liquid; it then provides a steady flow to maintain the level of T3. Fig. 8 shows that under FTC control the liquid level of T3 can be maintained at the fault-free (red curve) level.
In a similar stuck-type fault, the opening of pump 1 is reduced to 30% (1.5 V of the 0-5 V frequency-converter control signal), at which the liquid level cannot be maintained. The state evolution, the liquid level of T3 and the control variables are shown in figs. 12, 11 and 13. The first curve represents the case without FTC and the second curve the evolution with FTC. As can be seen from fig. 13, the release flow is weaker and shorter than with the 60% stuck opening; because of the difference in the steady states, a deviation between the first and second curves also appears.
C. Leakage fault scenario
We also caused a leakage fault by partially opening leakage valve LV2 of tank T3. As shown in fig. 14, if the fault-free control is kept (first curve), the liquid level in T3 decreases from 9 cm to 7 cm because of the leakage. The second curve of fig. 14 shows the liquid level trend of T3 with FTC: thanks to the FTC action, the level in T3 remains at the fault-free value. The state evolution and control variables are shown in figs. 15 and 16.
Claims (9)
1. A liquid level fault-tolerant control system based on reinforcement learning, characterized in that the fault-tolerant control system is used for a multi-tank system and comprises: an information acquisition unit (1) for acquiring the liquid level information of each water tank at different moments; a fault-free model (3) for predicting the liquid level information of all the water tanks at time k+1 from the liquid level information of all the water tanks at time k and the frequency-converter control information output by the information acquisition unit (1); an evaluation network (2) for estimating the total values V(k) and V(k+1) of the frequency-converter control variable at times k and k+1 from the liquid level information of all the water tanks at times k and k+1 output by the information acquisition unit (1); a stage value evaluation unit (4) for evaluating the stage value R(k) from the liquid level information of all the water tanks at time k+1 output by the information acquisition unit (1) and the liquid level information at time k+1 predicted by the fault-free model (3); a deviation estimation unit (5) for outputting a fitness function for weight updating from the stage value received from the stage value evaluation unit (4) and the total values V(k) and V(k+1) output by the evaluation network (2); a weight updating unit (6) for updating the weights of the evaluation network (2) according to the fitness function output by the deviation estimation unit (5), the evaluation network (2) outputting, from all the updated weights received from the weight updating unit (6), the weights related to the frequency-converter control quantity u(k); and an action network (7) for obtaining by iterative updating, from the weights related to the frequency-converter control quantity u(k) output by the evaluation network (2) and the liquid level information of all the water tanks at time k output by the information acquisition unit (1), the optimal control variable with which to control the frequency converters of the multi-water-tank system.
2. The liquid level fault-tolerant control method based on reinforcement learning of claim 1, wherein the liquid level information of all the water tanks at time k output by the information acquisition unit (1) is represented as x(k), and the liquid level information of all the water tanks at time k+1 as x(k+1).
3. The reinforcement learning-based liquid level fault-tolerant control method according to claim 1, characterized in that the fault-free model (3) is represented as follows:
…
in the formula, x1, x2, x3 and xn are the liquid levels of the water tanks T1, T2, T3 and Tn; S1, S2, S3 and Sn are the cross-sectional areas of the water tanks T1, T2, T3 and Tn; g is the gravitational acceleration; in the parameter formulas, R12 is the flow resistance between tank 1 and tank 2, R32 the flow resistance between tank 3 and tank 2, R43 the flow resistance between tank 4 and tank 3, and Rn-1,n the flow resistance between tank n-1 and tank n; Rn is the drainage resistance of the water tank Tn; ρ is the liquid density; Q1 and Q2 are the flow rates of the submersible pump 1 and the submersible pump 2.
4. The reinforcement learning-based liquid level fault-tolerant control system according to claim 1, characterized in that the evaluation network (2) comprises an input layer, a hidden layer and an output layer connected in sequence, wherein the input layer has n+2 neurons, the hidden layer has 2n neurons, and the output layer has 1 neuron.
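A forward pass through this n+2 / 2n / 1 critic can be sketched as follows. The sigmoid hidden activation is inferred from claim 9, whose update uses the derivative term sout,c(1 − sout,c); treating the n+2 inputs as the n tank levels plus a two-dimensional control u(k) (matching the action network's two outputs in claim 8) is an assumption.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def critic_forward(x, u, Wc1, Wc2):
    """Forward pass of the evaluation (critic) network of claim 4:
    n+2 inputs, 2n sigmoid hidden neurons, one linear output V.

    Assumption: the inputs are the n tank levels plus a 2-dimensional
    control u(k)."""
    z = np.concatenate([x, u])       # input vector, shape (n+2,)
    s_out_c = sigmoid(Wc1 @ z)       # hidden activations s_out,c, shape (2n,)
    V = float(Wc2 @ s_out_c)         # scalar total value V
    return V, s_out_c
```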
5. The liquid level fault-tolerant control system based on reinforcement learning of claim 1, characterized in that the stage value evaluation unit (4) is given by the following formula:
wherein R(k) is the stage value; x(k+1) is the liquid level information of all the water tanks at time k+1; and xr(k+1) is the liquid level information of all the water tanks at time k+1 predicted by the fault-free model (3).
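The stage-value formula itself appears only as an image in the original. A common choice consistent with the description — penalizing the deviation between the measured levels and the fault-free prediction, which is what gives the scheme its fault-detection character — is the squared residual norm; this particular form is an assumption, not the claimed formula.

```python
import numpy as np

def stage_value(x_k1, xr_k1):
    """Hypothetical stage value R(k): squared norm of the residual between
    measured levels x(k+1) and the fault-free prediction xr(k+1).
    The exact formula of claim 5 is not recoverable from the text."""
    r = np.asarray(x_k1, dtype=float) - np.asarray(xr_k1, dtype=float)
    return float(r @ r)
```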
6. The liquid level fault-tolerant control system based on reinforcement learning of claim 1, wherein the deviation estimation unit (5) is given by the following formula:
TE=V(k)-R(k)+γV(k+1)
wherein TE is the deviation; V(k) and V(k+1) are the total values of the control variables of the frequency converter corresponding to time k and time k+1, respectively; R(k) is the stage value; and γ is the discount factor.
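Claim 6's deviation is a direct one-liner; only the sign convention is taken exactly as written in the claim:

```python
def td_error(V_k, V_k1, R_k, gamma):
    """Deviation of claim 6: TE = V(k) - R(k) + γ·V(k+1)."""
    return V_k - R_k + gamma * V_k1
```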
7. The liquid level fault-tolerant control system based on reinforcement learning of claim 1, wherein the weight updating unit (6) performs the following steps:
1) the weights Wc1 between the input layer and the hidden layer and the weights Wc2 between the hidden layer and the output layer of the evaluation network (2) are represented by corresponding particle positions, and initial particle values are selected randomly;
2) the fitness function for each particle is calculated according to the following formula:
wherein FF(z(k)) is the fitness function of the i-th particle at the p-th iteration; V(k) and V(k+1) are the total values of the control variables of the frequency converter corresponding to time k and time k+1, respectively; R(k) is the stage value; γ is the discount factor; and z(k) is the combination of the liquid level information x(k) of all the water tanks at time k and the control information u(k) of the frequency converter;
3) the optimal position pbest of each particle and the optimal position gbest experienced by the whole particle swarm are obtained according to the fitness function values and the following formula, and pbest and gbest are updated:
wherein i is the particle index, m is the number of particles, and p is the iteration number;
4) the particle velocity vi and the particle position zi are updated according to the basic iterative formulas of particle swarm optimization:

vi(p+1) = ω·vi(p) + c1·rand1·(pbest − zi(p)) + c2·rand2·(gbest − zi(p))

zi(p+1) = zi(p) + vi(p+1)

wherein z represents the particle position, v represents the particle velocity, ω is the inertia weight, c1 and c2 are acceleration constants, rand1 and rand2 are two random numbers generated independently of each other on [0,1], pbest is the current optimal position of the particle, gbest is the best position experienced by the whole particle swarm, and (p) denotes the iteration number;
5) steps 2) to 4) are repeated until convergence, and the optimal position of the current particle swarm is recorded as gbest1;
6) the particles are redistributed with random numbers on [0,1] to obtain new fitness function values;
7) steps 2) to 4) are repeated until convergence, and the optimal position of the current particle swarm is recorded as gbest2;
8) if the optimal position gbest2 is better than the optimal position gbest1, gbest1 is replaced by gbest2; otherwise gbest1 remains unchanged;
9) steps 2) to 8) are repeated until no better optimal position can be found, yielding the final position gbest1;
10) the particle at position gbest1 gives the solution for the evaluation network weights Wc1 and Wc2.
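Steps 1) to 10) can be sketched as a PSO run with re-randomized restarts. The fitness function FF(z(k)) of step 2) is given as an image in the original, so a generic objective to be minimized is assumed here; the velocity/position iteration follows the standard PSO formulas named in step 4).

```python
import numpy as np

def pso(fitness, dim, m=30, iters=100, omega=0.7, c1=1.5, c2=1.5, seed=0):
    """One PSO run (steps 1-5): minimizes `fitness`, returns gbest."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(0.0, 1.0, (m, dim))      # step 1: random particles in [0, 1]
    v = np.zeros((m, dim))
    pbest = z.copy()
    pbest_f = np.array([fitness(p) for p in z])
    gbest = pbest[np.argmin(pbest_f)].copy()
    for _ in range(iters):                   # steps 2-4 until convergence
        r1, r2 = rng.uniform(size=(2, m, dim))
        v = omega * v + c1 * r1 * (pbest - z) + c2 * r2 * (gbest - z)
        z = z + v
        f = np.array([fitness(p) for p in z])
        better = f < pbest_f
        pbest[better], pbest_f[better] = z[better], f[better]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest

def pso_with_restarts(fitness, dim, rounds=3, **kw):
    """Steps 5-10: re-randomize the particles, rerun, and keep whichever
    gbest is better; the final position encodes the weights Wc1 and Wc2."""
    gbest1 = pso(fitness, dim, seed=0, **kw)
    for r in range(1, rounds):               # step 6: fresh particles in [0, 1]
        gbest2 = pso(fitness, dim, seed=r, **kw)
        if fitness(gbest2) < fitness(gbest1):    # step 8: keep the better one
            gbest1 = gbest2
    return gbest1
```

The restart-and-compare loop is what distinguishes this unit from plain PSO: it guards against a single swarm stagnating in a local optimum of the fitness landscape.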
8. The liquid level fault-tolerant control system based on reinforcement learning of claim 1, characterized in that the action network (7) comprises an input layer, a hidden layer and an output layer connected in sequence, wherein the input layer has n neurons, the hidden layer has n+3 neurons, and the output layer has 2 neurons; the weight between the input layer and the hidden layer is Wa1, and the weight between the hidden layer and the output layer is Wa2.
9. The reinforcement learning-based liquid level fault-tolerant control system according to claim 1 or 8, characterized in that the weight changes of the action network (7) are
ΔWa2 = l·Wc2·[sout,c(1 − sout,c)]·Wc1,u·sout,a

ΔWa1 = l·Wc2·[sout,c(1 − sout,c)]·Wc1,u·Wa2·[sout,a(1 − sout,a)]·x(k)
wherein l is the learning rate; Wc2 represents the weight between the hidden layer and the output layer in the evaluation network (2); sout,c and sout,a are the outputs of the nonlinear functions in the evaluation network (2) and the action network (7), respectively; Wc1,u is the weight of the hidden layer of the evaluation network (2) on the control quantity u(k) of the frequency converter; Wa2 is the weight between the hidden layer and the output layer in the action network (7); x(k) is the liquid level information of all the water tanks at time k; and Wc1,u, Wc2, sout,c, sout,a and Wa2 are all obtained from the evaluation network and the action network;
and the weights Wa1 and Wa2 of the action network are updated according to the following formulas:

Wa1′ = Wa1 + ΔWa1

Wa2′ = Wa2 + ΔWa2

wherein Wa1′ and Wa2′ are the updated weight between the input layer and the hidden layer and the updated weight between the hidden layer and the output layer in the action network (7), respectively.
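One way to realize claim 9's updates is the vectorized sketch below. The shape conventions are an interpretation, since the claim writes the chain-rule products without transposes: Wc2, sout,c range over the critic's 2n hidden units, Wc1,u is the critic hidden layer's weight column for u(k), Wa2 and sout,a range over the actor's n+3 hidden units (a single control output is assumed for the sketch), and x(k) has n entries.

```python
import numpy as np

def action_update(l, Wc2, s_out_c, Wc1_u, Wa1, Wa2, s_out_a, x_k):
    """Actor weight updates following the chain-rule products of claim 9.

    Shapes are an interpretation of the claim's flat products; a single
    control output is assumed for this sketch."""
    # backpropagated sensitivity of the critic output V to the control u(k)
    dV_du = float((Wc2 * s_out_c * (1.0 - s_out_c)) @ Wc1_u)
    dWa2 = l * dV_du * s_out_a                                          # ΔWa2
    dWa1 = l * dV_du * np.outer(Wa2 * s_out_a * (1.0 - s_out_a), x_k)   # ΔWa1
    return Wa1 + dWa1, Wa2 + dWa2                                       # Wa1', Wa2'
```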
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010947314.8A CN112180996A (en) | 2020-09-10 | 2020-09-10 | Liquid level fault-tolerant control method based on reinforcement learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112180996A true CN112180996A (en) | 2021-01-05 |
Family
ID=73921803
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010947314.8A Pending CN112180996A (en) | 2020-09-10 | 2020-09-10 | Liquid level fault-tolerant control method based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112180996A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020046359A1 (en) * | 2000-03-16 | 2002-04-18 | Boden Scott T. | Method and apparatus for secure and fault tolerant data storage |
CN1471627A (en) * | 2000-10-26 | 2004-01-28 | �Ʒ� | Fault Tolerant Liquid Measurement System Using Multi-Model State Estimator |
CN1737423A (en) * | 2005-08-10 | 2006-02-22 | 东北大学 | Method and apparatus for realizing integration of fault-diagnosis and fault-tolerance for boiler sensor based on Internet |
CN109635864A (en) * | 2018-12-06 | 2019-04-16 | 佛山科学技术学院 | A kind of fault tolerant control method and device based on data |
Non-Patent Citations (1)
Title |
---|
Zhang Dapeng (张大鹏): "Fault Tolerant Control Using Reinforcement Learning and Particle Swarm Optimization", IEEE Access, 9 September 2020 (2020-09-09), pages 2-5 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109879410A (en) | Sewage treatment aeration control system | |
CN110347155B (en) | Intelligent vehicle automatic driving control method and system | |
CN101871782B (en) | Position error forecasting method for GPS (Global Position System)/MEMS-INS (Micro-Electricomechanical Systems-Inertial Navigation System) integrated navigation system based on SET2FNN | |
US6721647B1 (en) | Method for evaluation of a genetic algorithm | |
CN108008627B (en) | Parallel optimization reinforcement learning self-adaptive PID control method | |
CN112432644B (en) | Unmanned ship integrated navigation method based on robust adaptive unscented Kalman filtering | |
CN109724657A (en) | Watermeter flowing rate metering method and system based on modified Delphi approach | |
CN111507530B (en) | RBF neural network ship traffic flow prediction method based on fractional order momentum gradient descent | |
CN117369286B (en) | Dynamic positioning control method for ocean platform | |
CN113916329A (en) | Natural gas flowmeter calibrating device and method based on neural network | |
CN112180996A (en) | Liquid level fault-tolerant control method based on reinforcement learning | |
CN111679577B (en) | Speed tracking control method and automatic driving control system of high-speed train | |
CN114548311A (en) | Hydraulic equipment intelligent control system based on artificial intelligence | |
CN116880191A (en) | Intelligent control method of process industrial production system based on time sequence prediction | |
CN114519291A (en) | Method for establishing working condition monitoring and control model and application method and device thereof | |
Hallouzi et al. | Multiple model estimation: A convex model formulation | |
JPH04211859A (en) | Abnormality recognizing method | |
CN105204331A (en) | Intelligent optimized parameter identification method applied to steam turbine and speed regulation system of steam turbine | |
Babuska et al. | Particle filtering for on-line estimation of overflow losses in a hopper dredger | |
Meseguer et al. | Fault-Tolerant Model Predictive Control Applied to Integrated Urban Drainage and Sanitation Systems for Environmental Protection | |
Rato et al. | Multimodel based fault tolerant control of the 3-tank system | |
CN116105071B (en) | Supercritical carbon dioxide pipeline safety relief system and control method | |
CN110187633A (en) | A kind of BP ~ RNN modified integral algorithm of PID towards road simulation dynamometer | |
CN114970881B (en) | Offline reinforcement learning method and device based on convex hull constraint | |
CN108303876A (en) | The Robust Tracking Control of spring-mass damper |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20210105 |