WO2024181536A1 - Device for machine learning related to feedforward control for control target, teacher data generation program, and teacher data generation method - Google Patents
Device for machine learning related to feedforward control for control target, teacher data generation program, and teacher data generation method
- Publication number
- WO2024181536A1 (application PCT/JP2024/007551)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B11/00—Automatic controllers
- G05B11/01—Automatic controllers electric
- G05B11/32—Automatic controllers electric with inputs from more than one sensing element; with outputs to more than one correcting element
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/04—Programme control other than numerical control, i.e. in sequence controllers or logic controllers
- G05B19/05—Programmable logic controllers, e.g. simulating logic interconnections of signals according to ladder diagrams or function charts
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B23/00—Testing or monitoring of control systems or parts thereof
- G05B23/02—Electric testing or monitoring
Definitions
- the present disclosure relates to a device for machine learning related to feedforward control of a control target, a program for generating teacher data, and a method for generating teacher data.
- JP 10-222207 A (Patent Document 1) discloses a feedforward control device that uses a feedforward signal generator to control a specific process that is subject to disturbances.
- This feedforward control device includes a parameter learner that learns the parameters required for calculations in the feedforward signal generator from parameters calculated in the feedforward signal generator.
- However, the configuration of the feedforward compensator included in the feedforward signal generator disclosed in Patent Document 1 is predetermined.
- A configuration for calculating a feedforward compensation value for feedback control, such as a feedforward compensator, is designed as an inverse model (inverse system) of the process to be controlled.
- Designing an inverse model of a process requires analysis of the process, so it is difficult to update the structure of the inverse model in a short period of time in response to fluctuations in disturbances and changes in the characteristics of the process. Therefore, with the feedforward control device disclosed in Patent Document 1, it may be difficult to suppress the effect of disturbances on the control of the controlled object depending on the degree of fluctuation in the disturbance or the degree of change in the characteristics of the process.
- This disclosure has been made to solve the problems described above, and its purpose is to improve the accuracy of feedforward control of the controlled object.
- An apparatus according to the present disclosure performs machine learning for a predictive model that predicts a specific value required to generate a feedforward compensation value for a feedback manipulated variable to a controlled object that is subject to a disturbance.
- The feedback manipulated variable is determined by feedback control of a control device for the controlled object, based on an error between a target value and the controlled variable of the controlled object, so that the controlled variable approaches the target value.
- a value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable.
- the controlled object is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element.
- the control element receives the manipulated variable and outputs a primary controlled variable.
- the disturbance element receives the disturbance and outputs a fluctuation amount.
- the dead time element delays the sum of the primary controlled variable and the fluctuation amount from being output to the outside of the controlled object.
- the predictive model predicts the fluctuation amount as a specific value from a measured value of the disturbance.
- The apparatus includes a learning data generation unit.
- The learning data generation unit acquires the measured value, the manipulated variable, and the controlled variable; generates a first estimate of the primary controlled variable from the manipulated variable using the mathematical model of the control element; generates a second estimate of the fluctuation amount, at a time that is a specific time back from the time at which the controlled variable was acquired, based on the controlled variable and the first estimate; and outputs teacher data in which the measured value is associated, as the correct answer, with the second estimate at the time at which the measured value was acquired.
- Because the teacher data associates an appropriate correct answer with each measured value, performing machine learning on the predictive model with the teacher data can improve the accuracy of feedforward control of the controlled object.
- the device further includes a learning unit that performs machine learning on the prediction model using teacher data.
- the learning unit approximates the relationship between the measurement value and the second estimated value, which is expressed by the prediction model, as a function with the second estimated value as the objective variable and the measurement value as the explanatory variable.
- With this configuration, the predictive model can be adapted in real time to the measured values and the characteristics of the controlled object, in parallel with feedforward control.
- The device further includes a feedforward compensation unit that generates the feedforward compensation value from the fluctuation amount predicted by the prediction model, using an inverse model of the mathematical model.
- With this configuration, the controlled variable output from the controlled object after the specific time has elapsed can be brought closer to a normal value corresponding to the feedback manipulated variable at the time that is the specific time before the time of the controlled variable.
- a teacher data generation program is a program for machine learning for a predictive model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulation variable for a control object that is subject to a disturbance.
- the feedback manipulation variable is determined by feedback control of a control device for the control object so that the control variable approaches a target value based on an error between a target value and the control variable of the control object.
- a value obtained by subtracting the feedforward compensation value from the feedback manipulation variable is output from the control device to the control object as a manipulation variable.
- The control object is represented by a control element expressed as a mathematical model that receives the manipulation variable, a disturbance element that receives the disturbance, and a dead time element.
- the control element receives the manipulation variable and outputs a primary control variable.
- the disturbance element receives the disturbance and outputs a fluctuation variable.
- the dead time element delays the output of the sum of the primary control variable and the fluctuation variable to the outside of the control object.
- the predictive model predicts the fluctuation variable as a specific value from a measured value of the disturbance.
- When executed by a processor, the program acquires the measured value, the manipulation variable, and the control variable; generates a first estimate of the primary control variable from the manipulation variable using the mathematical model of the control element; generates a second estimate of the fluctuation variable, at a time that is a specific time back from the time when the control variable was acquired, based on the control variable and the first estimate; and outputs teacher data that associates the measured value, as the correct answer, with the second estimate at the time when the measured value was acquired.
- Because the teacher data associates an appropriate correct answer with each measured value, performing machine learning on the predictive model with the teacher data can improve the accuracy of feedforward control of the control target.
- the invention disclosed herein can also be realized as a non-transitory computer-readable medium that stores the above program.
- a method for generating teacher data is a method for machine learning for a predictive model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulated variable for a controlled object that is subject to a disturbance.
- the feedback manipulated variable is determined by feedback control of a control device for the controlled object so that the controlled variable approaches a target value based on an error between a target value and the controlled variable of the controlled object.
- a value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable.
- the controlled object is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives a disturbance, and a dead time element.
- the control element receives the manipulated variable and outputs a primary controlled variable.
- the disturbance element receives the disturbance and outputs a fluctuation amount.
- the dead time element delays the sum of the primary controlled variable and the fluctuation amount from being output to the outside of the controlled object.
- the predictive model predicts the fluctuation amount as a specific value from a measured value of the disturbance.
- The method includes acquiring the measured value, the manipulated variable, and the controlled variable; generating a first estimate of the primary controlled variable from the manipulated variable using the mathematical model of the control element; generating a second estimate of the fluctuation amount, at a time that is a specific time back from the time when the controlled variable was acquired, based on the controlled variable and the first estimate; and outputting teacher data in which the measured value is associated, as the correct answer, with the second estimate at the time when the measured value was acquired.
- Because the teacher data associates an appropriate correct answer with each measured value, performing machine learning on the predictive model with the teacher data can improve the accuracy of feedforward control of the controlled object.
- the device, program, and method disclosed herein can improve the accuracy of feedforward control of a control target.
- FIG. 1 is a block diagram showing a functional configuration of a control device according to the first embodiment.
- FIG. 2 is a diagram showing an example of a distribution of a plurality of training data, and an example of a prediction model as a regression surface.
- FIG. 3 is a flowchart showing the flow of processing performed by each of a feedback control system, a feedforward control system, a learning data generating unit, and a learning unit in FIG. 1.
- FIG. 4 is a flowchart showing a specific process flow of the learning data generation process of FIG. 3.
- FIG. 5 is a diagram for explaining the correspondence between the times of the measurement values, the manipulated variables, the controlled variables, and the estimated primary controlled variables in FIG. 4.
- FIG. 6 is a block diagram showing a functional configuration of a control device according to a first modified example of the first embodiment.
- FIG. 7 is a block diagram showing a functional configuration of a control system according to a second embodiment.
- FIG. 8 is a schematic diagram showing an example of a network configuration of a control system according to a third embodiment.
- FIG. 9 is a block diagram showing an example of a hardware configuration of the control device shown in FIG. 8.
- Fig. 1 is a block diagram showing a functional configuration of a control device 100 according to embodiment 1.
- the control device 100 includes a feedback control unit 110, a feedforward compensation unit 120, a learning data generation unit 130, a storage unit 140, a subtractor 150, a subtractor 160, and a learning unit 170.
- The control device 100 outputs to the control object 200, as the manipulated variable r, the value obtained by subtracting the feedforward compensation value rf from the feedback manipulated variable rb, so that the controlled variable q, which is the output value of the control object 200 subjected to the disturbance ds, approaches the target value qr.
- The storage unit 140 stores the prediction model Mp and the teacher data Dt.
- The control device 100 and the control object 200 are connected via a network (for example, the Internet or a cloud system) and may be located remotely from each other. Examples of machine learning algorithms for constructing the prediction model Mp include a deep binary tree and a support vector machine.
- the disturbance ds is a quantity that disturbs the state of the control system including the control device 100 and the controlled object 200.
- the disturbance ds includes multiple disturbance elements, such as the amount of light, voltage, current, and temperature that are accidentally or suddenly input to the controlled object.
- the measured value of the disturbance ds is expressed as do.
- the transmission of information (signals) in the controlled object 200 is represented by the control element 210, the disturbance element 220, the dead time element 230, and the adder 240.
- the control element 210 receives an operation amount r and outputs a primary control amount qw to the adder 240.
- the disturbance element 220 receives a disturbance ds and outputs a fluctuation amount d to the adder 240.
- the adder 240 adds the fluctuation amount d to the primary control amount qw and outputs a secondary control amount qo to the dead time element 230.
- the dead time element 230 delays the output of the secondary control amount qo to the outside of the controlled object 200. Specifically, the dead time element 230 outputs the secondary control amount qo as the control amount q to the outside of the controlled object 200 after the lapse of dead time L after receiving the secondary control amount qo.
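The signal flow through the controlled object 200 can be sketched as a small discrete-time simulation. This is only an illustration under simplifying assumptions, not the disclosed implementation: the class name `ControlledObject`, the static gains `gain_w` and `gain_d`, and the dead time expressed as a whole number of samples (`dead_steps`) are all hypothetical choices made for the example.

```python
from collections import deque


class ControlledObject:
    """Toy discrete-time version of controlled object 200 (all parameters hypothetical)."""

    def __init__(self, gain_w=2.0, gain_d=0.5, dead_steps=3, initial_q=0.0):
        self.gain_w = gain_w  # assumed static gain of control element 210
        self.gain_d = gain_d  # assumed static gain of disturbance element 220
        # Dead time element 230: a FIFO that holds qo for `dead_steps` samples.
        self._delay = deque([initial_q] * dead_steps)

    def step(self, r, ds):
        qw = self.gain_w * r  # control element 210: r -> qw
        d = self.gain_d * ds  # disturbance element 220: ds -> d
        qo = qw + d           # adder 240: secondary control amount qo
        self._delay.append(qo)
        # The value leaving the FIFO is the qo received `dead_steps` samples ago.
        return self._delay.popleft()


plant = ControlledObject()
outputs = [plant.step(r=1.0, ds=2.0) for _ in range(5)]
print(outputs)
```

In this sketch the first three outputs are still the initial value, because the secondary control amount computed at the first sample only appears as the control amount q once the dead time has elapsed.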
- The configuration including the feedback control unit 110 and the subtractor 150 is also referred to as a feedback control system.
- The configuration including the feedforward compensation unit 120 and the subtractor 160 is also referred to as a feedforward control system.
- The feedback control unit 110 determines the feedback manipulated variable rb based on the error eq and outputs it to the subtractor 160.
- the learning data generating unit 130 generates teacher data Dt used in machine learning for the prediction model Mp.
- the learning data generating unit 130 acquires a measurement value do, an operation amount r, and a control amount q at each of a plurality of timings.
- the learning data generating unit 130 estimates a fluctuation amount d as an estimated fluctuation amount dh from the operation amount r and the control amount q.
- the learning data generating unit 130 outputs a combination of the measurement value do (explanatory variable) and the estimated fluctuation amount dh (objective variable) to the storage unit 140 as teacher data Dt.
- the learning data generating unit 130 may output the teacher data Dt directly to the learning unit 170.
- The fluctuation amount d corresponding to the measured value do at time t1 is added to the primary control amount qw at time t1 to form the secondary control amount qo.
- The estimated fluctuation amount dh, which estimates the fluctuation amount d at the time the measured value do was measured, is therefore treated as the correct answer for that measured value do.
- As a result, the correspondence between the explanatory variable and the objective variable is set correctly.
- Because machine learning can be performed on the prediction model Mp using teacher data Dt with an appropriate correspondence, the prediction accuracy of the prediction model Mp can be improved.
- Consequently, the accuracy of the feedforward control of the control object 200 performed using the trained prediction model Mp can also be improved.
- the transmission of information in the learning data generation unit 130 is represented by a mathematical model 131, a time converter 132, a subtractor 133, and an output generator 134.
- the mathematical model 131 is estimated as a correspondence between the operation amount r and the primary control amount qw, and is represented as a function with the operation amount r as an argument.
- the mathematical model 131 is also represented as Gwh(r).
- the mathematical model 131 receives the manipulated variable r and outputs an estimated primary control variable qwh (first estimated value) to the subtractor 133.
- the estimated primary control variable qwh is an estimated value corresponding to the primary control variable qw.
- The time converter 132 moves the measurement time t10 of the control variable q back by the estimated dead time Lh (specific time) and outputs the result to the subtractor 133 as the estimated secondary control variable qoh.
- the estimated dead time Lh is an estimated value of the dead time L.
- the estimated secondary control variable qoh is an estimated value of the secondary control variable qo.
- the mathematical model 131 and the estimated dead time Lh can be determined in physical modeling or system identification based on general design information in feedback control of the control object 200.
- For example, the feedback control unit 110, in an open-loop state, may apply a step signal to the control object 200, and the gain, time constant, and dead time may be calculated from the response data of the control variable. From these results, the PID (Proportional-Integral-Derivative) parameters for feedback control can be determined using the Ziegler-Nichols method or the CHR (Chien-Hrones-Reswick) method.
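The step-response procedure can be sketched as follows. This is a rough illustration, assuming the controlled object behaves like a first-order lag with dead time and that the response is monotonic and has settled by the last sample. The function names, the 2 % threshold used to detect the dead time, and the synthetic response are assumptions for the example. The tuning rule is the classic Ziegler-Nichols open-loop (reaction curve) rule for a PID controller: Kp = 1.2 T / (K L), Ti = 2 L, Td = 0.5 L.

```python
import math


def identify_fopdt(times, response, step_size, threshold=0.02):
    """Rough gain / time constant / dead time estimate from a step response."""
    y0, y_final = response[0], response[-1]
    span = y_final - y0
    gain = span / step_size
    # Dead time: first sample where the response has moved by `threshold` of its span.
    dead_time = next(t for t, y in zip(times, response)
                     if abs(y - y0) >= threshold * abs(span))
    # Time constant: time to reach 63.2 % of the span, minus the dead time.
    t63 = next(t for t, y in zip(times, response)
               if abs(y - y0) >= 0.632 * abs(span))
    return gain, t63 - dead_time, dead_time


def ziegler_nichols_pid(gain, time_constant, dead_time):
    """Ziegler-Nichols open-loop PID rule for a first-order-plus-dead-time model."""
    kp = 1.2 * time_constant / (gain * dead_time)
    ti = 2.0 * dead_time
    td = 0.5 * dead_time
    return kp, ti, td


# Synthetic unit-step response with hypothetical true values K=2, T=10, L=1.
K, T, L = 2.0, 10.0, 1.0
times = [0.05 * i for i in range(2001)]
response = [K * (1.0 - math.exp(-(t - L) / T)) if t >= L else 0.0 for t in times]

gain_h, tau_h, dead_h = identify_fopdt(times, response, step_size=1.0)
kp, ti, td = ziegler_nichols_pid(gain_h, tau_h, dead_h)
print(gain_h, tau_h, dead_h, kp, ti, td)
```

The dead time identified this way is one possible source for the estimated dead time Lh. The threshold-based detection slightly overestimates the dead time, so a real identification would typically use a more careful fitting method.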
- The value obtained by subtracting the primary control amount qw at time t11 from the secondary control amount qo at time t11 is the fluctuation amount d at time t11.
- The secondary control amount qo at time t11 is output as the control amount q at time t12, when the dead time L has elapsed since time t11.
- The time converter 132 therefore treats the control amount q acquired at time t12 as the estimated secondary control amount qoh at time t11, which is the estimated dead time Lh back from time t12.
- The value obtained by subtracting the estimated primary control amount qwh at time t11 from the estimated secondary control amount qoh at time t11 is the estimated fluctuation amount dh (second estimated value), an estimate of the fluctuation amount d at time t11.
- the subtractor 133 subtracts the estimated primary control amount qwh at the same time as the estimated secondary control amount qoh from the estimated secondary control amount qoh, and outputs the estimated fluctuation amount dh to the output generator 134.
- The output generator 134 outputs a combination of the measured value do and the estimated fluctuation amount dh at the same time as the measured value do to the storage unit 140 as the teacher data Dt.
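The relationships used by the learning data generation unit 130 can be summarized in equation form. The hat notation below is an assumption introduced here to stand for the "h" suffix of the text (for example, $\hat{d}$ for dh and $\hat{L}$ for Lh), and the mathematical model 131 is written as a function of the operation amount at the same time, as the text describes it.

```latex
\begin{aligned}
q_o(t) &= q_w(t) + d(t), & q(t + L) &= q_o(t), \\
\hat{q}_w(t) &= \hat{G}_w\bigl(r(t)\bigr), & \hat{q}_o(t) &= q(t + \hat{L}), \\
\hat{d}(t) &= \hat{q}_o(t) - \hat{q}_w(t), & D_t &= \bigl\{\, \bigl(d_o(t),\ \hat{d}(t)\bigr) \,\bigr\}.
\end{aligned}
```

The first row describes the controlled object 200, the second row the mathematical model 131 and the time converter 132, and the third row the subtractor 133 and the output generator 134.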
- the learning unit 170 uses multiple pieces of teacher data Dt to approximate the relationship between the measured value do and the estimated variation amount dh as a function (regression curve or regression surface) with the estimated variation amount dh as the objective variable and the measured value do as the explanatory variable.
- the prediction model Mp includes this function.
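As a minimal sketch of this kind of function approximation, the example below fits a straight regression line to (do, dh) pairs by ordinary least squares. The line is only a stand-in chosen for brevity: the disclosure names other algorithms for the prediction model Mp, and the function names and the sample data here are hypothetical.

```python
def fit_line(teacher_data):
    """Ordinary least squares fit dh ~ slope * do + intercept (stand-in for Mp)."""
    n = len(teacher_data)
    mean_x = sum(x for x, _ in teacher_data) / n
    mean_y = sum(y for _, y in teacher_data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in teacher_data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in teacher_data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, do):
    """Evaluate the fitted function: measured value do -> estimated fluctuation dh."""
    slope, intercept = model
    return slope * do + intercept


# Hypothetical teacher data Dt: (measured value do, estimated fluctuation amount dh).
teacher_data = [(0.0, 0.1), (1.0, 0.6), (2.0, 1.1), (3.0, 1.6)]
mp = fit_line(teacher_data)
print(predict(mp, 4.0))
```

With several disturbance elements in do, the same idea extends to a regression surface, where each teacher data point has a vector of explanatory variables instead of a single value.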
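- The learning unit 170 continues machine learning (initial learning or additional learning) until the accuracy of the prediction model Mp becomes sufficiently high.
- When the characteristics of the control object 200 change, the learning unit 170 resumes machine learning (additional learning) for the prediction model Mp.
- The characteristics of the control object 200 include, for example, the correspondence between the measured value do and the operation amount r and the control amount q.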
- FIG. 2 shows an example of the distribution of multiple training data Dt, and an example of a prediction model Mp as a regression surface.
- the feedforward compensation unit 120 acquires an estimated fluctuation amount dh from the measurement value do using the prediction model Mp.
- the feedforward compensation unit 120 includes an inverse model 121.
- The inverse model 121 is configured as an inverse function Gwh^-1(qwh) of the mathematical model 131 that takes an estimated primary controlled variable qwh as an argument and outputs an operation variable r.
- the inverse model 121 receives the estimated fluctuation amount dh from the prediction model Mp as an input.
- the purpose of feedforward control is to suppress disturbance of the primary control amount qw due to disturbance ds. Therefore, the feedforward compensation unit 120 obtains from the inverse model 121, as a feedforward compensation value rf, a manipulation amount to be input to the control element 210 such that the value output from the control element 210 becomes the estimated fluctuation amount dh.
- As a result, in the primary control amount qw corresponding to the manipulation amount r, the increase caused by the fluctuation amount d is suppressed in advance by the feedforward compensation value rf.
- The controlled variable q output from the controlled object 200 after the dead time L has elapsed therefore approaches a normal value corresponding to the feedback manipulated variable rb at the time that is the dead time L before the time of the controlled variable q.
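The cancellation described above can be checked with a small numeric sketch. It assumes, purely for illustration, that the mathematical model 131 is a static gain so that its inverse is a simple division, and that the prediction model Mp predicts the fluctuation amount exactly. The helper names and all numeric values are hypothetical.

```python
def make_linear_model(gain):
    """Return a hypothetical static-gain model Gwh(r) = gain * r and its inverse."""
    def gwh(r):
        return gain * r

    def gwh_inv(qwh):
        return qwh / gain

    return gwh, gwh_inv


def predict_dh(do):
    """Placeholder for a trained prediction model Mp: do -> dh."""
    return 0.5 * do


def feedforward_compensation(do, gwh_inv):
    dh = predict_dh(do)   # estimated fluctuation amount from the measured value
    return gwh_inv(dh)    # inverse model 121: operation amount whose output equals dh


gwh, gwh_inv = make_linear_model(gain=2.0)

rb = 3.0                                     # feedback operation amount
do = 4.0                                     # measured disturbance
rf = feedforward_compensation(do, gwh_inv)   # feedforward compensation value
r = rb - rf                                  # subtractor 160
d = 0.5 * do                                 # true fluctuation (equal to dh in this sketch)
qo = gwh(r) + d                              # secondary control amount inside the object
print(qo, gwh(rb))
```

Here the secondary control amount equals the value the feedback operation amount alone would have produced without any disturbance. With a nonlinear model or an imperfect prediction the cancellation would only be approximate.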
- According to the control device 100, the accuracy of feedforward control for the control object 200 can be improved by performing machine learning on the prediction model Mp using the teacher data Dt. In parallel with the feedforward control, the prediction model Mp can be adapted in real time to the measurement value do and the characteristics of the control object 200. Because machine learning continues until the accuracy of the prediction model Mp becomes sufficiently high, the accuracy of the feedforward control can be sufficiently improved. In addition, because the prediction model Mp is re-adapted whenever the characteristics of the control object 200 change, a drop in feedforward control accuracy caused by such changes can be suppressed.
- FIG. 3 is a flowchart showing the flow of processing performed by each of the feedback control system, the feedforward control system, the learning data generation unit 130, and the learning unit 170 in FIG. 1.
- the routines corresponding to the flowcharts of the feedback control system and the feedforward control system are executed, for example, at each sampling time.
- the routines corresponding to the flowcharts of the learning data generation unit 130 and the learning unit 170 are executed, for example, in response to the first execution of the routines corresponding to the flowcharts of the feedforward control system. Below, steps are simply abbreviated as S.
- The subtractor 150 calculates the error eq between the target value qr and the control amount q, and outputs it to the feedback control unit 110.
- The feedback control unit 110 determines the feedback operation amount rb based on the error eq, outputs it to the subtractor 160, and ends the process.
- The feedforward compensation unit 120 determines a feedforward compensation value rf from the measurement value do and outputs it to the subtractor 160.
- The subtractor 160 outputs the difference obtained by subtracting the feedforward compensation value rf from the feedback operation amount rb, as the operation amount r, to the control object 200 and the learning data generation unit 130, and the process ends.
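One sampling period of the two control systems can be sketched as a single function. The PI controller below is only a hypothetical stand-in for the feedback control unit 110, whose control law is not fixed by the text, and the `feedforward` callable stands for the feedforward compensation unit 120. All names and gains are assumptions for the example.

```python
class PIController:
    """Hypothetical stand-in for feedback control unit 110."""

    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.integral = 0.0

    def update(self, eq):
        self.integral += eq * self.dt
        return self.kp * eq + self.ki * self.integral


def control_cycle(qr, q, do, controller, feedforward):
    """One sampling period: returns the operation amount r sent to the control object."""
    eq = qr - q                  # subtractor 150: error between target and control amount
    rb = controller.update(eq)   # feedback control unit 110: feedback operation amount
    rf = feedforward(do)         # feedforward compensation unit 120
    return rb - rf               # subtractor 160: operation amount r


def example_feedforward(do):
    return 0.25 * do             # placeholder compensation law


controller = PIController(kp=2.0, ki=0.5, dt=0.1)
r = control_cycle(qr=10.0, q=8.0, do=4.0,
                  controller=controller, feedforward=example_feedforward)
print(r)
```

The same operation amount r would also be passed to the learning data generation unit 130, which needs it to compute the estimated primary control amount.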
- The learning data generation unit 130 generates teacher data Dt in S130 and ends the process.
- The learning unit 170 performs machine learning on the prediction model Mp using the teacher data Dt in S170 and ends the process.
- FIG. 4 is a flowchart showing a specific process flow of the learning data generation process S130 in FIG. 3.
- FIG. 5 is a diagram for explaining the correspondence between the time of the measured value do, the operation amount r, the control amount q, and the estimated primary control amount qwh in FIG. 4, and the time of the estimated secondary control amount qoh, which is set back from the time of the control amount q by the estimated dead time Lh.
- FIG. 5 shows a case where multiple measured values do, multiple operation amounts r, and multiple control amounts q are obtained during the trial period from time t21 to t24.
- Time t22 is the time (t21+Lh) that is the estimated dead time Lh after time t21.
- Time t23 (>t22) is the time (t24-Lh) that is the estimated dead time Lh back from time t24.
- In S131, the learning data generation unit 130 acquires the measurement value do, the operation amount r, and the control amount q at each of the multiple timings, and proceeds to S132.
- In S132, the learning data generation unit 130 acquires an estimated primary control amount qwh from the operation amount r using the mathematical model 131, and proceeds to S133.
- In S133, the learning data generation unit 130 moves the time of the control amount q back by the estimated dead time Lh to generate an estimated secondary control amount qoh, and proceeds to S134.
- In this step, the multiple control amounts q included in times t22 to t23 acquired in S131 are slid as a whole in the direction going back in time by the estimated dead time Lh.
- In S134, the learning data generation unit 130 subtracts the estimated primary control amount qwh at the time of the estimated secondary control amount qoh from the estimated secondary control amount qoh to generate an estimated fluctuation amount dh, and proceeds to S135.
- In S135, the learning data generation unit 130 sets the correct answer for the measured value do to the estimated fluctuation amount dh at the time of the measured value do, generates teacher data Dt, and ends the process.
- The times of the control amounts q acquired from time t21 to t22 are moved back to before time t21 by the processing of S133.
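The learning data generation process can be sketched for sampled data as follows. This is a simplified illustration, assuming uniformly sampled sequences and an estimated dead time that is a whole number of samples (`dead_steps`). The function name, the static-gain model, and the sample data are hypothetical.

```python
def generate_teacher_data(do_seq, r_seq, q_seq, gwh, dead_steps):
    """Pair each measured value do[k] with dh[k] = q[k + dead_steps] - gwh(r[k])."""
    n = min(len(do_seq), len(r_seq), len(q_seq))
    teacher_data = []
    # Samples whose shifted control amount is not yet available are skipped.
    for k in range(n - dead_steps):
        qwh = gwh(r_seq[k])            # estimated primary control amount (model 131)
        qoh = q_seq[k + dead_steps]    # control amount moved back by Lh (time converter 132)
        dh = qoh - qwh                 # estimated fluctuation amount (subtractor 133)
        teacher_data.append((do_seq[k], dh))  # teacher data pair (output generator 134)
    return teacher_data


def gwh(r):
    return 2.0 * r                     # hypothetical static-gain mathematical model


# Hypothetical records: the object adds 0.5 * do and delays its output by 3 samples.
do_seq = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
r_seq = [1.0] * 8
q_seq = [0.0, 0.0, 0.0, 2.0, 2.5, 3.0, 3.5, 4.0]

dt_pairs = generate_teacher_data(do_seq, r_seq, q_seq, gwh, dead_steps=3)
print(dt_pairs)
```

In this sketch the first `dead_steps` control amounts are never used, which corresponds to the control amounts whose shifted time falls before the start of the trial period. Likewise, the last `dead_steps` measured values receive no correct answer, because the matching control amounts have not yet been output.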
- FIG. 6 is a block diagram showing the functional configuration of control device 100A according to variant 1 of embodiment 1.
- Control device 100A has a configuration in which subtractor 150 and feedback control unit 110 are removed from control device 100 in FIG. 1. Since the rest of the configuration is similar, the description of the similar configuration will not be repeated. Note that subtractor 160 does not have to be included in control device 100A.
- the control device 100A determines a feedforward compensation value rf of the feedback operation amount rb to the control object 200 such that the control amount q of the control object 200 receiving the measurement value do approaches a target value qr.
- the existing feedback control system can be easily expanded into a feedforward control system and a control system including a learning function.
- the device and method according to the first embodiment and the first modification can improve the accuracy of feedforward control of the control target.
- FIG. 7 is a block diagram showing the functional configuration of a control system 2 according to embodiment 2.
- components with the same reference symbols as those in FIG. 1 have the same functions as the components identified by the reference symbols described in embodiment 1, and therefore the description of the similar components will not be repeated.
- the control system 2 includes a feedback control device 11, a feedforward compensation device 12, a learning data generation device 13, a storage device 14, and a learning device 17.
- the learning data generation device 13, the storage device 14, and the learning device 17 correspond to the learning data generation unit 130, the storage unit 140, and the learning unit 170 in FIG. 1, respectively.
- the feedback control device 11 includes a feedback control unit 110 and a subtractor 150.
- the feedforward compensation device 12 includes a feedforward compensation unit 120 and a subtractor 160.
- the feedback control device 11, the feedforward compensation device 12, the learning data generation device 13, the storage device 14, the learning device 17, and the control target 200 may be connected to each other via a network and may be located remotely from each other.
- the subtractor 160 may be included in the feedback control device 11, not in the feedforward compensation device 12.
- an existing control system can be easily expanded by adding a feedforward compensation device, a learning data generation device, and a learning device to the existing feedback control device while leaving the existing feedback control device in place.
- the device and method according to the second embodiment can improve the accuracy of feedforward control of the control target.
- Fig. 8 is a schematic diagram showing an example of a network configuration of a control system 3 according to the third embodiment.
- the control system 3 includes a device group in which a plurality of devices are configured to be able to communicate with each other.
- the devices may include a control device 300 that is a processing entity that executes a control program, and peripheral devices connected to the control device 300.
- the control device 300 has a functional configuration similar to that of the control device 100 shown in Fig. 1.
- the control device 300 corresponds to an industrial controller that controls control targets such as various facilities or devices.
- the control device 300 is a type of computer that executes control calculations, and typically includes a PLC (Programmable Logic Controller).
- the control device 300 is connected to a field device 200C via a field network 20.
- the control device 300 exchanges data with at least one field device 200C via the field network 20.
- the control calculations executed in the control device 300 include a process of collecting data collected or generated in the field device 200C, a process of generating data such as a command value (operation amount) for the field device 200C, and a process of transmitting the generated output data to the target field device 200C.
- the data collected or generated in the field device 200C includes data on disturbances input to the field device 200C, and a control amount resulting from the actual operation of the field device 200C in accordance with the command value.
- the command value for the field device 200C is determined by adding a feedforward compensation value predicted from the disturbance by a prediction model to the operation amount provisionally calculated based on the error between the control target value (target value) calculated based on the control program executed by the control device 300 and the actual control amount.
- the field network 20 preferably employs a bus or network that performs periodic communication.
- Known examples of such buses or networks that perform periodic communication include EtherCAT (registered trademark), EtherNet/IP (registered trademark), DeviceNet (registered trademark), and CompoNet (registered trademark).
- the field device 200C includes an actuator that exerts some kind of physical action on a robot or conveyor in the field, and an input/output device that exchanges information with the field.
- the field device 200C includes a plurality of servo drivers 220_1 and 220_2, and a plurality of servo motors 222_1 and 222_2 connected to the plurality of servo drivers 220_1 and 220_2, respectively.
- the field device 200C is an example of a "controlled object."
- Servo drivers 220_1 and 220_2 drive corresponding servo motors of servo motors 222_1 and 222_2 according to command values (such as position command values or speed command values) from control device 300. In this way, control device 300 can control field device 200C.
- the control device 300 is also connected to other devices via a higher-level network 32.
- the higher-level network 32 is connected to the Internet 900, which is an external network, via a gateway 700.
- the higher-level network 32 may use Ethernet (registered trademark) or EtherNet/IP (registered trademark), which are common network protocols. More specifically, at least one server device 600 and at least one display device 500 may be connected to the higher-level network 32.
- the server device 600 may be a database system or a manufacturing execution system (MES).
- the manufacturing execution system acquires information from the controlled manufacturing device or facility to monitor and manage the entire production, and may also handle order information, quality information, shipping information, and the like.
- a device that provides an information system service may be connected to the higher-level network 32.
- An example of an information system service is processing that acquires information from the controlled manufacturing device or facility and performs macro or micro analysis.
- an example of an information system service is data mining that extracts some characteristic tendency contained in information from the controlled manufacturing device or facility, or a machine learning tool that performs machine learning based on information from the controlled facility or machine.
- the display device 500 receives operations from the user and outputs commands to the control device 300 in response to the user operations, and also graphically displays the results of calculations performed by the control device 300.
- the support device 400 can be connected to the control device 300.
- the support device 400 may be connected to the control device 300 via the higher-level network 32 or the Internet 900.
- the support device 400 is a device that assists the control device 300 in making the necessary preparations to control the control target.
- the support device 400 provides a development environment (program creation and editing tools, parsers, compilers, etc.) for programs executed by the control device 300, a setting environment for setting configuration information (configuration) for the control device 300 and various devices connected to the control device 300, a function for outputting the generated program to the control device 300, and a function for modifying and changing the program executed on the control device 300 online.
- the control device 300, the support device 400, and the display device 500 are each configured as separate entities, but a configuration may be adopted in which all or part of their functions are integrated into a single device.
- the control device 300 is not limited to being used at one production site, but may also be used at other production sites. It may also be used at multiple different lines at one production site.
- Fig. 9 is a block diagram showing an example of a hardware configuration of the control device 300 in Fig. 8.
- the control device 300 includes a processor 302, a main memory 304, a storage 360, a memory card interface 312, a host network controller 306, a field network controller 308, a local bus controller 316, and a Universal Serial Bus (USB) controller 370 that provides a USB interface.
- these components of the control device 300 are connected to one another via a processor bus 318.
- the processor 302 corresponds to an arithmetic processing unit that executes control calculations, and is composed of a CPU (Central Processing Unit) and the like.
- the processor 302 may also include a GPU (Graphics Processing Unit).
- the processor 302 reads out a program stored in the storage 360, expands it in the main memory 304, and executes it to realize control calculations for a control object.
- Main memory 304 is composed of a volatile storage device such as a dynamic random access memory (DRAM). Main memory 304 may also include a static random access memory (SRAM).
- Storage 360 is composed of a non-volatile storage device such as a solid state drive (SSD). Storage 360 may also include a hard disk drive (HDD).
- Storage 360 stores control program Pc, teacher data Dt, and prediction model Mp. Storage 360 corresponds to memory unit 140 in FIG. 1.
- Control program Pc includes a program for comprehensively controlling control device 300 and realizing each function of control device 300.
- processor 302 that executes control program Pc corresponds to the feedback control system (feedback control unit 110 and subtractor 150), feedforward control system (feedforward compensation unit 120 and subtractor 160), learning data generation unit 130, and learning unit 170 in FIG. 1.
- the memory card interface 312 accepts a memory card 314, which is an example of a removable storage medium.
- the memory card interface 312 is capable of reading and writing any data to the memory card 314.
- the host network controller 306 exchanges data with any information processing device connected to the higher-level network 32 (e.g., a local area network) via the higher-level network 32.
- the field network controller 308 exchanges data with any devices such as servo motors 222_1 and 222_2 via the field network 20.
- the local bus controller 316 exchanges data with any of the functional units 380 constituting the control device 300 via the local bus 122.
- the functional units 380 may, for example, be an analog I/O unit responsible for at least one of the input and output of analog signals, a digital I/O unit responsible for at least one of the input and output of digital signals, and a counter unit that receives pulses from an encoder or the like.
- the USB controller 370 exchanges data with any information processing device via a USB connection.
- the support device 400 is connected to the USB controller 370.
- the device, program, and control method according to the third embodiment can improve the accuracy of feedforward control of the control target.
- the feedback manipulated variable (rb) is determined by feedback control of the control device (100) of the controlled object (200) based on an error between a target value (qr) and a controlled variable (q) of the controlled object (200) so that the controlled variable (q) approaches the target value (qr);
- a value obtained by subtracting the feedforward compensation value (rf) from the feedback manipulated variable (rb) is output from the control device (100) to the controlled object (200) as a manipulated variable (r);
- the controlled object (200) is represented by a control element (210) represented as a mathematical model (131) that receives the manipulated variable (r), a disturbance element (220) that receives the disturbance (ds), and a dead time element (230),
- The device (100) described in configuration 1, further comprising a learning unit (170) that performs the machine learning on the prediction model (Mp) using the teacher data (Dt), wherein the learning unit (170) approximates the relationship between the measurement value (do) and the second estimated value (dh), as expressed by the prediction model (Mp), as a function having the second estimated value (dh) as a response variable and the measurement value (do) as an explanatory variable.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Feedback Control In General (AREA)
- Testing And Monitoring For Control Systems (AREA)
- Programmable Controllers (AREA)
Abstract
The present invention improves the accuracy of feedforward control for a control target. A training data generation unit (130): acquires a measurement value (do), an operation amount (r), and a control amount (q); generates a first estimation value (qwh) for a primary control amount (qw) from the operation amount (r) by using a mathematical model (131) of a control element (210); generates a second estimation value (dh) for a variation amount (d) on the basis of the control amount (q) and the first estimation value (qwh) at a point in time a prescribed period (Lh) before the point in time when the control amount (q) was acquired; and outputs teacher data (Dt) in which the second estimation value (dh) for the point in time of the measurement value (do) has been associated with the measurement value (do) as a correct answer.
Description
The present disclosure relates to a device for machine learning related to feedforward control of a control target, a program for generating teacher data, and a method for generating teacher data.
Control devices that perform feedforward control on a control target are known. For example, JP 10-222207 A (Patent Document 1) discloses a feedforward control device that uses a feedforward signal generator to control a specific process that is subject to disturbances. This feedforward control device includes a parameter learner that learns parameters required for calculations in the feedforward signal generator from parameters calculated in the feedforward signal generator.
The configuration of the feedforward compensator included in the feedforward signal generator disclosed in Patent Document 1 is predetermined. In general, a configuration for calculating a feedforward compensation value for feedback control such as a feedforward compensator is designed as an inverse model (inverse system) of the process to be controlled. Designing an inverse model of a process requires analysis of the process, so it is difficult to update the structure of the inverse model in a short period of time in response to fluctuations in disturbances and changes in the characteristics of the process. Therefore, with the feedforward control device disclosed in Patent Document 1, it may be difficult to suppress the effect of disturbances on the control of the controlled object depending on the degree of fluctuation in the disturbance or the degree of change in the characteristics of the process.
This disclosure has been made to solve the problems described above, and its purpose is to improve the accuracy of feedforward control of the controlled object.
An apparatus according to one aspect of the present disclosure is an apparatus for machine learning for a predictive model that predicts a specific value required to generate a feedforward compensation value for a feedback manipulated variable to a controlled object that is subject to a disturbance. The feedback manipulated variable is determined by feedback control of a control device for the controlled object, based on an error between a target value and the controlled variable of the controlled object, so that the controlled variable approaches a target value. A value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable. The controlled object is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element. The control element receives the manipulated variable and outputs a primary controlled variable. The disturbance element receives the disturbance and outputs a fluctuation amount. The dead time element delays the sum of the primary controlled variable and the fluctuation amount from being output to the outside of the controlled object. The predictive model predicts the fluctuation amount as a specific value from a measured value of the disturbance. The apparatus includes a learning data generating unit. 
The learning data generating unit acquires the measured value, the manipulated variable, and the controlled variable; generates a first estimated value of the primary controlled variable from the manipulated variable using the mathematical model of the control element; generates a second estimated value of the fluctuation amount based on the controlled variable and the first estimated value at a time that is a specific time earlier than the time at which the controlled variable was acquired; and outputs teacher data in which the measured value is associated, as the correct answer, with the second estimated value at the time at which the measured value was acquired.
According to this disclosure, since the teacher data associates appropriate correct answers with the measured values, the teacher data can be used to perform machine learning on a predictive model, thereby improving the accuracy of feedforward control of the control object.
In the above disclosure, the device further includes a learning unit that performs machine learning on the prediction model using teacher data. The learning unit approximates the relationship between the measurement value and the second estimated value, which is expressed by the prediction model, as a function with the second estimated value as the objective variable and the measurement value as the explanatory variable.
According to this disclosure, a predictive model can be adapted in real time to the measured values and characteristics of the controlled object in parallel with feedforward control.
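As an illustration of approximating the second estimated value as a function of the measured value, the following minimal sketch fits a straight line to teacher data by ordinary least squares. The linear form, the function name `fit_linear_model`, and the sample data are assumptions made for illustration only; the disclosure does not limit the prediction model to this form.

```python
def fit_linear_model(teacher):
    """Least-squares fit of dh ~ a * do + b (assumed linear stand-in model).

    teacher: list of (do, dh) pairs, where do is the explanatory variable
    and dh is the objective (response) variable.
    """
    n = len(teacher)
    mean_do = sum(do for do, _ in teacher) / n
    mean_dh = sum(dh for _, dh in teacher) / n
    sxx = sum((do - mean_do) ** 2 for do, _ in teacher)
    sxy = sum((do - mean_do) * (dh - mean_dh) for do, dh in teacher)
    a = sxy / sxx              # slope
    b = mean_dh - a * mean_do  # intercept
    return lambda do: a * do + b


# Hypothetical teacher data: (measured value do, second estimated value dh).
teacher = [(0.0, 0.0), (0.5, 1.5), (1.0, 3.0)]
predict = fit_linear_model(teacher)
print(predict(2.0))
```

Any regression method could take the place of the line fit, provided that it uses the measured value as the input and the second estimated value as the target.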
In the above disclosure, the device further includes a feedforward compensation unit that generates a feedforward compensation value from the amount of variation predicted by the prediction model using an inverse model of the mathematical model.
According to this disclosure, it is possible to make the control amount output from the controlled object after a specific time has elapsed closer to a normal value corresponding to the feedback manipulation amount at a time that is a specific time before the time of the control amount.
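The role of the inverse model can be illustrated with a minimal sketch that assumes the control element is a static gain, so that its inverse is a division by that gain. The gain values, the function name `feedforward_compensation`, and the lambda models are hypothetical and are not taken from the disclosure.

```python
def feedforward_compensation(do, predict, inverse_model):
    """Return rf: the inverse model applied to the predicted fluctuation amount."""
    d_pred = predict(do)          # prediction model: measured value -> fluctuation amount
    return inverse_model(d_pred)  # inverse of the mathematical model of the control element


# Assumed static models, for illustration only.
GAIN = 2.0        # control element:     qw = GAIN * r
DIST_GAIN = 3.0   # disturbance element: d  = DIST_GAIN * ds

do = 0.5          # measured disturbance (taken here to equal the true disturbance)
rb = 1.0          # feedback manipulated variable
rf = feedforward_compensation(
    do,
    predict=lambda x: DIST_GAIN * x,
    inverse_model=lambda qw: qw / GAIN,
)
r = rb - rf                       # manipulated variable sent to the controlled object
qo = GAIN * r + DIST_GAIN * do    # primary controlled variable plus fluctuation amount
print(rf, r, qo)
```

Under these assumptions the sum `qo` equals `GAIN * rb`, that is, the value the control element would output for the feedback manipulated variable alone if no disturbance were present.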
A teacher data generation program according to another aspect of the present disclosure is a program for machine learning for a predictive model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulation variable for a control object that is subject to a disturbance. The feedback manipulation variable is determined by feedback control of a control device for the control object so that the control variable approaches a target value based on an error between a target value and the control variable of the control object. A value obtained by subtracting the feedforward compensation value from the feedback manipulation variable is output from the control device to the control object as a manipulation variable. The control object is represented by a control element expressed as a mathematical model that receives the manipulation variable, a disturbance element that is subject to a disturbance, and a dead time element. The control element receives the manipulation variable and outputs a primary control variable. The disturbance element receives the disturbance and outputs a fluctuation variable. The dead time element delays the output of the sum of the primary control variable and the fluctuation variable to the outside of the control object. The predictive model predicts the fluctuation variable as a specific value from a measured value of the disturbance. 
When executed by a processor, the program acquires the measured value, the manipulation variable, and the control variable; generates a first estimated value of the primary control variable from the manipulation variable using the mathematical model of the control element; generates a second estimated value of the fluctuation variable based on the control variable and the first estimated value at a time that is a specific time earlier than the time at which the control variable was acquired; and outputs teacher data in which the measured value is associated, as the correct answer, with the second estimated value at the time at which the measured value was acquired.
According to this disclosure, since the teacher data associates an appropriate correct answer with the measured value, the teacher data can be used to perform machine learning on a predictive model, thereby improving the accuracy of feedforward control of a control target. Note that the invention disclosed herein can also be realized as a non-transitory computer-readable medium that stores the above program.
A method for generating teacher data according to another aspect of the present disclosure is a method for machine learning for a predictive model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulated variable for a controlled object that is subject to a disturbance. The feedback manipulated variable is determined by feedback control of a control device for the controlled object so that the controlled variable approaches a target value based on an error between a target value and the controlled variable of the controlled object. A value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable. The controlled object is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives a disturbance, and a dead time element. The control element receives the manipulated variable and outputs a primary controlled variable. The disturbance element receives the disturbance and outputs a fluctuation amount. The dead time element delays the sum of the primary controlled variable and the fluctuation amount from being output to the outside of the controlled object. The predictive model predicts the fluctuation amount as a specific value from a measured value of the disturbance. 
The method includes acquiring a measurement value, a manipulated variable, and a controlled variable, generating a first estimate of a primary controlled variable from the manipulated variable using a mathematical model of the control element, generating a second estimate of a fluctuation amount based on the controlled variable and the first estimate at the time when the controlled variable was acquired, and outputting training data in which the measurement value is associated with the second estimate at the time when the measurement value was acquired as a correct answer.
According to this disclosure, since the teacher data associates appropriate correct answers with the measured values, the teacher data can be used to perform machine learning on a predictive model, thereby improving the accuracy of feedforward control of the controlled object.
The device, program, and method disclosed herein can improve the accuracy of feedforward control of a control target.
The following describes the embodiments in detail with reference to the drawings. Note that the same or corresponding parts in the drawings are given the same reference numerals, and as a rule, their explanations will not be repeated.
[First embodiment]
<Application Examples>
Fig. 1 is a block diagram showing a functional configuration of a control device 100 according to embodiment 1. As shown in Fig. 1, the control device 100 includes a feedback control unit 110, a feedforward compensation unit 120, a learning data generation unit 130, a storage unit 140, a subtractor 150, a subtractor 160, and a learning unit 170.
The control device 100 outputs to the control object 200, as the manipulated variable r, a value obtained by subtracting the feedforward compensation value rf from the feedback manipulated variable rb, so that the controlled variable q, which is the output value of the control object 200 subjected to the disturbance ds, approaches the target value qr. The storage unit 140 stores the prediction model Mp and the teacher data Dt. The control device 100 and the control object 200 may be connected via a network (for example, the Internet or a cloud system) and located remotely from each other. Examples of machine learning algorithms for constructing the prediction model Mp include a deep binary tree and a support vector machine.
The disturbance ds is a quantity that disturbs the state of the control system including the control device 100 and the controlled object 200. The disturbance ds includes multiple disturbance elements, such as the amount of light, voltage, current, and temperature that are accidentally or suddenly input to the controlled object. The measured value of the disturbance ds is expressed as do.
The transmission of information (signals) in the controlled object 200 is represented by the control element 210, the disturbance element 220, the dead time element 230, and the adder 240. The control element 210 receives an operation amount r and outputs a primary control amount qw to the adder 240. The disturbance element 220 receives a disturbance ds and outputs a fluctuation amount d to the adder 240. The adder 240 adds the fluctuation amount d to the primary control amount qw and outputs a secondary control amount qo to the dead time element 230. The dead time element 230 delays the output of the secondary control amount qo to the outside of the controlled object 200. Specifically, the dead time element 230 outputs the secondary control amount qo as the control amount q to the outside of the controlled object 200 after the lapse of dead time L after receiving the secondary control amount qo.
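The signal flow described above can be sketched as a discrete-time simulation. This is a minimal sketch under stated assumptions: the dead time L is expressed as a whole number of samples (`dead_steps`), the delay line starts filled with zeros, and the control element and disturbance element are passed in as plain functions. None of these names or values come from the disclosure.

```python
from collections import deque


def simulate_plant(r_seq, ds_seq, control_element, disturbance_element, dead_steps):
    """Return the controlled variable q for each step of a discrete-time plant."""
    # The delay line holds secondary control amounts qo that have not yet left
    # the plant; it is pre-filled with zeros (assumed initial state).
    delay_line = deque([0.0] * dead_steps)
    q_seq = []
    for r, ds in zip(r_seq, ds_seq):
        qw = control_element(r)       # control element 210: primary control amount
        d = disturbance_element(ds)   # disturbance element 220: fluctuation amount
        qo = qw + d                   # adder 240: secondary control amount
        delay_line.append(qo)         # dead time element 230 delays qo
        q_seq.append(delay_line.popleft())
    return q_seq


# Hypothetical static gains and a single disturbance pulse at step 1.
q = simulate_plant(
    r_seq=[1.0, 1.0, 1.0, 1.0, 1.0],
    ds_seq=[0.0, 0.5, 0.0, 0.0, 0.0],
    control_element=lambda r: 2.0 * r,
    disturbance_element=lambda ds: 3.0 * ds,
    dead_steps=2,
)
print(q)  # [0.0, 0.0, 2.0, 3.5, 2.0]
```

The disturbance pulse applied at step 1 appears in the output at step 3, which is two samples later, in line with the assumed dead time.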
Hereinafter, the configuration including the feedback control unit 110 and the subtractor 150 is also referred to as a feedback control system, and the configuration including the feedforward compensation unit 120 and the subtractor 160 is also referred to as a feedforward control system.
The subtractor 150 outputs the error eq (= qr - q) between the target value qr and the control amount q to the feedback control unit 110. The feedback control unit 110 determines the feedback operation amount rb based on the error eq and outputs it to the subtractor 160.
The learning data generation unit 130 generates teacher data Dt used in machine learning for the prediction model Mp. The learning data generation unit 130 acquires a measurement value do, an operation amount r, and a control amount q at each of a plurality of timings. The learning data generation unit 130 estimates the fluctuation amount d, as an estimated fluctuation amount dh, from the operation amount r and the control amount q. The learning data generation unit 130 outputs a combination of the measurement value do (explanatory variable) and the estimated fluctuation amount dh (objective variable) to the storage unit 140 as teacher data Dt. The learning data generation unit 130 may instead output the teacher data Dt directly to the learning unit 170.
The fluctuation amount d corresponding to the measured value do at time t1 is added to the primary control amount qw at time t1 to become the secondary control amount qo. However, the secondary control amount qo is output from the controlled object 200 as the control amount q only at time t2 (= t1 + L), when the dead time L has elapsed from time t1. Therefore, the learning data generation unit 130 associates the measured value do at time t1 with the control amount q at time t2, not with the control amount q at time t1. As a result, in the teacher data Dt output from the learning data generation unit 130, the correct answer for each measured value do is the estimated fluctuation amount dh for the same time at which that measured value do was measured. In other words, in the teacher data Dt generated by the learning data generation unit 130, the correspondence between the explanatory variable and the objective variable is set correctly. According to the control device 100, machine learning can be performed on the prediction model Mp using teacher data Dt having an appropriate correspondence, so the prediction accuracy of the prediction model Mp can be improved. As a result, the accuracy of the feedforward control of the controlled object 200 performed using the trained prediction model Mp can be improved.
The transmission of information in the learning data generation unit 130 is represented by a mathematical model 131, a time converter 132, a subtractor 133, and an output generator 134. The mathematical model 131 is estimated as a correspondence between the operation amount r and the primary control amount qw, and is represented as a function with the operation amount r as an argument. Below, the mathematical model 131 is also represented as Gwh(r).
The mathematical model 131 receives the operation amount r and outputs an estimated primary control amount qwh (first estimated value) to the subtractor 133. The estimated primary control amount qwh is an estimate of the primary control amount qw. The time converter 132 moves the measurement time t10 of the control amount q back by an estimated dead time Lh (specific time) and outputs the result as an estimated secondary control amount qoh to the subtractor 133. The estimated dead time Lh is an estimate of the dead time L. The estimated secondary control amount qoh is an estimate of the secondary control amount qo. The mathematical model 131 and the estimated dead time Lh can be determined by physical modeling or system identification based on general design information for feedback control of the controlled object 200. For example, the feedback control unit 110, placed in an open-loop state, may apply a step signal to the controlled object 200, and the gain, time constant, and dead time may be calculated from the response data of the control amount. From these results, the PID (Proportional-Integral-Differential) parameters for feedback control can be determined by the Ziegler-Nichols method, the CHR (Chien-Hrones-Reswick) method, or the like.
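As a rough illustration of the step-response procedure mentioned above, the sketch below estimates a gain, time constant, and dead time from a recorded open-loop step response and then applies the classical Ziegler-Nichols open-loop (reaction curve) PID rules. It assumes the plant behaves like a first-order lag with dead time, that the response starts at zero and has settled by the last sample, and that a simple threshold and the 63.2 % point are adequate estimators. The function names and the `threshold` parameter are hypothetical and are not taken from the source.

```python
import math

def identify_fopdt(t, y, step_amplitude, threshold=0.02):
    """Estimate gain K, time constant T, and dead time L from a step response.

    Assumes y starts at 0 and has settled by the last sample.
    """
    y_final = y[-1]
    K = y_final / step_amplitude
    # Dead time: first time the response exceeds a small fraction of its final value.
    L = next(ti for ti, yi in zip(t, y) if abs(yi) > threshold * abs(y_final))
    # Time constant: time to reach 63.2 % of the final value, minus the dead time.
    t63 = next(ti for ti, yi in zip(t, y) if abs(yi) >= 0.632 * abs(y_final))
    return K, t63 - L, L

def ziegler_nichols_pid(K, T, L):
    """Classical Ziegler-Nichols open-loop tuning rules for a PID controller."""
    return {"Kp": 1.2 * T / (K * L), "Ti": 2.0 * L, "Td": 0.5 * L}

# Synthetic unit-step response of an assumed plant with K = 2, T = 5, L = 1.
t = [0.01 * k for k in range(6001)]
y = [2.0 * (1.0 - math.exp(-(ti - 1.0) / 5.0)) if ti >= 1.0 else 0.0 for ti in t]

K, T, L = identify_fopdt(t, y, step_amplitude=1.0)
pid = ziegler_nichols_pid(K, T, L)
```

The threshold-based dead time estimate is slightly biased toward larger values, so in practice the estimated dead time Lh would likely be checked against the plant's design information.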
The fluctuation amount d at time t11 is the value obtained by subtracting the primary control amount qw at time t11 from the secondary control amount qo at time t11. The secondary control amount qo at time t11 is output as the control amount q at time t12, when the dead time L has elapsed from time t11. The time converter 132 therefore treats the control amount q acquired at time t12 as the estimated secondary control amount qoh at time t11, which is time t12 moved back by the estimated dead time Lh. The value obtained by subtracting the estimated primary control amount qwh at time t11 from the estimated secondary control amount qoh at time t11 is the estimated fluctuation amount dh (second estimated value), which is an estimate of the fluctuation amount d at time t11.
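The time relationships in this paragraph can be written compactly as follows. The notation with explicit time arguments is introduced here only for illustration. The symbols are those of the description, and the last line assumes that the estimated primary control amount is obtained from the mathematical model Gwh.

```latex
\begin{align}
d(t_{11}) &= q_o(t_{11}) - q_w(t_{11}) \\
q(t_{12}) &= q_o(t_{11}), \qquad t_{12} = t_{11} + L \\
q_{oh}(t_{11}) &= q(t_{11} + L_h) \\
d_h(t_{11}) &= q_{oh}(t_{11}) - q_{wh}(t_{11}), \qquad q_{wh}(t_{11}) = G_{wh}\bigl(r(t_{11})\bigr)
\end{align}
```

When Lh equals L and Gwh matches the control element 210 exactly, the last line reduces to dh(t11) = d(t11).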
The subtractor 133 subtracts, from the estimated secondary control amount qoh, the estimated primary control amount qwh at the same time, and outputs the resulting estimated fluctuation amount dh to the output generator 134. The output generator 134 outputs the combination of the measured value do and the estimated fluctuation amount dh at the same time as that measured value do to the storage unit 140 as the teacher data Dt.
The learning unit 170 uses multiple pieces of teacher data Dt to approximate the relationship between the measured value do and the estimated fluctuation amount dh as a function (regression curve or regression surface) with the estimated fluctuation amount dh as the objective variable and the measured value do as the explanatory variable. The prediction model Mp includes this function. When machine learning (initial learning or additional learning) for the prediction model Mp has been completed and the characteristics of the controlled object 200 have changed, the learning unit 170 resumes machine learning (additional learning) for the prediction model Mp. The characteristics of the controlled object 200 include, for example, the correspondence that maps the measured value do and the operation amount r to the control amount q.
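The regression step can be sketched as follows. The source does not specify the form of the regression function, so this example assumes a plane (linear least squares with an intercept) over a two-element measured value (do1, do2). The function names `fit_prediction_model` and `predict` are hypothetical.

```python
import numpy as np

def fit_prediction_model(do, dh):
    """Fit dh ~ a*do1 + b*do2 + c by least squares (an assumed model form).

    do: array of shape (N, 2) holding the measured values (explanatory variable).
    dh: array of shape (N,) holding the estimated fluctuation amounts (objective variable).
    Returns the coefficient vector [a, b, c].
    """
    do = np.asarray(do, dtype=float)
    dh = np.asarray(dh, dtype=float)
    X = np.column_stack([do, np.ones(len(do))])  # append the intercept column
    coef, *_ = np.linalg.lstsq(X, dh, rcond=None)
    return coef

def predict(coef, do):
    """Evaluate the fitted regression surface at the measured values do."""
    do = np.atleast_2d(np.asarray(do, dtype=float))
    return do @ coef[:-1] + coef[-1]

# Synthetic teacher data lying on an assumed plane, used only to exercise the fit.
rng = np.random.default_rng(0)
do_samples = rng.uniform(-1.0, 1.0, size=(50, 2))
dh_samples = 0.8 * do_samples[:, 0] - 0.3 * do_samples[:, 1] + 0.1
coef = fit_prediction_model(do_samples, dh_samples)
```

Additional learning after a change in the plant characteristics would correspond to calling the fit again with newly generated teacher data. A nonlinear regressor could be substituted without changing the surrounding data flow.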
FIG. 2 shows an example of the distribution of multiple pieces of teacher data Dt together with an example of the prediction model Mp as a regression surface. FIG. 2 illustrates the case where the measured value do includes multiple elements (dimensions) do1 and do2, and Gwh(r) = 1.23 × r.
Referring again to FIG. 1, the feedforward compensation unit 120 uses the prediction model Mp to obtain the estimated fluctuation amount dh from the measurement value do. The feedforward compensation unit 120 includes an inverse model 121. The inverse model 121 is configured as the inverse function Gwh^-1(qwh) of the mathematical model 131, which takes the estimated primary control amount qwh as an argument and outputs the operation amount r. The estimated fluctuation amount dh from the prediction model Mp is input to the inverse model 121.
The purpose of feedforward control is to suppress the disturbance of the primary control amount qw caused by the disturbance ds. Therefore, the feedforward compensation unit 120 obtains from the inverse model 121, as the feedforward compensation value rf, the operation amount that, when input to the control element 210, makes the output of the control element 210 equal to the estimated fluctuation amount dh. The subtractor 160 subtracts the feedforward compensation value rf from the feedback operation amount rb and outputs the operation amount r (= rb - rf) to the control element 210. In the primary control amount qw corresponding to the operation amount r, the feedforward compensation value rf has already offset the increase that the fluctuation amount d will cause. Therefore, even when the fluctuation amount d is added to the primary control amount qw, the disturbance due to the fluctuation amount d is canceled in the secondary control amount qo. As a result, the control amount q output from the controlled object 200 after the dead time L has elapsed approaches the normal value corresponding to the feedback operation amount rb at the time that precedes the time of the control amount q by the dead time L.
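The cancellation argument above can be checked numerically with a minimal sketch. It assumes the linear example model Gwh(r) = 1.23 × r from FIG. 2, a control element 210 that matches this model exactly, and a perfect prediction (dh equal to d). The function names are hypothetical.

```python
GAIN = 1.23  # Gwh(r) = 1.23 x r, the example model given for FIG. 2

def gwh(r):
    """Mathematical model 131: operation amount r -> estimated primary control amount qwh."""
    return GAIN * r

def gwh_inverse(qwh):
    """Inverse model 121: estimated primary control amount -> operation amount."""
    return qwh / GAIN

def operation_amount(rb, dh):
    """Subtractor 160: r = rb - rf, where rf = Gwh^-1(dh)."""
    rf = gwh_inverse(dh)  # feedforward compensation value
    return rb - rf

# Assumed values: feedback operation amount rb = 2.0, fluctuation amount d = 0.5.
rb, d = 2.0, 0.5
r = operation_amount(rb, dh=d)  # perfect prediction assumed: dh == d
qo = gwh(r) + d                 # adder 240 with the control element assumed equal to Gwh
```

Under these assumptions qo equals Gwh(rb), that is, the value the plant would have produced without any disturbance. With an imperfect prediction, the residual dh - d remains in qo and is left for the feedback control system to correct.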
According to the control device 100, since an appropriate correct answer is associated with the measurement value do in the teacher data Dt, the accuracy of feedforward control for the control object 200 can be improved by performing machine learning on the prediction model Mp using the teacher data Dt. Furthermore, according to the control device 100, in parallel with the feedforward control, the prediction model Mp can be adapted in real time to the measurement value do and the characteristics of the control object 200. Furthermore, since machine learning is continued until the accuracy of the prediction model Mp becomes sufficiently high, the accuracy of the feedforward control for the control object 200 can be sufficiently improved. Furthermore, since the prediction model Mp is re-adapted to the characteristics in response to changes in the characteristics of the control object 200, a decrease in the accuracy of the feedforward control due to changes in the characteristics of the control object 200 can be suppressed.
FIG. 3 is a diagram showing a flowchart illustrating the flow of processing performed by each of the feedback control system, the feedforward control system, the learning data generation unit 130, and the learning unit 170 in FIG. 1. The routines corresponding to the flowcharts of the feedback control system and the feedforward control system are executed, for example, at each sampling time. The routines corresponding to the flowcharts of the learning data generation unit 130 and the learning unit 170 are executed, for example, in response to the first execution of the routines corresponding to the flowcharts of the feedforward control system. Below, steps are simply abbreviated as S.
As shown in FIG. 3, in S111, the subtractor 150 calculates the error eq between the target value qr and the control amount q, and outputs it to the feedback control unit 110. In S112, the feedback control unit 110 determines the feedback operation amount rb based on the error eq, outputs it to the subtractor 160, and ends the process.
In S121, the feedforward compensation unit 120 determines the feedforward compensation value rf from the measurement value do and outputs it to the subtractor 160. In S122, the subtractor 160 outputs the difference obtained by subtracting the feedforward compensation value rf from the feedback operation amount rb, as the operation amount r, to the controlled object 200 and the learning data generation unit 130, and ends the process.
The learning data generation unit 130 generates the teacher data Dt in S130 and ends the process. The learning unit 170 performs machine learning on the prediction model Mp using the teacher data Dt in S170 and ends the process.
FIG. 4 is a flowchart showing a specific process flow of the learning data generation process S130 in FIG. 3. FIG. 5 is a diagram for explaining the correspondence between the time of the measured value do, the operation amount r, the control amount q, and the estimated primary control amount qwh in FIG. 4, and the time of the estimated secondary control amount qoh, which is set back from the time of the control amount q by the estimated dead time Lh. FIG. 5 shows a case where multiple measured values do, multiple operation amounts r, and multiple control amounts q are obtained during the trial period from time t21 to t24. Time t22 is the time (t21+Lh) that is the estimated dead time Lh after time t21. Time t23 (>t22) is the time (t24-Lh) that is the estimated dead time Lh back from time t24.
Referring to FIG. 4 and FIG. 5 together, in S131, the learning data generation unit 130 acquires the measurement value do, the operation amount r, and the control amount q at each of the multiple timings, and proceeds to S132. In S132, the learning data generation unit 130 obtains the estimated primary control amount qwh from the operation amount r using the mathematical model 131, and proceeds to S133. In S133, the learning data generation unit 130 moves the time of the control amount q back by the estimated dead time Lh to generate the estimated secondary control amount qoh, and proceeds to S134. As shown in FIG. 5, by the process of S133, the multiple control amounts q included in times t22 to t23 acquired in S131 are shifted as a whole backward in time by the estimated dead time Lh.
Referring again to FIG. 4, in S134, the learning data generation unit 130 subtracts, from the estimated secondary control amount qoh, the estimated primary control amount qwh at the time of that estimated secondary control amount qoh to generate the estimated fluctuation amount dh, and proceeds to S135. In S135, the learning data generation unit 130 sets the correct answer for each measured value do to the estimated fluctuation amount dh at the time of that measured value do, thereby generating the teacher data Dt, and ends the process.
As shown in FIG. 5, among the multiple control amounts q acquired during the trial period from time t21 to t24, those acquired from time t21 to t22 are moved to before time t21 by the process of S133. No corresponding estimated primary control amount qwh exists before time t21. Therefore, the control amounts q acquired from time t21 to t22 are not used in generating the teacher data Dt. Likewise, no estimated secondary control amount qoh or estimated fluctuation amount dh exists for times t23 to t24. Therefore, the measurement values do, operation amounts r, and estimated primary control amounts qwh acquired during the period from time t23 to t24 are not used in generating the teacher data Dt.
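Steps S131 to S135, including the discarded samples at both ends of the trial period, can be sketched with array slicing. This is a minimal sketch assuming uniformly sampled data and an estimated dead time Lh equal to a whole number of samples. The name `generate_teacher_data` and the parameter `lh_steps` are hypothetical.

```python
import numpy as np

def generate_teacher_data(do, r, q, gwh, lh_steps):
    """Build teacher data (do, dh) from samples taken over one trial period.

    do, r, q : equally long sequences sampled at the same timings (S131).
    gwh      : mathematical model 131, mapping r to qwh.
    lh_steps : estimated dead time Lh in samples (assumed to be an integer).
    """
    do = np.asarray(do, dtype=float)
    r = np.asarray(r, dtype=float)
    q = np.asarray(q, dtype=float)
    if not 0 <= lh_steps < len(q):
        raise ValueError("lh_steps must lie within the trial period")
    qwh = gwh(r)                # S132: estimated primary control amount
    qoh = q[lh_steps:]          # S133: q at index k + lh_steps becomes qoh at index k
    n = len(qoh)                # the first lh_steps samples of q are discarded
    dh = qoh - qwh[:n]          # S134: estimated fluctuation amount
    return do[:n], dh           # S135: the last lh_steps samples of do are discarded

# Assumed example plant: qw = 1.23 * r with a dead time of 2 samples.
r = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
d = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])   # true fluctuation amounts
do = 10.0 * d                                   # assumed disturbance measurements
q = np.concatenate([[0.0, 0.0], (1.23 * r + d)[:4]])
do_used, dh = generate_teacher_data(do, r, q, lambda x: 1.23 * x, lh_steps=2)
```

Here the first two samples of q and the last two samples of do and r are dropped, mirroring the unused intervals t21 to t22 and t23 to t24, and the recovered dh matches the true fluctuation amounts for the remaining samples.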
[First Modification of First Embodiment]
In the first embodiment, a configuration including both a feedback control system and a feedforward control system has been described. In the first modification of the first embodiment, a configuration not including a feedback control system will be described.
FIG. 6 is a block diagram showing the functional configuration of a control device 100A according to the first modification of the first embodiment. The control device 100A has the configuration of the control device 100 in FIG. 1 with the subtractor 150 and the feedback control unit 110 removed. The rest of the configuration is the same, so its description is not repeated. Note that the subtractor 160 need not be included in the control device 100A.
As shown in FIG. 6, the control device 100A determines the feedforward compensation value rf for the feedback operation amount rb applied to the controlled object 200 so that the control amount q of the controlled object 200 receiving the measurement value do approaches the target value qr. According to the control device 100A, by adding the control device to an existing feedback control system while leaving that system in place, the existing feedback control system can be easily expanded into a control system that includes a feedforward control system and a learning function.
As described above, the device and method according to the first embodiment and the first modification can improve the accuracy of feedforward control of the control target.
[Embodiment 2]
In the first embodiment, the case where a feedback control system, a feedforward control system, and a configuration for performing machine learning on a prediction model are included in one control device has been described. In the second embodiment, a configuration will be described in which the feedback control system, the feedforward control system, and the configuration for performing machine learning on a prediction model are divided among separate devices.
FIG. 7 is a block diagram showing the functional configuration of a control system 2 according to embodiment 2. In FIG. 7, components with the same reference symbols as those in FIG. 1 have the same functions as the components identified by the reference symbols described in embodiment 1, and therefore the description of the similar components will not be repeated.
As shown in FIG. 7, the control system 2 includes a feedback control device 11, a feedforward compensation device 12, a learning data generation device 13, a storage device 14, and a learning device 17. The learning data generation device 13, the storage device 14, and the learning device 17 correspond to the learning data generation unit 130, the storage unit 140, and the learning unit 170 in FIG. 1, respectively. The feedback control device 11 includes a feedback control unit 110 and a subtractor 150. The feedforward compensation device 12 includes a feedforward compensation unit 120 and a subtractor 160. The feedback control device 11, the feedforward compensation device 12, the learning data generation device 13, the storage device 14, the learning device 17, and the control target 200 may be connected to each other via a network and may be located remotely from each other. The subtractor 160 may be included in the feedback control device 11, not in the feedforward compensation device 12.
According to control system 2, an existing control system can be easily expanded by adding a feedforward compensation device, a learning data generation device, and a learning device to the existing feedback control device while leaving the existing feedback control device in place.
As described above, the device and method according to the second embodiment can improve the accuracy of feedforward control of the control target.
[Embodiment 3]
In the third embodiment, as an example of the control device according to the first embodiment, a configuration in which the control device includes a PLC (Programmable Logic Controller) will be described.
<Control system network configuration example>
FIG. 8 is a schematic diagram showing an example of a network configuration of a control system 3 according to the third embodiment. As shown in FIG. 8, the control system 3 includes a device group in which a plurality of devices are configured to be able to communicate with each other. Typically, the devices may include a control device 300, which is a processing entity that executes a control program, and peripheral devices connected to the control device 300. The control device 300 has a functional configuration similar to that of the control device 100 shown in FIG. 1.
The control device 300 corresponds to an industrial controller that controls control targets such as various facilities or devices. The control device 300 is a type of computer that executes control calculations, and typically includes a PLC (Programmable Logic Controller). The control device 300 is connected to a field device 200C via a field network 20. The control device 300 exchanges data with at least one field device 200C via the field network 20.
The control calculations executed in the control device 300 include a process of collecting data collected or generated in the field device 200C, a process of generating data such as a command value (operation amount) for the field device 200C, and a process of transmitting the generated output data to the target field device 200C. The data collected or generated in the field device 200C includes data on disturbances input to the field device 200C, and a control amount resulting from the actual operation of the field device 200C in accordance with the command value. The command value for the field device 200C is determined by adding a feedforward compensation value predicted from the disturbance by a prediction model to the operation amount provisionally calculated based on the error between the control target value (target value) calculated based on the control program executed by the control device 300 and the actual control amount.
The field network 20 preferably employs a bus or network that performs periodic communication. Known examples of such buses or networks that perform periodic communication include EtherCAT (registered trademark), EtherNet/IP (registered trademark), DeviceNet (registered trademark), and CompoNet (registered trademark). EtherCAT (registered trademark) is preferable because it guarantees the arrival time of data.
Any field device 200C can be connected to the field network 20. The field device 200C includes an actuator that exerts some kind of physical action on a robot or conveyor in the field, and an input/output device that exchanges information with the field.
In the control system 3, the field device 200C includes a plurality of servo drivers 220_1 and 220_2, and a plurality of servo motors 222_1 and 222_2 connected to the plurality of servo drivers 220_1 and 220_2, respectively. The field device 200C is an example of a "controlled object."
Servo drivers 220_1 and 220_2 drive corresponding servo motors of servo motors 222_1 and 222_2 according to command values (such as position command values or speed command values) from control device 300. In this way, control device 300 can control field device 200C.
The control device 300 is also connected to other devices via a higher-level network 32. The higher-level network 32 is connected to the Internet 900, which is an external network, via a gateway 700. The higher-level network 32 may use Ethernet (registered trademark) or EtherNet/IP (registered trademark), which are common network protocols. More specifically, at least one server device 600 and at least one display device 500 may be connected to the higher-level network 32.
The server device 600 may be a database system or a manufacturing execution system (MES). The manufacturing execution system acquires information from the controlled manufacturing device or facility to monitor and manage the entire production, and may also handle order information, quality information, shipping information, and the like. In addition to these, a device that provides an information system service may be connected to the upper network 32. An example of an information system service is processing that acquires information from the controlled manufacturing device or facility and performs macro or micro analysis. For example, an example of an information system service is data mining that extracts some characteristic tendency contained in information from the controlled manufacturing device or facility, or a machine learning tool that performs machine learning based on information from the controlled facility or machine.
表示装置500は、ユーザからの操作を受けて、制御装置300に対してユーザ操作に応じたコマンドなどを出力するとともに、制御装置300での演算結果などをグラフィカルに表示する。
The display device 500 receives operations from the user and outputs commands to the control device 300 in response to the user operations, and also graphically displays the results of calculations performed by the control device 300.
制御装置300には、サポート装置400が接続可能になっている。サポート装置400は、上位ネットワーク32またはインターネット900を介して制御装置300に接続されてもよい。サポート装置400は、制御装置300が制御対象を制御するために必要な準備を支援する装置である。具体的には、サポート装置400は、制御装置300で実行されるプログラムの開発環境(プログラム作成編集ツール、パーサ、およびコンパイラなど)、制御装置300および制御装置300に接続される各種デバイスの構成情報(コンフィギュレーション)を設定するための設定環境、生成したプログラムを制御装置300へ出力する機能、および制御装置300上で実行されるプログラムなどをオンラインで修正および変更を行う機能などを提供する。
The support device 400 can be connected to the control device 300. The support device 400 may be connected to the control device 300 via the higher-level network 32 or the Internet 900. The support device 400 is a device that assists the control device 300 in making the necessary preparations to control the control target. Specifically, the support device 400 provides a development environment (program creation and editing tools, parsers, compilers, etc.) for programs executed by the control device 300, a setting environment for setting configuration information (configuration) for the control device 300 and various devices connected to the control device 300, a function for outputting the generated program to the control device 300, and a function for modifying and changing the program executed on the control device 300 online.
制御システム3においては、制御装置300、サポート装置400、および表示装置500がそれぞれ別体として構成されているが、これらの機能の全部または一部を単一の装置に集約するような構成が採用されてもよい。
In the control system 3, the control device 300, the support device 400, and the display device 500 are each configured as separate entities, but a configuration may be adopted in which all or part of these functions are integrated into a single device.
制御装置300は、一の生産現場のみで使用される場合に限らず、他の生産現場においても使用される。また、一の生産現場内においても複数の異なるラインで使用される場合もある。
The control device 300 is not limited to use at a single production site and may also be used at other production sites. It may also be used on multiple different lines within one production site.
<制御装置のハードウェア構成例>
<Example of hardware configuration of control device>
図9は、図8の制御装置300のハードウェア構成例を示すブロック図である。図7に示されるように、制御装置300は、プロセッサ302と、メインメモリ304と、ストレージ360と、メモリカードインターフェイス312と、上位ネットワークコントローラ306と、フィールドネットワークコントローラ308と、ローカルバスコントローラ316と、USB(Universal Serial Bus)インターフェイスを提供するUSBコントローラ370とを含む。これらのコンポーネントは、プロセッサバス318を介して接続されている。
Fig. 9 is a block diagram showing an example of a hardware configuration of the control device 300 in Fig. 8. As shown in Fig. 7, the control device 300 includes a processor 302, a main memory 304, a storage 360, a memory card interface 312, a higher-level network controller 306, a field network controller 308, a local bus controller 316, and a Universal Serial Bus (USB) controller 370 that provides a USB interface. These components are connected via a processor bus 318.
図9に示されるように、プロセッサ302は、制御演算を実行する演算処理部に相当し、CPU(Central Processing Unit)などで構成される。プロセッサ302は、GPU(Graphics Processing Unit)を含んでいてもよい。具体的には、プロセッサ302は、ストレージ360に保存されたプログラムを読み出して、メインメモリ304に展開して実行することで、制御対象に対する制御演算を実現する。
As shown in FIG. 9, the processor 302 corresponds to an arithmetic processing unit that executes control calculations, and is composed of a CPU (Central Processing Unit) and the like. The processor 302 may also include a GPU (Graphics Processing Unit). Specifically, the processor 302 reads out a program stored in the storage 360, expands it in the main memory 304, and executes it to realize control calculations for a control object.
メインメモリ304は、DRAM(Dynamic Random Access Memory)などの揮発性記憶装置などで構成される。メインメモリ304は、SRAM(Static Random Access Memory)を含んでいてもよい。ストレージ360は、たとえば、SSD(Solid State Drive)などの不揮発性記憶装置などで構成される。ストレージ360は、HDD(Hard Disk Drive)を含んでいてもよい。
Main memory 304 is composed of a volatile storage device such as a dynamic random access memory (DRAM). Main memory 304 may also include a static random access memory (SRAM). Storage 360 is composed of a non-volatile storage device such as a solid state drive (SSD). Storage 360 may also include a hard disk drive (HDD).
ストレージ360には、制御プログラムPcと、教師データDtと、予測モデルMpとが保存されている。ストレージ360は、図1の記憶部140に対応する。制御プログラムPcは、制御装置300を統合的に制御して、制御装置300の各機能を実現するためのプログラムを含む。すなわち、制御プログラムPcを実行するプロセッサ302が、図1のフィードバック制御系(フィードバック制御部110および減算器150)、フィードフォワード制御系(フィードフォワード補償部120および減算器160)、ならびに学習データ生成部130および学習部170に対応する。
Storage 360 stores control program Pc, teacher data Dt, and prediction model Mp. Storage 360 corresponds to memory unit 140 in FIG. 1. Control program Pc includes a program for comprehensively controlling control device 300 and realizing each function of control device 300. In other words, processor 302 that executes control program Pc corresponds to the feedback control system (feedback control unit 110 and subtractor 150), feedforward control system (feedforward compensation unit 120 and subtractor 160), learning data generation unit 130, and learning unit 170 in FIG. 1.
メモリカードインターフェイス312は、着脱可能な記憶媒体の一例であるメモリカード314を受け付ける。メモリカードインターフェイス312は、メモリカード314に対して任意のデータの読み書きが可能になっている。
The memory card interface 312 accepts a memory card 314, which is an example of a removable storage medium. The memory card interface 312 is capable of reading and writing any data to the memory card 314.
上位ネットワークコントローラ306は、上位ネットワーク32(たとえばローカルエリアネットワーク)を介して、上位ネットワーク32に接続された任意の情報処理装置との間でデータを遣り取りする。
The higher-level network controller 306 exchanges data, via the higher-level network 32 (e.g., a local area network), with any information processing device connected to the higher-level network 32.
フィールドネットワークコントローラ308は、フィールドネットワーク20を介して、サーボモータ222_1、および222_2等の任意のデバイスとの間でデータを遣り取りする。
The field network controller 308 exchanges data with any devices such as servo motors 222_1 and 222_2 via the field network 20.
ローカルバスコントローラ316は、ローカルバス122を介して、制御装置300を構成する任意の機能ユニット380との間でデータを遣り取りする。機能ユニット380は、たとえば、アナログ信号の入力および出力の少なくとも一方を担当するアナログI/Oユニット、デジタル信号の入力および出力の少なくとも一方を担当するデジタルI/Oユニット、ならびにエンコーダなどからのパルスを受け付けるカウンタユニットなどからなる。
The local bus controller 316 exchanges data with any of the functional units 380 constituting the control device 300 via the local bus 122. The functional units 380 may, for example, be an analog I/O unit responsible for at least one of the input and output of analog signals, a digital I/O unit responsible for at least one of the input and output of digital signals, and a counter unit that receives pulses from an encoder or the like.
USBコントローラ370は、USB接続を介して、任意の情報処理装置との間でデータを遣り取りする。USBコントローラ370には、たとえばサポート装置400が接続される。
The USB controller 370 exchanges data with any information processing device via a USB connection. For example, the support device 400 is connected to the USB controller 370.
以上、実施の形態3に係る装置、プログラム、および制御方法によれば、制御対象に対するフィードフォワード制御の精度を向上させることができる。
As described above, the device, program, and control method according to the third embodiment can improve the accuracy of feedforward control of the control target.
<付記>
<Additional Notes>
上記したような本実施の形態は、以下のような技術思想を含む。
The present embodiment as described above includes the following technical ideas.
[構成1]
外乱(ds)を受ける制御対象(200)へのフィードバック操作量(rb)のフィードフォワード補償値(rf)の生成に必要な特定値を予測する予測モデル(Mp)に対する機械学習のための装置(100)であって、
前記フィードバック操作量(rb)は、目標値(qr)と前記制御対象(200)の制御量(q)との誤差に基づいて、前記制御量(q)が前記目標値(qr)に近づくように前記制御対象(200)の制御装置(100)のフィードバック制御によって決定され、
前記フィードバック操作量(rb)から前記フィードフォワード補償値(rf)を引いた値が、操作量(r)として前記制御対象(200)に前記制御装置(100)から出力され、
前記制御対象(200)は、前記操作量(r)を受ける数式モデル(131)として表現された制御要素(210)と、前記外乱(ds)を受ける外乱要素(220)と、むだ時間要素(230)とによって表現され、
前記制御要素(210)は、前記操作量(r)を受けて一次制御量(qw)を出力し、
前記外乱要素(220)は、前記外乱(ds)を受けて変動量(d)を出力し、
前記むだ時間要素(230)は、前記一次制御量(qw)と前記変動量(d)との和が前記制御対象(200)の外部に出力されることを遅延させ、
前記予測モデル(Mp)は、前記外乱(ds)の計測値(do)から前記変動量(d)を前記特定値として予測し、
前記装置(100)は、学習データ生成部(130)を備え、
前記学習データ生成部(130)は、
前記計測値(do)、前記操作量(r)、および前記制御量(q)を取得し、
前記数式モデル(131)を用いて前記操作量(r)から前記一次制御量(qw)の第1推定値(qwh)を生成し、
前記制御量(q)と、前記制御量(q)が取得された時刻から特定時間(Lh)だけ戻された時刻の前記第1推定値(qwh)とに基づいて前記変動量(d)の第2推定値(dh)を生成し、
前記計測値(do)に、前記計測値(do)が取得された時刻の前記第2推定値(dh)を正解として対応させた教師データ(Dt)を出力する、装置(100)。
[Configuration 1]
An apparatus (100) for machine learning of a prediction model (Mp) that predicts a specific value required for generating a feedforward compensation value (rf) for a feedback manipulated variable (rb) applied to a controlled object (200) subjected to a disturbance (ds), wherein:
the feedback manipulated variable (rb) is determined by feedback control of the control device (100) of the controlled object (200) based on an error between a target value (qr) and a controlled variable (q) of the controlled object (200) so that the controlled variable (q) approaches the target value (qr);
a value obtained by subtracting the feedforward compensation value (rf) from the feedback manipulated variable (rb) is output from the control device (100) to the controlled object (200) as a manipulated variable (r);
the controlled object (200) is represented by a control element (210) expressed as a mathematical model (131) that receives the manipulated variable (r), a disturbance element (220) that receives the disturbance (ds), and a dead time element (230);
the control element (210) receives the manipulated variable (r) and outputs a primary controlled variable (qw);
the disturbance element (220) receives the disturbance (ds) and outputs a fluctuation amount (d);
the dead time element (230) delays the output of the sum of the primary controlled variable (qw) and the fluctuation amount (d) to the outside of the controlled object (200);
the prediction model (Mp) predicts the fluctuation amount (d), as the specific value, from a measured value (do) of the disturbance (ds);
the apparatus (100) includes a learning data generation unit (130); and
the learning data generation unit (130):
acquires the measured value (do), the manipulated variable (r), and the controlled variable (q);
generates a first estimate (qwh) of the primary controlled variable (qw) from the manipulated variable (r) using the mathematical model (131);
generates a second estimate (dh) of the fluctuation amount (d) based on the controlled variable (q) and the first estimate (qwh) at a time that is a specific time (Lh) earlier than the time at which the controlled variable (q) was acquired; and
outputs teacher data (Dt) in which the measured value (do) is associated, as the correct answer, with the second estimate (dh) at the time at which the measured value (do) was acquired.
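The steps performed by the learning data generation unit (130) can be sketched in discrete time as follows. This is only an illustrative sketch under stated assumptions, not the implementation disclosed here: the first-order lag used for the mathematical model (131), its coefficients `a` and `b`, the function names, and the expression of the specific time (Lh) as a whole number of samples `lh_steps` are all hypothetical. The second estimate is formed as dh[k] = q[k + lh_steps] - qwh[k], which pairs each controlled variable sample with the first estimate from Lh earlier.

```python
def first_order_model(r, a=0.9, b=0.1):
    """Hypothetical stand-in for mathematical model (131).

    Computes qwh[k] = a * qwh[k-1] + b * r[k] with a zero initial state.
    """
    qwh = []
    prev = 0.0
    for rk in r:
        prev = a * prev + b * rk
        qwh.append(prev)
    return qwh


def generate_teacher_data(do, r, q, lh_steps, model=first_order_model):
    """Return teacher data as a list of (do[k], dh[k]) pairs.

    do, r, q : equally sampled sequences of the measured disturbance,
               the manipulated variable, and the controlled variable.
    lh_steps : assumed dead time Lh, expressed in samples.
    """
    qwh = model(r)  # first estimate of the primary controlled variable
    pairs = []
    for k in range(len(q) - lh_steps):
        # q observed at k + lh_steps reflects qw and d from time k.
        dh_k = q[k + lh_steps] - qwh[k]  # second estimate at time k
        pairs.append((do[k], dh_k))      # input do[k], correct answer dh[k]
    return pairs
```

With this alignment, samples of q from the first `lh_steps` instants are unused, because no first estimate from an earlier time is available for them.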
[構成2]
前記教師データ(Dt)を用いて前記予測モデル(Mp)に対して前記機械学習を行う学習部(170)をさらに備え、
前記学習部(170)は、前記予測モデル(Mp)によって表現される、前記計測値(do)と前記第2推定値(dh)との関係を、前記第2推定値(dh)を目的変数とするとともに、前記計測値(do)を説明変数とする関数として近似する、構成1に記載の装置(100)。
[Configuration 2]
The apparatus (100) according to configuration 1, further comprising a learning unit (170) that performs the machine learning on the prediction model (Mp) using the teacher data (Dt),
wherein the learning unit (170) approximates the relationship between the measured value (do) and the second estimate (dh), as expressed by the prediction model (Mp), as a function having the second estimate (dh) as the response variable and the measured value (do) as the explanatory variable.
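As one hedged illustration of such a function approximation, the sketch below fits a straight line dh ≈ slope * do + intercept to the teacher data by ordinary least squares. The linear form and the function names are assumptions made for this example only; the text does not restrict the prediction model (Mp) to any particular function class.

```python
def fit_linear(pairs):
    """Fit dh = slope * do + intercept by ordinary least squares.

    pairs : list of (do, dh) teacher data samples.
    Returns (slope, intercept).
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0.0:
        raise ValueError("need at least two distinct measured values (do)")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, do_value):
    """Predict the fluctuation amount from a measured disturbance value."""
    slope, intercept = model
    return slope * do_value + intercept
```

A richer model (for example, one that takes a window of past measured values as explanatory variables) would follow the same pattern of treating do as the input and dh as the target.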
[構成3]
前記数式モデル(131)の逆モデル(121)を用いて、前記予測モデル(Mp)によって予測された前記変動量(d)から前記フィードフォワード補償値(rf)を生成するフィードフォワード補償部(120)をさらに備える、構成1または2に記載の装置(100)。
[Configuration 3]
The apparatus (100) according to configuration 1 or 2, further comprising a feedforward compensation unit (120) that generates the feedforward compensation value (rf) from the fluctuation amount (d) predicted by the prediction model (Mp) using an inverse model (121) of the mathematical model (131).
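The role of the inverse model (121) can be illustrated with a small sketch. It assumes, purely for illustration, that the mathematical model (131) is the first-order lag qw[k] = a * qw[k-1] + b * r[k]; the coefficients and function names are hypothetical. Under that assumption the inverse model returns the input sequence whose model response equals the predicted fluctuation amount, and the manipulated variable is r = rb - rf as stated in configuration 1.

```python
def forward_first_order(r, a=0.9, b=0.1):
    """Assumed mathematical model (131): qw[k] = a * qw[k-1] + b * r[k]."""
    qw = []
    prev = 0.0
    for rk in r:
        prev = a * prev + b * rk
        qw.append(prev)
    return qw


def inverse_first_order(d_pred, a=0.9, b=0.1):
    """Assumed inverse model (121): rf[k] = (d[k] - a * d[k-1]) / b.

    Feeding rf into forward_first_order reproduces d_pred.
    """
    rf = []
    prev = 0.0
    for dk in d_pred:
        rf.append((dk - a * prev) / b)
        prev = dk
    return rf


def manipulated_variable(rb, rf):
    """r = rb - rf: subtract the feedforward compensation value."""
    return [rb_k - rf_k for rb_k, rf_k in zip(rb, rf)]
```

Because the assumed model is linear, applying r = rb - rf lowers the primary controlled variable by the predicted fluctuation amount, which is what lets the compensation offset the disturbance before the feedback loop observes it.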
[構成4]
外乱(ds)を受ける制御対象(200)へのフィードバック操作量(rb)のフィードフォワード補償値(rf)の生成に必要な特定値を予測する予測モデル(Mp)に対する機械学習のための教師データの生成プログラムであって、
前記フィードバック操作量(rb)は、目標値(qr)と前記制御対象(200)の制御量(q)との誤差に基づいて、前記制御量(q)が前記目標値(qr)に近づくように前記制御対象(200)の制御装置(100)のフィードバック制御によって決定され、
前記フィードバック操作量(rb)から前記フィードフォワード補償値(rf)を引いた値が、操作量(r)として前記制御対象(200)に前記制御装置(100)から出力され、
前記制御対象(200)は、前記操作量(r)を受ける数式モデル(131)として表現された制御要素(210)と、前記外乱(ds)を受ける外乱要素(220)と、むだ時間要素(230)とによって表現され、
前記制御要素(210)は、前記操作量(r)を受けて一次制御量(qw)を出力し、
前記外乱要素(220)は、前記外乱(ds)を受けて変動量(d)を出力し、
前記むだ時間要素(230)は、前記一次制御量(qw)と前記変動量(d)との和が前記制御対象(200)の外部に出力されることを遅延させ、
前記予測モデル(Mp)は、前記外乱(ds)の計測値(do)から前記変動量(d)を前記特定値として予測し、
前記プログラムは、プロセッサに実行されることによって、
前記計測値(do)、前記操作量(r)、および前記制御量(q)を取得し、
前記制御量(q)の時刻を特定時間(Lh)だけ戻し、
前記数式モデル(131)を用いて前記操作量(r)から前記一次制御量(qw)の第1推定値(qwh)を生成し、
前記制御量(q)から前記制御量(q)の時刻の前記第1推定値(qwh)を引くことによって前記変動量(d)の第2推定値(dh)を生成し、
前記計測値(do)に、前記計測値(do)の時刻の前記第2推定値(dh)を正解として対応させた教師データ(Dt)を出力する、教師データの生成プログラム。
[Configuration 4]
A program for generating teacher data for machine learning of a prediction model (Mp) that predicts a specific value required for generating a feedforward compensation value (rf) for a feedback manipulated variable (rb) applied to a controlled object (200) subjected to a disturbance (ds), wherein:
the feedback manipulated variable (rb) is determined by feedback control of the control device (100) of the controlled object (200) based on an error between a target value (qr) and a controlled variable (q) of the controlled object (200) so that the controlled variable (q) approaches the target value (qr);
a value obtained by subtracting the feedforward compensation value (rf) from the feedback manipulated variable (rb) is output from the control device (100) to the controlled object (200) as a manipulated variable (r);
the controlled object (200) is represented by a control element (210) expressed as a mathematical model (131) that receives the manipulated variable (r), a disturbance element (220) that receives the disturbance (ds), and a dead time element (230);
the control element (210) receives the manipulated variable (r) and outputs a primary controlled variable (qw);
the disturbance element (220) receives the disturbance (ds) and outputs a fluctuation amount (d);
the dead time element (230) delays the output of the sum of the primary controlled variable (qw) and the fluctuation amount (d) to the outside of the controlled object (200);
the prediction model (Mp) predicts the fluctuation amount (d), as the specific value, from a measured value (do) of the disturbance (ds); and
the program, when executed by a processor:
acquires the measured value (do), the manipulated variable (r), and the controlled variable (q);
shifts the time of the controlled variable (q) back by a specific time (Lh);
generates a first estimate (qwh) of the primary controlled variable (qw) from the manipulated variable (r) using the mathematical model (131);
generates a second estimate (dh) of the fluctuation amount (d) by subtracting, from the controlled variable (q), the first estimate (qwh) at the time of the controlled variable (q); and
outputs teacher data (Dt) in which the measured value (do) is associated, as the correct answer, with the second estimate (dh) at the time of the measured value (do).
[構成5]
外乱(ds)を受ける制御対象(200)へのフィードバック操作量(rb)のフィードフォワード補償値(rf)の生成に必要な特定値を予測する予測モデル(Mp)に対する機械学習のための教師データの生成方法であって、
前記フィードバック操作量(rb)は、目標値(qr)と前記制御対象(200)の制御量(q)との誤差に基づいて、前記制御量(q)が前記目標値(qr)に近づくように前記制御対象(200)の制御装置(100)のフィードバック制御によって決定され、
前記フィードバック操作量(rb)から前記フィードフォワード補償値(rf)を引いた値が、操作量(r)として前記制御対象(200)に前記制御装置(100)から出力され、
前記制御対象(200)は、前記操作量(r)を受ける数式モデル(131)として表現された制御要素(210)と、前記外乱(ds)を受ける外乱要素(220)と、むだ時間要素(230)とによって表現され、
前記制御要素(210)は、前記操作量(r)を受けて一次制御量(qw)を出力し、
前記外乱要素(220)は、前記外乱(ds)を受けて変動量(d)を出力し、
前記むだ時間要素(230)は、前記一次制御量(qw)と前記変動量(d)との和が前記制御対象(200)の外部に出力されることを遅延させ、
前記予測モデル(Mp)は、前記外乱(ds)の計測値(do)から前記変動量(d)を前記特定値として予測し、
前記方法は、
前記計測値(do)、前記操作量(r)、および前記制御量(q)を取得することと、
前記制御要素(210)の数式モデル(131)を用いて前記操作量(r)から前記一次制御量(qw)の第1推定値(qwh)を生成することと、
前記制御量(q)と、前記制御量(q)が取得された時刻から特定時間(Lh)だけ戻された時刻の前記第1推定値(qwh)とに基づいて前記変動量(d)の第2推定値(dh)を生成することと、
前記計測値(do)に、前記計測値(do)が取得された時刻の前記第2推定値(dh)を正解として対応させた教師データ(Dt)を出力することとを含む、教師データの生成方法。
[Configuration 5]
A method for generating teacher data for machine learning of a prediction model (Mp) that predicts a specific value required for generating a feedforward compensation value (rf) for a feedback manipulated variable (rb) applied to a controlled object (200) subjected to a disturbance (ds), wherein:
the feedback manipulated variable (rb) is determined by feedback control of the control device (100) of the controlled object (200) based on an error between a target value (qr) and a controlled variable (q) of the controlled object (200) so that the controlled variable (q) approaches the target value (qr);
a value obtained by subtracting the feedforward compensation value (rf) from the feedback manipulated variable (rb) is output from the control device (100) to the controlled object (200) as a manipulated variable (r);
the controlled object (200) is represented by a control element (210) expressed as a mathematical model (131) that receives the manipulated variable (r), a disturbance element (220) that receives the disturbance (ds), and a dead time element (230);
the control element (210) receives the manipulated variable (r) and outputs a primary controlled variable (qw);
the disturbance element (220) receives the disturbance (ds) and outputs a fluctuation amount (d);
the dead time element (230) delays the output of the sum of the primary controlled variable (qw) and the fluctuation amount (d) to the outside of the controlled object (200);
the prediction model (Mp) predicts the fluctuation amount (d), as the specific value, from a measured value (do) of the disturbance (ds); and
the method comprises:
acquiring the measured value (do), the manipulated variable (r), and the controlled variable (q);
generating a first estimate (qwh) of the primary controlled variable (qw) from the manipulated variable (r) using the mathematical model (131) of the control element (210);
generating a second estimate (dh) of the fluctuation amount (d) based on the controlled variable (q) and the first estimate (qwh) at a time that is a specific time (Lh) earlier than the time at which the controlled variable (q) was acquired; and
outputting teacher data (Dt) in which the measured value (do) is associated, as the correct answer, with the second estimate (dh) at the time at which the measured value (do) was acquired.
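The relationships recited in the configurations above can be summarized in equation form. The notation is an assumption made for this summary: hats denote estimates (qwh, dh, Lh), L is the actual dead time of the dead time element (230), and M[r] denotes the response of the mathematical model (131) to the manipulated variable r.

```latex
\begin{align}
  q(t) &= q_w(t - L) + d(t - L)
    && \text{(controlled object: sum delayed by the dead time)} \\
  \hat{q}_w(t) &= M[r](t)
    && \text{(first estimate from the mathematical model)} \\
  \hat{d}(t - \hat{L}) &= q(t) - \hat{q}_w(t - \hat{L})
    && \text{(second estimate)} \\
  D_t &= \bigl\{ \bigl( d_o(\tau),\, \hat{d}(\tau) \bigr) \bigr\}
    && \text{(teacher data: same-time pairs)}
\end{align}
```

If the estimated dead time equals the actual one and the model reproduces the primary controlled variable exactly, substituting the first equation into the third gives \hat{d}(t - L) = d(t - L), so the correct answer paired with each measured value is the fluctuation amount at the time the measurement was taken.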
今回開示された各実施の形態は、矛盾しない範囲で適宜組み合わされて実施されることも予定されている。今回開示された実施の形態はすべての点で例示であって制限的なものではないと考えられるべきである。本発明の範囲は上記した説明ではなくて請求の範囲によって示され、請求の範囲と均等の意味および範囲内でのすべての変更が含まれることが意図される。
The embodiments disclosed herein are intended to be combined as appropriate within the scope of compatibility. The embodiments disclosed herein should be considered in all respects as illustrative and not restrictive. The scope of the present invention is defined by the claims, not the above description, and is intended to include all modifications within the meaning and scope of the claims.
2,3 制御システム、11 フィードバック制御装置、12 フィードフォワード補償装置、13 学習データ生成装置、14 記憶装置、17 学習装置、20 フィールドネットワーク、32 上位ネットワーク、100,100A,300 制御装置、110 フィードバック制御部、120 フィードフォワード補償部、121 逆モデル、122 ローカルバス、130 学習データ生成部、131 数式モデル、132 時刻変換器、133,150,160 減算器、134 出力生成器、140 記憶部、170 学習部、200 制御対象、200C フィールドデバイス、210 制御要素、220 外乱要素、222 サーボモータ、230 むだ時間要素、240 加算器、302 プロセッサ、304 メインメモリ、306 上位ネットワークコントローラ、308 フィールドネットワークコントローラ、312 メモリカードインターフェイス、314 メモリカード、316 ローカルバスコントローラ、318 プロセッサバス、360 ストレージ、370 コントローラ、380 機能ユニット、400 サポート装置、500 表示装置、600 サーバ装置、700 ゲートウェイ、900 インターネット、Dt 教師データ、L むだ時間、Lh 推定むだ時間、Mp 予測モデル、Pc 制御プログラム、d 変動量、dh 推定変動量、do 計測値、ds 外乱、eq 誤差、q 制御量、qo 二次制御量、qoh 推定二次制御量、qr 目標値、qw 一次制御量、qwh 推定一次制御量、r 操作量、rb フィードバック操作量、rf フィードフォワード補償値。
2, 3 Control system, 11 Feedback control device, 12 Feedforward compensation device, 13 Learning data generation device, 14 Storage device, 17 Learning device, 20 Field network, 32 Higher-level network, 100, 100A, 300 Control device, 110 Feedback control unit, 120 Feedforward compensation unit, 121 Inverse model, 122 Local bus, 130 Learning data generation unit, 131 Mathematical model, 132 Time converter, 133, 150, 160 Subtractor, 134 Output generator, 140 Storage unit, 170 Learning unit, 200 Control target, 200C Field device, 210 Control element, 220 Disturbance element, 222 Servo motor, 230 Dead time element, 240 Adder, 302 Processor, 304 Main memory, 306 Higher-level network controller, 308 field network controller, 312 memory card interface, 314 memory card, 316 local bus controller, 318 processor bus, 360 storage, 370 controller, 380 functional unit, 400 support device, 500 display device, 600 server device, 700 gateway, 900 Internet, Dt teacher data, L dead time, Lh estimated dead time, Mp prediction model, Pc control program, d fluctuation amount, dh estimated fluctuation amount, do measured value, ds disturbance, eq error, q controlled variable, qo secondary controlled variable, qoh estimated secondary controlled variable, qr target value, qw primary controlled variable, qwh estimated primary controlled variable, r manipulated variable, rb feedback manipulated variable, rf feedforward compensation value.
Claims (5)
- An apparatus for machine learning of a prediction model that predicts a specific value required for generating a feedforward compensation value for a feedback manipulated variable for a control target subjected to a disturbance, wherein:
the feedback manipulated variable is determined by feedback control of a control device for the control target, based on an error between a target value and a controlled variable of the control target, so that the controlled variable approaches the target value;
a value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the control target as a manipulated variable;
the control target is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element;
the control element receives the manipulated variable and outputs a primary controlled variable;
the disturbance element receives the disturbance and outputs a fluctuation amount;
the dead time element delays output of the sum of the primary controlled variable and the fluctuation amount to the outside of the control target;
the prediction model predicts the fluctuation amount, as the specific value, from a measured value of the disturbance;
the apparatus comprises a learning data generation unit; and
the learning data generation unit:
acquires the measured value, the manipulated variable, and the controlled variable;
generates a first estimate of the primary controlled variable from the manipulated variable using the mathematical model;
generates a second estimate of the fluctuation amount based on the controlled variable and the first estimate at a time earlier, by a specific time, than the time at which the controlled variable was acquired; and
outputs teacher data in which the measured value is associated, as a correct answer, with the second estimate at the time when the measured value was acquired.
- The apparatus according to claim 1, further comprising a learning unit that performs the machine learning on the prediction model using the teacher data, wherein
the learning unit approximates the relationship between the measured value and the second estimate, as expressed by the prediction model, as a function having the second estimate as the objective variable and the measured value as the explanatory variable.
- The apparatus according to claim 1 or 2, further comprising a feedforward compensation unit that generates the feedforward compensation value from the fluctuation amount predicted by the prediction model, using an inverse model of the mathematical model.
- A teacher data generation program for machine learning of a prediction model that predicts a specific value required for generating a feedforward compensation value for a feedback manipulated variable for a control target subjected to a disturbance, wherein:
the feedback manipulated variable is determined by feedback control of a control device for the control target, based on an error between a target value and a controlled variable of the control target, so that the controlled variable approaches the target value;
a value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the control target as a manipulated variable;
the control target is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element;
the control element receives the manipulated variable and outputs a primary controlled variable;
the disturbance element receives the disturbance and outputs a fluctuation amount;
the dead time element delays output of the sum of the primary controlled variable and the fluctuation amount to the outside of the control target;
the prediction model predicts the fluctuation amount, as the specific value, from a measured value of the disturbance; and
the teacher data generation program, when executed by a processor, causes the processor to:
acquire the measured value, the manipulated variable, and the controlled variable;
generate a first estimate of the primary controlled variable from the manipulated variable using the mathematical model;
generate a second estimate of the fluctuation amount based on the controlled variable and the first estimate at a time earlier, by a specific time, than the time at which the controlled variable was acquired; and
output teacher data in which the measured value is associated, as a correct answer, with the second estimate at the time when the measured value was acquired.
- A teacher data generation method for machine learning of a prediction model that predicts a specific value required for generating a feedforward compensation value for a feedback manipulated variable for a control target subjected to a disturbance, wherein:
the feedback manipulated variable is determined by feedback control of a control device for the control target, based on an error between a target value and a controlled variable of the control target, so that the controlled variable approaches the target value;
a value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the control target as a manipulated variable;
the control target is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element;
the control element receives the manipulated variable and outputs a primary controlled variable;
the disturbance element receives the disturbance and outputs a fluctuation amount;
the dead time element delays output of the sum of the primary controlled variable and the fluctuation amount to the outside of the control target;
the prediction model predicts the fluctuation amount, as the specific value, from a measured value of the disturbance; and
the teacher data generation method comprises:
acquiring the measured value, the manipulated variable, and the controlled variable;
generating a first estimate of the primary controlled variable from the manipulated variable using the mathematical model;
generating a second estimate of the fluctuation amount based on the controlled variable and the first estimate at a time earlier, by a specific time, than the time at which the controlled variable was acquired; and
outputting teacher data in which the measured value is associated, as a correct answer, with the second estimate at the time when the measured value was acquired.
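The teacher data generation steps recited in the claims can be illustrated with a minimal Python sketch. It is not the applicant's implementation and rests on several assumptions that do not appear in the claims: signals are sampled at a fixed period, the mathematical model is a discrete first-order lag with hypothetical coefficients `a` and `b`, the dead time is an integer number of samples (`dead_steps`), and the disturbance element in the synthetic check is a simple linear map. All function names are hypothetical.

```python
import random


def first_estimate(r, a, b):
    """First estimate qwh of the primary controlled variable.

    Assumes a first-order discrete model qwh[k+1] = a*qwh[k] + b*r[k];
    the claims do not fix the form of the mathematical model.
    """
    qwh = [0.0] * len(r)
    for k in range(len(r) - 1):
        qwh[k + 1] = a * qwh[k] + b * r[k]
    return qwh


def generate_teacher_data(do, r, q, a, b, dead_steps):
    """Pair each measured disturbance value with an estimated fluctuation amount.

    The controlled variable observed at step k reflects the plant interior at
    step k - dead_steps, so the second estimate is
        dh[k - dead_steps] = q[k] - qwh[k - dead_steps]
    and it is paired with the measured value do[k - dead_steps].
    """
    qwh = first_estimate(r, a, b)
    pairs = []
    for k in range(dead_steps, len(q)):
        t = k - dead_steps          # time shifted back by the dead time
        dh = q[k] - qwh[t]          # second estimate of the fluctuation amount
        pairs.append((do[t], dh))   # (explanatory variable, correct answer)
    return pairs


def fit_line(pairs):
    """Least-squares line dh ~ slope*do + intercept (a stand-in prediction model)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Synthetic check: build a plant that matches the assumed structure exactly.
random.seed(0)
n, dead_steps, a, b = 200, 5, 0.9, 0.1
r = [random.gauss(0.0, 1.0) for _ in range(n)]       # manipulated variable
do = [random.uniform(-1.0, 1.0) for _ in range(n)]   # measured disturbance
d_true = [2.0 * x + 0.5 for x in do]                 # assumed disturbance element
qw = first_estimate(r, a, b)                         # true primary controlled variable
q = [0.0] * dead_steps + [qw[k] + d_true[k] for k in range(n - dead_steps)]

teacher = generate_teacher_data(do, r, q, a, b, dead_steps)
slope, intercept = fit_line(teacher)
```

In this idealized setting the model and dead time are known exactly, so the second estimate reproduces the true fluctuation amount and the fitted line recovers the assumed disturbance map. With a real plant, model mismatch and an imperfect dead time estimate would both leak into the teacher data.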
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023-032164 | 2023-03-02 | ||
JP2023032164A JP2024124174A (en) | 2023-03-02 | 2023-03-02 | Apparatus for machine learning related to feedforward control of a control target, program for generating teacher data, and method for generating teacher data |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024181536A1 (en) | 2024-09-06 |
Family
ID=92589894
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2024/007551 (WO2024181536A1) | Device for machine learning related to feedforward control for control target, teacher data generation program, and teacher data generation method | 2023-03-02 | 2024-02-29 |
Country Status (2)
Country | Link |
---|---|
JP (1) | JP2024124174A (en) |
WO (1) | WO2024181536A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0527808A (en) * | 1991-07-18 | 1993-02-05 | Toshiba Corp | Controller using neural network model |
JP2010218007A (en) * | 2009-03-13 | 2010-09-30 | Omron Corp | Disturbance estimation device, control object model estimation device, feedforward amount estimation device, and controller |
JP2020060827A (en) * | 2018-10-05 | 2020-04-16 | 株式会社日立製作所 | Control device and control method |
JP2021176047A (en) * | 2020-05-01 | 2021-11-04 | 株式会社Mhiパワーコントロールシステムズ | Control device |
- 2023-03-02: JP application JP2023032164A, published as JP2024124174A (status: active, pending)
- 2024-02-29: WO application PCT/JP2024/007551, published as WO2024181536A1 (status: unknown)
Also Published As
Publication number | Publication date |
---|---|
JP2024124174A (en) | 2024-09-12 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP6008898B2 (en) | Online adaptive model predictive control in process control systems. | |
US7272454B2 (en) | Multiple-input/multiple-output control blocks with non-linear predictive capabilities | |
JP6043348B2 (en) | How to monitor industrial processes | |
JP4722388B2 (en) | Setting and browsing display screen for integrated model predictive control and optimizer functional blocks | |
US11604459B2 (en) | Real-time control using directed predictive simulation within a control system of a process plant | |
CN103038714B (en) | The method of simulation industrial process, trace simulation device and automated system | |
JP2009277239A (en) | Integrated model predictive control and optimization in process control system | |
Mattera et al. | Optimal data-driven control of manufacturing processes using reinforcement learning: an application to wire arc additive manufacturing | |
Cui et al. | Adaptive fuzzy fault-tolerant control of high-order nonlinear systems: A fully actuated system approach | |
WO2024181536A1 (en) | Device for machine learning related to feedforward control for control target, teacher data generation program, and teacher data generation method | |
Ferdowsi et al. | Decentralized fault tolerant control of a class of nonlinear interconnected systems | |
EP3428746B1 (en) | A method and apparatus for providing an adaptive self-learning control program for deployment on a target field device | |
WO2016203757A1 (en) | Control device, information processing device in which same is used, control method, and computer-readable memory medium in which computer program is stored | |
Hoekstra et al. | Computationally efficient predictive control based on ANN state-space models | |
WO2023053514A1 (en) | Control device, control system, and control method | |
WO2024181534A1 (en) | Device for carrying out machine learning relating to feedforward control with respect to control object, program for generating teaching data, and method for generating teaching data | |
Du et al. | Reinforcement learning | |
JPH08152902A (en) | Adaptive processor | |
Abonyi et al. | Hybrid fuzzy convolution model and its application in predictive control | |
JP7521712B1 (en) | Control device, control method, and program | |
Nguyet et al. | A Computed Torque Controller for Robotic Manipulators Using Nonlinear Neural Network | |
Ghder Soliman et al. | Traditional versus fuzzy optimal coupling in multi‐axis distributed motion control: a pre‐IoT‐integration procedure | |
Ahmed | Developing and Testing of an MPC strategy for a four tanks multivariable process using Emerson Delta V system. | |
Heller et al. | Collision Prevention In Operation-Synchronized Simulations Using Dynamic Prescheduling Of Simulation Parameters | |
Siraj et al. | A robust adaptive predictive load frequency controller to compensate for model mismatch |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24764013; Country of ref document: EP; Kind code of ref document: A1 |