WO2024181536A1

WO2024181536A1 - Device for machine learning related to feedforward control for control target, teacher data generation program, and teacher data generation method

Info

Publication number: WO2024181536A1
Application number: PCT/JP2024/007551
Authority: WO
Inventors: 高史藤井; 正樹浪江
Original assignee: オムロン株式会社
Priority date: 2023-03-02
Filing date: 2024-02-29
Publication date: 2024-09-06
Also published as: JP2024124174A

Abstract

The present invention improves the accuracy of feedforward control for a control target. A training data generation unit (130): acquires a measurement value (do), an operation amount (r), and a control amount (q); generates a first estimation value (qwh) for a primary control amount (qw) from the operation amount (r) by using a mathematical model (131) of a control element (210); generates a second estimation value (dh) for a variation amount (d) on the basis of the control amount (q) and the first estimation value (qwh) at a point in time a prescribed period (Lh) before the point in time when the control amount (q) was acquired; and outputs teacher data (Dt) in which the second estimation value (dh) for the point in time of the measurement value (do) has been associated with the measurement value (do) as a correct answer.

Description

Apparatus for machine learning related to feedforward control of a control target, program for generating teacher data, and method for generating teacher data

The present disclosure relates to a device for machine learning related to feedforward control of a control target, a program for generating teacher data, and a method for generating teacher data.

Control devices that perform feedforward control on a control target are known. For example, JP 10-222207 A (Patent Document 1) discloses a feedforward control device that uses a feedforward signal generator to control a specific process that is subject to disturbances. This feedforward control device includes a parameter learner that learns parameters required for calculations in the feedforward signal generator from parameters calculated in the feedforward signal generator.

Japanese Patent Application Publication No. 10-222207

The configuration of the feedforward compensator included in the feedforward signal generator disclosed in Patent Document 1 is predetermined. In general, a configuration for calculating a feedforward compensation value for feedback control such as a feedforward compensator is designed as an inverse model (inverse system) of the process to be controlled. Designing an inverse model of a process requires analysis of the process, so it is difficult to update the structure of the inverse model in a short period of time in response to fluctuations in disturbances and changes in the characteristics of the process. Therefore, with the feedforward control device disclosed in Patent Document 1, it may be difficult to suppress the effect of disturbances on the control of the controlled object depending on the degree of fluctuation in the disturbance or the degree of change in the characteristics of the process.

This disclosure has been made to solve the problems described above, and its purpose is to improve the accuracy of feedforward control of the controlled object.

An apparatus according to one aspect of the present disclosure is an apparatus for machine learning for a predictive model that predicts a specific value required to generate a feedforward compensation value for a feedback manipulated variable to a controlled object that is subject to a disturbance. The feedback manipulated variable is determined by feedback control of a control device for the controlled object, based on an error between a target value and the controlled variable of the controlled object, so that the controlled variable approaches a target value. A value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable. The controlled object is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element. The control element receives the manipulated variable and outputs a primary controlled variable. The disturbance element receives the disturbance and outputs a fluctuation amount. The dead time element delays the sum of the primary controlled variable and the fluctuation amount from being output to the outside of the controlled object. The predictive model predicts the fluctuation amount as a specific value from a measured value of the disturbance. The apparatus includes a learning data generating unit. 
The learning data generating unit acquires the measured value, the manipulated variable, and the controlled variable; generates a first estimated value of the primary controlled variable from the manipulated variable using the mathematical model of the control element; generates a second estimated value of the fluctuation amount based on the controlled variable and the first estimated value at a time that is a specific time before the time at which the controlled variable was acquired; and outputs teacher data in which the second estimated value at the time of the measured value is associated with the measured value as the correct answer.

According to this disclosure, since the teacher data associates appropriate correct answers with the measured values, the teacher data can be used to perform machine learning on a predictive model, thereby improving the accuracy of feedforward control of the control object.

In the above disclosure, the device further includes a learning unit that performs machine learning on the prediction model using teacher data. The learning unit approximates the relationship between the measurement value and the second estimated value, which is expressed by the prediction model, as a function with the second estimated value as the objective variable and the measurement value as the explanatory variable.

According to this disclosure, a predictive model can be adapted in real time to the measured values and characteristics of the controlled object in parallel with feedforward control.

In the above disclosure, the device further includes a feedforward compensation unit that generates a feedforward compensation value from the amount of variation predicted by the prediction model using an inverse model of the mathematical model.

According to this disclosure, it is possible to make the control amount output from the controlled object after a specific time has elapsed closer to a normal value corresponding to the feedback manipulation amount at a time that is a specific time before the time of the control amount.

A teacher data generation program according to another aspect of the present disclosure is a program for machine learning for a predictive model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulation variable for a control object that is subject to a disturbance. The feedback manipulation variable is determined by feedback control of a control device for the control object so that the control variable approaches a target value based on an error between a target value and the control variable of the control object. A value obtained by subtracting the feedforward compensation value from the feedback manipulation variable is output from the control device to the control object as a manipulation variable. The control object is represented by a control element expressed as a mathematical model that receives the manipulation variable, a disturbance element that is subject to a disturbance, and a dead time element. The control element receives the manipulation variable and outputs a primary control variable. The disturbance element receives the disturbance and outputs a fluctuation variable. The dead time element delays the output of the sum of the primary control variable and the fluctuation variable to the outside of the control object. The predictive model predicts the fluctuation variable as a specific value from a measured value of the disturbance. 
When executed by the processor, the program acquires the measurement values, the manipulated variables, and the controlled variables, generates a first estimate of the primary controlled variable from the manipulated variables using a mathematical model of the control element, generates a second estimate of the fluctuation variable based on the controlled variable and the first estimate at a time that is a specific time back from the time when the controlled variable was acquired, and outputs training data that corresponds the measurement values to the second estimate at the time when the measurement values were acquired as the correct answer.

According to this disclosure, since the teacher data associates an appropriate correct answer with the measured value, the teacher data can be used to perform machine learning on a predictive model, thereby improving the accuracy of feedforward control of a control target. Note that the invention disclosed herein can also be realized as a non-transitory computer-readable medium that stores the above program.

A method for generating teacher data according to another aspect of the present disclosure is a method for machine learning for a predictive model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulated variable for a controlled object that is subject to a disturbance. The feedback manipulated variable is determined by feedback control of a control device for the controlled object so that the controlled variable approaches a target value based on an error between a target value and the controlled variable of the controlled object. A value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable. The controlled object is represented by a control element expressed as a mathematical model that receives the manipulated variable, a disturbance element that receives a disturbance, and a dead time element. The control element receives the manipulated variable and outputs a primary controlled variable. The disturbance element receives the disturbance and outputs a fluctuation amount. The dead time element delays the sum of the primary controlled variable and the fluctuation amount from being output to the outside of the controlled object. The predictive model predicts the fluctuation amount as a specific value from a measured value of the disturbance. 
The method includes acquiring a measurement value, a manipulated variable, and a controlled variable, generating a first estimate of a primary controlled variable from the manipulated variable using a mathematical model of the control element, generating a second estimate of a fluctuation amount based on the controlled variable and the first estimate at a time that is a specific time back from the time when the controlled variable was acquired, and outputting training data in which the measurement value is associated with the second estimate at the time when the measurement value was acquired as a correct answer.

According to this disclosure, since the teacher data associates appropriate correct answers with the measured values, the teacher data can be used to perform machine learning on a predictive model, thereby improving the accuracy of feedforward control of the controlled object.

The device, program, and method disclosed herein can improve the accuracy of feedforward control of a control target.

FIG. 1 is a block diagram showing a functional configuration of a control device according to the first embodiment.
FIG. 2 is a diagram showing an example of a distribution of a plurality of teacher data and an example of a prediction model as a regression surface.
FIG. 3 is a flowchart showing the flow of processing performed by each of a feedback control system, a feedforward control system, a learning data generating unit, and a learning unit in FIG. 1.
FIG. 4 is a flowchart showing a specific process flow of the learning data generation process of FIG. 3.
FIG. 5 is a diagram for explaining the correspondence between the times of the measurement values, the manipulated variables, the controlled variables, and the estimated primary controlled variables in FIG. 4 and the times of the estimated secondary controlled variables that are set back from the times of the controlled variables by a dead time.
FIG. 6 is a block diagram showing a functional configuration of a control device according to a first modified example of the first embodiment.
FIG. 7 is a block diagram showing a functional configuration of a control system according to a second embodiment.
FIG. 8 is a schematic diagram showing an example of a network configuration of a control system according to a third embodiment.
FIG. 9 is a block diagram showing an example of a hardware configuration of the control device shown in FIG. 8.

The following describes the embodiments in detail with reference to the drawings. Note that the same or corresponding parts in the drawings are given the same reference numerals, and as a rule, their explanations will not be repeated.

[First embodiment]
<Application Examples>
Fig. 1 is a block diagram showing a functional configuration of a control device 100 according to embodiment 1. As shown in Fig. 1, the control device 100 includes a feedback control unit 110, a feedforward compensation unit 120, a learning data generation unit 130, a storage unit 140, a subtractor 150, a subtractor 160, and a learning unit 170.

The control device 100 outputs to the control object 200, as the manipulated variable r, a value obtained by subtracting the feedforward compensation value rf from the feedback manipulated variable rb, so that the controlled variable q, which is the output value of the control object 200 subjected to the disturbance ds, approaches the target value qr. The storage unit 140 stores the prediction model Mp and the teacher data Dt. The control device 100 and the control object 200 are connected via a network (for example, the Internet or a cloud system) and may be located remotely from each other. Examples of machine learning algorithms for constructing the prediction model Mp include a deep binary tree and a support vector machine.

The disturbance ds is a quantity that disturbs the state of the control system including the control device 100 and the controlled object 200. The disturbance ds includes multiple disturbance elements, such as the amount of light, voltage, current, and temperature that are accidentally or suddenly input to the controlled object. The measured value of the disturbance ds is expressed as do.

The transmission of information (signals) in the controlled object 200 is represented by the control element 210, the disturbance element 220, the dead time element 230, and the adder 240. The control element 210 receives an operation amount r and outputs a primary control amount qw to the adder 240. The disturbance element 220 receives a disturbance ds and outputs a fluctuation amount d to the adder 240. The adder 240 adds the fluctuation amount d to the primary control amount qw and outputs a secondary control amount qo to the dead time element 230. The dead time element 230 delays the output of the secondary control amount qo to the outside of the controlled object 200. Specifically, the dead time element 230 outputs the secondary control amount qo as the control amount q to the outside of the controlled object 200 after the lapse of dead time L after receiving the secondary control amount qo.
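The signal flow described above can be sketched as a small discrete-time simulation. This is only an illustration under stated assumptions: the class name `ControlledObject`, the linear gains 2.0 and 0.5, and the representation of the dead time L as a whole number of sampling steps are hypothetical and do not come from the disclosure.

```python
from collections import deque


class ControlledObject:
    """Illustrative discrete-time sketch of the controlled object 200."""

    def __init__(self, control_element, disturbance_element, dead_time_steps):
        self.control_element = control_element          # maps r to qw (element 210)
        self.disturbance_element = disturbance_element  # maps ds to d (element 220)
        # Dead time element 230 modeled as a FIFO buffer pre-filled with zeros.
        self.buffer = deque([0.0] * dead_time_steps)

    def step(self, r, ds):
        qw = self.control_element(r)       # primary control amount
        d = self.disturbance_element(ds)   # fluctuation amount
        qo = qw + d                        # adder 240: secondary control amount
        self.buffer.append(qo)
        return self.buffer.popleft()       # control amount q, delayed by the dead time


# Assumed example: qw = 2.0 * r, d = 0.5 * ds, dead time of 2 sampling steps.
plant = ControlledObject(lambda r: 2.0 * r, lambda ds: 0.5 * ds, dead_time_steps=2)
outputs = [plant.step(r=1.0, ds=float(k)) for k in range(5)]
print(outputs)
```

In this sketch the value returned at step k is the secondary control amount computed at step k - 2, which mirrors how the control amount q lags the secondary control amount qo by the dead time L.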

Hereinafter, the configuration including the feedback control unit 110 and the subtractor 150 is also referred to as a feedback control system, and the configuration including the feedforward compensation unit 120 and the subtractor 160 is also referred to as a feedforward control system.

The subtractor 150 outputs the error eq (= qr - q) between the target value qr and the controlled variable q to the feedback control unit 110. The feedback control unit 110 determines the feedback manipulated variable rb based on the error eq and outputs it to the subtractor 160.

The learning data generating unit 130 generates teacher data Dt used in machine learning for the prediction model Mp. The learning data generating unit 130 acquires a measurement value do, an operation amount r, and a control amount q at each of a plurality of timings. The learning data generating unit 130 estimates a fluctuation amount d as an estimated fluctuation amount dh from the operation amount r and the control amount q. The learning data generating unit 130 outputs a combination of the measurement value do (explanatory variable) and the estimated fluctuation amount dh (objective variable) to the storage unit 140 as teacher data Dt. The learning data generating unit 130 may output the teacher data Dt directly to the learning unit 170.

The fluctuation amount d corresponding to the measured value do at time t1 is added to the primary control amount qw at time t1 to become the secondary control amount qo. However, the secondary control amount qo is output from the controlled object 200 as the control amount q at time t2 (= t1 + L), that is, when the dead time L has elapsed from time t1. Therefore, the learning data generation unit 130 associates the measured value do at time t1 with the control amount q at time t2, not with the control amount q at time t1. As a result, in the teacher data Dt output from the learning data generation unit 130, the correct answer for the measured value do is the estimated fluctuation amount dh for the time at which the measured value do was measured. In other words, in the teacher data Dt generated by the learning data generation unit 130, the correspondence between the explanatory variable and the objective variable is correctly set. According to the control device 100, machine learning can be performed on the prediction model Mp using the teacher data Dt having an appropriate correspondence, so that the prediction accuracy of the prediction model Mp can be improved. As a result, the accuracy of the feedforward control of the control object 200 performed using the trained prediction model Mp can be improved.

The transmission of information in the learning data generation unit 130 is represented by a mathematical model 131, a time converter 132, a subtractor 133, and an output generator 134. The mathematical model 131 is estimated as a correspondence between the operation amount r and the primary control amount qw, and is represented as a function with the operation amount r as an argument. Below, the mathematical model 131 is also represented as Gwh(r).

The mathematical model 131 receives the manipulated variable r and outputs an estimated primary control variable qwh (first estimated value) to the subtractor 133. The estimated primary control variable qwh is an estimated value corresponding to the primary control variable qw. The time converter 132 shifts the measurement time t10 of the control variable q back by an estimated dead time Lh (specific time) and outputs the result as an estimated secondary control variable qoh to the subtractor 133. The estimated dead time Lh is an estimated value of the dead time L. The estimated secondary control variable qoh is an estimated value of the secondary control variable qo. The mathematical model 131 and the estimated dead time Lh can be determined by physical modeling or system identification based on general design information used in feedback control of the control object 200. For example, with the feedback control unit 110 in an open-loop state, a step signal may be applied to the control object 200, and the gain, time constant, and dead time may be calculated from the response data of the control variable. From these calculation results, the PID (Proportional-Integral-Derivative) parameters for feedback control can be determined using the Ziegler-Nichols method or the CHR (Chien-Hrones-Reswick) method.
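As a rough illustration of the last step, the following sketch applies the classic open-loop (reaction-curve) Ziegler-Nichols rules for a PID controller to a gain, time constant, and dead time obtained from a step response. The disclosure does not state which variant of the rules is used, so the formulas, the function name `ziegler_nichols_pid`, and the numeric inputs are assumptions for illustration only.

```python
def ziegler_nichols_pid(gain, time_constant, dead_time):
    """Textbook open-loop Ziegler-Nichols PID tuning (assumed variant).

    gain:          steady-state process gain K identified from the step response
    time_constant: time constant T of the first-order approximation
    dead_time:     identified dead time (would serve as Lh in this document)
    """
    kp = 1.2 * time_constant / (gain * dead_time)  # proportional gain
    ti = 2.0 * dead_time                           # integral time
    td = 0.5 * dead_time                           # derivative time
    return kp, ti, td


# Hypothetical identification result: K = 2.0, T = 10.0, dead time = 1.0.
kp, ti, td = ziegler_nichols_pid(gain=2.0, time_constant=10.0, dead_time=1.0)
print(kp, ti, td)
```

The same identified gain and dead time could also seed the mathematical model 131 and the estimated dead time Lh, which is why a single step test can serve both the feedback design and the teacher data generation.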

The value obtained by subtracting the primary control amount qw at time t11 from the secondary control amount qo at time t11 is the fluctuation amount d at time t11. The secondary control amount qo at time t11 is output as the control amount q at time t12, when the dead time L has elapsed since time t11. The time converter 132 therefore treats the control amount q acquired at time t12 as the estimated secondary control amount qoh at time t11, which is the estimated dead time Lh before time t12. The value obtained by subtracting the estimated primary control amount qwh at time t11 from the estimated secondary control amount qoh at time t11 is the estimated fluctuation amount dh (second estimated value), which is an estimate of the fluctuation amount d at time t11.

The subtractor 133 subtracts the estimated primary control amount qwh at the same time as the estimated secondary control amount qoh from the estimated secondary control amount qoh, and outputs the estimated fluctuation amount dh to the output generator 134. The output generator 134 outputs a combination of the measured value do and the estimated fluctuation amount dh at the same time as the measured value do to the storage unit 140 as the teacher data Dt.

The learning unit 170 uses multiple pieces of teacher data Dt to approximate the relationship between the measured value do and the estimated variation amount dh as a function (regression curve or regression surface) with the estimated variation amount dh as the objective variable and the measured value do as the explanatory variable. The prediction model Mp includes this function. When machine learning (initial learning or additional learning) for the prediction model Mp is completed and the characteristics of the control object 200 have changed, the learning unit 170 resumes machine learning (additional learning) for the prediction model Mp. The characteristics of the control object 200 include, for example, the correspondence between the measured value do and the operation amount r and the control amount q.
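The function approximation performed by the learning unit can be pictured with a very small stand-in. The sketch below fits a straight regression line to scalar teacher data by ordinary least squares. This is not the algorithm named in the disclosure (which mentions a deep binary tree or a support vector machine); the function name `fit_regression_line` and the sample values are hypothetical.

```python
def fit_regression_line(teacher_data):
    """Least-squares line dh = slope * do + intercept.

    teacher_data: list of (do, dh) pairs, with do as the explanatory
    variable and dh as the objective variable.
    """
    n = len(teacher_data)
    mean_do = sum(do for do, _ in teacher_data) / n
    mean_dh = sum(dh for _, dh in teacher_data) / n
    sxx = sum((do - mean_do) ** 2 for do, _ in teacher_data)
    sxy = sum((do - mean_do) * (dh - mean_dh) for do, dh in teacher_data)
    slope = sxy / sxx
    intercept = mean_dh - slope * mean_do
    return slope, intercept


# Hypothetical teacher data lying on dh = 0.5 * do + 0.1.
teacher_data = [(0.0, 0.1), (1.0, 0.6), (2.0, 1.1), (3.0, 1.6)]
slope, intercept = fit_regression_line(teacher_data)

# The fitted function plays the role of the prediction model Mp.
predicted_dh = slope * 4.0 + intercept
print(slope, intercept, predicted_dh)
```

When the measured value do has several elements, the same idea extends to a regression surface over all elements, and additional learning corresponds to refitting once new teacher data reflect changed characteristics.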

FIG. 2 shows an example of the distribution of multiple training data Dt, and an example of a prediction model Mp as a regression surface. In FIG. 2, the measured value do includes multiple elements (dimensions) do1 and do2, and Gwh(r) = 1.23 x r.

Referring to FIG. 1 again, the feedforward compensation unit 120 acquires an estimated fluctuation amount dh from the measurement value do using the prediction model Mp. The feedforward compensation unit 120 includes an inverse model 121. The inverse model 121 is configured as an inverse function Gwh^-1(qwh) of the mathematical model 131 that takes an estimated primary controlled variable qwh as an argument and outputs an operation variable r. The inverse model 121 receives the estimated fluctuation amount dh from the prediction model Mp as an input.

The purpose of feedforward control is to suppress disturbance of the primary control amount qw due to disturbance ds. Therefore, the feedforward compensation unit 120 obtains from the inverse model 121, as a feedforward compensation value rf, a manipulation amount to be input to the control element 210 such that the value output from the control element 210 becomes the estimated fluctuation amount dh. The subtractor 160 subtracts the feedforward compensation value rf supplied by the feedforward compensation unit 120 from the feedback manipulation amount rb, and outputs the manipulation amount r (= rb - rf) to the control element 210. In the primary control amount qw corresponding to the manipulation amount r, the increase due to the fluctuation amount d is offset in advance by the feedforward compensation value rf. Therefore, even if the fluctuation amount d is added to the primary control amount qw, the disturbance due to the fluctuation amount d in the secondary control amount qo is canceled out by the feedforward compensation value rf. As a result, the controlled variable q output from the controlled object 200 after the dead time L has elapsed approaches a normal value corresponding to the feedback manipulated variable rb at a time that is the dead time L before the time of the controlled variable q.
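The cancellation described above can be checked numerically with a minimal sketch. It reuses the example Gwh(r) = 1.23 x r from FIG. 2 and assumes, purely for illustration, that the actual control element equals the mathematical model and that the prediction model returns the fluctuation amount exactly; the function names and the values of rb and d are hypothetical.

```python
GAIN = 1.23  # example coefficient of Gwh(r) taken from FIG. 2


def gwh(r):
    """Mathematical model 131: estimated primary control amount for r."""
    return GAIN * r


def gwh_inverse(qwh):
    """Inverse model 121: manipulation amount that yields qwh."""
    return qwh / GAIN


def manipulated_variable(rb, dh_predicted):
    """Subtractor 160: r = rb - rf, with rf obtained from the inverse model."""
    rf = gwh_inverse(dh_predicted)
    return rb - rf


rb = 4.0   # hypothetical feedback manipulation amount
d = 0.615  # hypothetical fluctuation amount, assumed perfectly predicted

r = manipulated_variable(rb, dh_predicted=d)
qo = gwh(r) + d  # secondary control amount after the adder 240

# With ideal prediction, qo matches the undisturbed value Gwh(rb).
print(qo, gwh(rb))
```

In practice the match is only approximate, since both the mathematical model and the prediction model carry estimation error; the feedback loop is what absorbs the remainder.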

According to the control device 100, since an appropriate correct answer is associated with the measurement value do in the teacher data Dt, the accuracy of feedforward control for the control object 200 can be improved by performing machine learning on the prediction model Mp using the teacher data Dt. Furthermore, according to the control device 100, in parallel with the feedforward control, the prediction model Mp can be adapted in real time to the measurement value do and the characteristics of the control object 200. Furthermore, since machine learning is continued until the accuracy of the prediction model Mp becomes sufficiently high, the accuracy of the feedforward control for the control object 200 can be sufficiently improved. Furthermore, since the prediction model Mp is re-adapted to the characteristics in response to changes in the characteristics of the control object 200, a decrease in the accuracy of the feedforward control due to changes in the characteristics of the control object 200 can be suppressed.

FIG. 3 is a diagram showing a flowchart illustrating the flow of processing performed by each of the feedback control system, the feedforward control system, the learning data generation unit 130, and the learning unit 170 in FIG. 1. The routines corresponding to the flowcharts of the feedback control system and the feedforward control system are executed, for example, at each sampling time. The routines corresponding to the flowcharts of the learning data generation unit 130 and the learning unit 170 are executed, for example, in response to the first execution of the routines corresponding to the flowcharts of the feedforward control system. Below, steps are simply abbreviated as S.

As shown in FIG. 3, in S111, the subtractor 150 calculates the error eq between the target value qr and the controlled variable q, and outputs it to the feedback control unit 110. In S112, the feedback control unit 110 determines the feedback manipulated variable rb based on the error eq, outputs it to the subtractor 160, and ends the process.

In S121, the feedforward compensation unit 120 determines a feedforward compensation value rf from the measurement value do and outputs it to the subtractor 160. In S122, the subtractor 160 outputs the difference obtained by subtracting the feedforward compensation value rf from the feedback manipulated variable rb as the manipulated variable r to the control object 200 and the learning data generation unit 130, and the process ends.

The learning data generation unit 130 generates teacher data Dt in S130 and ends the process. The learning unit 170 performs machine learning on the prediction model Mp using the teacher data Dt in S170 and ends the process.

FIG. 4 is a flowchart showing a specific process flow of the learning data generation process S130 in FIG. 3. FIG. 5 is a diagram for explaining the correspondence between the time of the measured value do, the operation amount r, the control amount q, and the estimated primary control amount qwh in FIG. 4, and the time of the estimated secondary control amount qoh, which is set back from the time of the control amount q by the estimated dead time Lh. FIG. 5 shows a case where multiple measured values do, multiple operation amounts r, and multiple control amounts q are obtained during the trial period from time t21 to t24. Time t22 is the time (t21+Lh) that is the estimated dead time Lh after time t21. Time t23 (>t22) is the time (t24-Lh) that is the estimated dead time Lh back from time t24.

Referring to both FIG. 4 and FIG. 5, in S131, the learning data generation unit 130 acquires the measurement value do, the operation amount r, and the control amount q at each of the multiple timings, and proceeds to S132. In S132, the learning data generation unit 130 acquires an estimated primary control amount qwh from the operation amount r using the mathematical model 131, and proceeds to S133. In S133, the learning data generation unit 130 moves the time of the control amount q back by the estimated dead time Lh to generate an estimated secondary control amount qoh, and proceeds to S134. As shown in FIG. 5, by the process of S133, the multiple control amounts q included in times t22 to t23 acquired in S131 are slid as a whole in the direction going back in time by the estimated dead time Lh.

Referring again to FIG. 4, in S134, the learning data generation unit 130 subtracts the estimated primary control amount qwh at the time of the estimated secondary control amount qoh from the estimated secondary control amount qoh to generate an estimated fluctuation amount dh, and proceeds to S135. In S135, the learning data generation unit 130 sets the correct answer for the measured value do to the estimated fluctuation amount dh at the time of the measured value do, generates teacher data Dt, and ends the process.

As shown in FIG. 5, among the multiple control variables q acquired during the trial period from time t21 to t24, the time of the control variables q acquired from time t21 to t22 is returned to before time t21 by the processing of S133. There is no corresponding estimated primary control variable qwh before time t21. Therefore, the control variables q acquired from time t21 to t22 are not used in generating the teacher data Dt. Furthermore, there are no estimated secondary control variables qoh and estimated fluctuation variables dh from time t23 to t24. Therefore, the measurement value do, operation variable r, and estimated primary control variable qwh acquired during the period from time t23 to t24 are not used in generating the teacher data Dt.
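The learning data generation process, including the samples that are discarded at both ends of the trial period, can be sketched as follows. The sketch assumes equally spaced samples and an estimated dead time Lh expressed as a whole number of sampling steps (`lh_steps`); the function name, that discretization, and the sample values are illustrative assumptions rather than details given in the disclosure.

```python
def generate_teacher_data(do, r, q, gwh, lh_steps):
    """Build (do, dh) pairs from one trial period.

    do, r, q: equally long lists sampled at the same times
    gwh:      mathematical model mapping r to the estimated primary control amount
    lh_steps: estimated dead time Lh in sampling steps (assumed discretization)
    """
    qwh = [gwh(ri) for ri in r]             # estimated primary control amount
    teacher_data = []
    # The first lh_steps samples of q have no matching qwh, and the last
    # lh_steps samples of do, r, and qwh have no matching qoh; both are skipped.
    for k in range(len(q) - lh_steps):
        qoh_k = q[k + lh_steps]             # q shifted back by Lh to time k
        dh_k = qoh_k - qwh[k]               # estimated fluctuation amount
        teacher_data.append((do[k], dh_k))  # do with dh as the correct answer
    return teacher_data


# Hypothetical trial: true plant qw = 2.0 * r, d = 0.5 * do, dead time of 2 steps.
do = [0.0, 1.0, 2.0, 3.0, 4.0]
r = [1.0, 1.0, 1.0, 1.0, 1.0]
q = [0.0, 0.0, 2.0, 2.5, 3.0]  # output delayed by 2 steps; first 2 values are initial state

pairs = generate_teacher_data(do, r, q, gwh=lambda ri: 2.0 * ri, lh_steps=2)
print(pairs)
```

With a model and dead time that match the plant, each pair recovers the fluctuation amount that was actually present when do was measured, which is the property the teacher data are meant to have.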

[First Modification of First Embodiment]
In the first embodiment, a configuration including both a feedback control system and a feedforward control system has been described. In the first modification of the first embodiment, a configuration not including a feedback control system will be described.

FIG. 6 is a block diagram showing the functional configuration of control device 100A according to variant 1 of embodiment 1. Control device 100A has a configuration in which subtractor 150 and feedback control unit 110 are removed from control device 100 in FIG. 1. Since the rest of the configuration is similar, the description of the similar configuration will not be repeated. Note that subtractor 160 does not have to be included in control device 100A.

As shown in FIG. 6, the control device 100A determines a feedforward compensation value rf of the feedback operation amount rb to the control object 200 such that the control amount q of the control object 200 receiving the measurement value do approaches a target value qr. According to the control device 100A, by adding the control device 100A to an existing feedback control system while leaving that system in place, the existing feedback control system can be easily expanded into a control system that includes a feedforward control system and a learning function.

As described above, the device and method according to the first embodiment and the first modification can improve the accuracy of feedforward control of the control target.

[Second embodiment]
In the first embodiment, a case where a feedback control system, a feedforward control system, and a configuration for performing machine learning on a predictive model are included in one control device is described. In the second embodiment, a configuration is described in which the feedback control system, the feedforward control system, and a configuration for performing machine learning on a predictive model are separated into separate devices.

FIG. 7 is a block diagram showing the functional configuration of a control system 2 according to embodiment 2. In FIG. 7, components with the same reference symbols as those in FIG. 1 have the same functions as the components identified by the reference symbols described in embodiment 1, and therefore the description of the similar components will not be repeated.

As shown in FIG. 7, the control system 2 includes a feedback control device 11, a feedforward compensation device 12, a learning data generation device 13, a storage device 14, and a learning device 17. The learning data generation device 13, the storage device 14, and the learning device 17 correspond to the learning data generation unit 130, the storage unit 140, and the learning unit 170 in FIG. 1, respectively. The feedback control device 11 includes a feedback control unit 110 and a subtractor 150. The feedforward compensation device 12 includes a feedforward compensation unit 120 and a subtractor 160. The feedback control device 11, the feedforward compensation device 12, the learning data generation device 13, the storage device 14, the learning device 17, and the control target 200 may be connected to each other via a network and may be located remotely from each other. The subtractor 160 may be included in the feedback control device 11, not in the feedforward compensation device 12.

According to control system 2, an existing control system can be easily expanded by adding a feedforward compensation device, a learning data generation device, and a learning device to the existing feedback control device while leaving the existing feedback control device in place.

As described above, the device and method according to the second embodiment can improve the accuracy of feedforward control of the control target.

[Embodiment 3]
In the third embodiment, as an example of the control device according to the first embodiment, a configuration in which the control device includes a PLC (Programmable Logic Controller) will be described.

<Control system network configuration example>
Fig. 8 is a schematic diagram showing an example of a network configuration of a control system 3 according to the third embodiment. As shown in Fig. 8, the control system 3 includes a device group in which a plurality of devices are configured to be able to communicate with each other. Typically, the devices may include a control device 300 that is a processing entity that executes a control program, and peripheral devices connected to the control device 300. The control device 300 has a functional configuration similar to that of the control device 100 shown in Fig. 1.

The control device 300 corresponds to an industrial controller that controls control targets such as various facilities or devices. The control device 300 is a type of computer that executes control calculations, and typically includes a PLC (Programmable Logic Controller). The control device 300 is connected to a field device 200C via a field network 20. The control device 300 exchanges data with at least one field device 200C via the field network 20.

The control calculations executed in the control device 300 include a process of collecting data collected or generated in the field device 200C, a process of generating data such as a command value (operation amount) for the field device 200C, and a process of transmitting the generated output data to the target field device 200C. The data collected or generated in the field device 200C includes data on disturbances input to the field device 200C, and a control amount resulting from the actual operation of the field device 200C in accordance with the command value. The command value for the field device 200C is determined by adding a feedforward compensation value predicted from the disturbance by a prediction model to the operation amount provisionally calculated based on the error between the control target value (target value) calculated based on the control program executed by the control device 300 and the actual control amount.

The field network 20 preferably employs a bus or network that performs periodic communication. Known examples of such buses or networks that perform periodic communication include EtherCAT (registered trademark), EtherNet/IP (registered trademark), DeviceNet (registered trademark), and CompoNet (registered trademark). EtherCAT (registered trademark) is preferable because it guarantees the arrival time of data.

Any field device 200C can be connected to the field network 20. The field device 200C includes an actuator that exerts some kind of physical action on a robot or conveyor in the field, and an input/output device that exchanges information with the field.

In the control system 3, the field device 200C includes a plurality of servo drivers 220_1 and 220_2, and a plurality of servo motors 222_1 and 222_2 connected to the plurality of servo drivers 220_1 and 220_2, respectively. The field device 200C is an example of a "controlled object."

Servo drivers 220_1 and 220_2 drive corresponding servo motors of servo motors 222_1 and 222_2 according to command values (such as position command values or speed command values) from control device 300. In this way, control device 300 can control field device 200C.

The control device 300 is also connected to other devices via an upper network 32. The upper network 32 is connected to the Internet 900, which is an external network, via a gateway 700. The upper network 32 may use Ethernet (registered trademark) or EtherNet/IP (registered trademark), which are common network protocols. At least one server device 600 and at least one display device 500 may be connected to the upper network 32.

The server device 600 may be a database system or a manufacturing execution system (MES). The manufacturing execution system acquires information from the controlled manufacturing device or facility to monitor and manage the entire production, and may also handle order information, quality information, shipping information, and the like. In addition to these, a device that provides an information system service may be connected to the upper network 32. An example of an information system service is processing that acquires information from the controlled manufacturing device or facility and performs macro or micro analysis. For example, an example of an information system service is data mining that extracts some characteristic tendency contained in information from the controlled manufacturing device or facility, or a machine learning tool that performs machine learning based on information from the controlled facility or machine.

The display device 500 receives operations from the user and outputs commands to the control device 300 in response to the user operations, and also graphically displays the results of calculations performed by the control device 300.

The support device 400 can be connected to the control device 300. The support device 400 may be connected to the control device 300 via the upper network 32 or the Internet 900. The support device 400 is a device that assists with the preparations the control device 300 needs in order to control the control target. Specifically, the support device 400 provides a development environment (program creation and editing tools, parsers, compilers, etc.) for programs executed by the control device 300, a setting environment for setting configuration information (configuration) for the control device 300 and the various devices connected to it, a function for outputting the generated program to the control device 300, and a function for modifying the program executed on the control device 300 online.

In the control system 3, the control device 300, the support device 400, and the display device 500 are each configured as separate entities, but a configuration may be adopted in which all or part of these functions are integrated into a single device.

The control device 300 is not limited to being used at one production site, but may also be used at other production sites. It may also be used at multiple different lines at one production site.

<Example of hardware configuration of control device>
Fig. 9 is a block diagram showing an example of a hardware configuration of the control device 300 in Fig. 8. As shown in Fig. 9, the control device 300 includes a processor 302, a main memory 304, a storage 360, a memory card interface 312, an upper network controller 306, a field network controller 308, a local bus controller 316, and a Universal Serial Bus (USB) controller 370 that provides a USB interface. These components are connected via a processor bus 318.

As shown in FIG. 9, the processor 302 corresponds to an arithmetic processing unit that executes control calculations, and is composed of a CPU (Central Processing Unit) and the like. The processor 302 may also include a GPU (Graphics Processing Unit). Specifically, the processor 302 reads out a program stored in the storage 360, expands it in the main memory 304, and executes it to realize control calculations for a control object.

Main memory 304 is composed of a volatile storage device such as a dynamic random access memory (DRAM). Main memory 304 may also include a static random access memory (SRAM). Storage 360 is composed of a non-volatile storage device such as a solid state drive (SSD). Storage 360 may also include a hard disk drive (HDD).

Storage 360 stores control program Pc, teacher data Dt, and prediction model Mp. Storage 360 corresponds to storage unit 140 in FIG. 1. Control program Pc includes a program for comprehensively controlling control device 300 and realizing each function of control device 300. In other words, processor 302 that executes control program Pc corresponds to the feedback control system (feedback control unit 110 and subtractor 150), the feedforward control system (feedforward compensation unit 120 and subtractor 160), learning data generation unit 130, and learning unit 170 in FIG. 1.

The memory card interface 312 accepts a memory card 314, which is an example of a removable storage medium. The memory card interface 312 is capable of reading and writing any data to the memory card 314.

The upper network controller 306 exchanges data with any information processing device connected to the upper network 32 (e.g., a local area network) via the upper network 32.

The field network controller 308 exchanges data with any devices such as servo motors 222_1 and 222_2 via the field network 20.

The local bus controller 316 exchanges data with any of the functional units 380 constituting the control device 300 via the local bus 122. The functional units 380 may, for example, be an analog I/O unit responsible for at least one of the input and output of analog signals, a digital I/O unit responsible for at least one of the input and output of digital signals, and a counter unit that receives pulses from an encoder or the like.

The USB controller 370 exchanges data with any information processing device via a USB connection. For example, the support device 400 is connected to the USB controller 370.

As described above, the device, program, and control method according to the third embodiment can improve the accuracy of feedforward control of the control target.

<Additional Notes>
The embodiments described above include the following technical ideas.

[Configuration 1]
An apparatus (100) for machine learning of a prediction model (Mp) for predicting a specific value required for generating a feedforward compensation value (rf) of a feedback manipulated variable (rb) for a control target (200) subjected to a disturbance (ds), comprising:
The feedback manipulated variable (rb) is determined by feedback control of the control device (100) of the controlled object (200) based on an error between a target value (qr) and a controlled variable (q) of the controlled object (200) so that the controlled variable (q) approaches the target value (qr);
A value obtained by subtracting the feedforward compensation value (rf) from the feedback manipulated variable (rb) is output from the control device (100) to the controlled object (200) as a manipulated variable (r);
The controlled object (200) is represented by a control element (210) represented as a mathematical model (131) that receives the manipulated variable (r), a disturbance element (220) that receives the disturbance (ds), and a dead time element (230),
The control element (210) receives the manipulated variable (r) and outputs a primary controlled variable (qw);
The disturbance element (220) receives the disturbance (ds) and outputs a fluctuation amount (d);
The dead time element (230) delays the sum of the primary controlled variable (qw) and the fluctuation variable (d) from being output to the outside of the controlled object (200);
The prediction model (Mp) predicts the fluctuation amount (d) as the specific value from a measurement value (do) of the disturbance (ds),
The device (100) includes a learning data generation unit (130),
The learning data generation unit (130):
acquires the measured value (do), the manipulated variable (r), and the controlled variable (q);
generates a first estimate (qwh) of the primary controlled variable (qw) from the manipulated variable (r) using the mathematical model (131);
generates a second estimate (dh) of the fluctuation amount (d) based on the controlled variable (q) and the first estimate (qwh) at a time that is a specific time (Lh) back from the time when the controlled variable (q) is acquired; and
outputs teacher data (Dt) in which the second estimate (dh) at the time when the measurement value (do) was acquired is associated with the measurement value (do) as a correct answer.
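
The relations stated in Configuration 1 can be written compactly as below. This is a reading of the configuration under stated assumptions, not a formula given in the disclosure: the dead time element is taken to be a pure delay L, G is a hypothetical symbol for the mathematical model (131), and hats mark estimates (so that qwh and dh appear as \hat{q}_w and \hat{d}).

```latex
\begin{align}
q(t) &= q_w(t - L) + d(t - L) \\
\hat{q}_w(t) &= G(r)(t) \\
\hat{d}(t) &= q(t + L_h) - \hat{q}_w(t)
\end{align}
```

If L_h equals L and the model output matches q_w exactly, the third line reduces to \hat{d}(t) = d(t), which is why the pair (do, dh) can serve as a labeled example for the prediction model.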

[Configuration 2]
A learning unit (170) that performs the machine learning on the prediction model (Mp) using the teacher data (Dt),
The device (100) described in configuration 1, wherein the learning unit (170) approximates the relationship between the measurement value (do) and the second estimated value (dh), as expressed by the prediction model (Mp), as a function having the second estimated value (dh) as a response variable and the measurement value (do) as an explanatory variable.
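
As a rough illustration of Configuration 2, the sketch below fits a function with dh as the response variable and do as the explanatory variable. A straight line fitted by ordinary least squares is used only for brevity; the form of the prediction model Mp, the function names `fit_linear` and `predict`, and the sample data are all assumptions made for this example.

```python
def fit_linear(do_values, dh_values):
    """Least-squares fit of dh ~ a * do + b (linear form assumed for brevity)."""
    n = len(do_values)
    if n != len(dh_values) or n < 2:
        raise ValueError("need at least two paired samples")
    mean_x = sum(do_values) / n
    mean_y = sum(dh_values) / n
    sxx = sum((x - mean_x) ** 2 for x in do_values)
    if sxx == 0.0:
        raise ValueError("do values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(do_values, dh_values))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict(model, do_value):
    """Predict the fluctuation amount d from one measurement value do."""
    a, b = model
    return a * do_value + b


# Hypothetical teacher data Dt: (measurement value do, estimate dh) pairs.
teacher = [(0.0, 0.1), (1.0, 2.1), (2.0, 4.1), (3.0, 6.1)]
model = fit_linear([x for x, _ in teacher], [y for _, y in teacher])
d_predicted = predict(model, 1.5)
```

Any regression method that maps do to dh could take the place of the line fit; the pairing of inputs and labels is the part that comes from the configuration.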

[Configuration 3]
The apparatus (100) according to configuration 1 or 2, further comprising a feedforward compensation unit (120) that generates the feedforward compensation value (rf) from the fluctuation amount (d) predicted by the prediction model (Mp) using an inverse model (121) of the mathematical model (131).
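
The role of the inverse model in Configuration 3 can be illustrated with a toy example. The sketch assumes a hypothetical first-order discrete model qw[k+1] = a*qw[k] + b*r[k] in place of the mathematical model (131), ignores the dead time, and assumes the predicted fluctuation is available one sample ahead; none of these choices come from the disclosure.

```python
def forward_model(r, a, b, qw0=0.0):
    """Hypothetical first-order model: qw[k+1] = a * qw[k] + b * r[k]."""
    qw = [qw0]
    for rk in r:
        qw.append(a * qw[-1] + b * rk)
    return qw


def inverse_model(d_pred, a, b):
    """Input sequence rf whose forward-model response follows d_pred."""
    if b == 0.0:
        raise ValueError("b must be non-zero for the model to be invertible")
    return [(d_pred[k + 1] - a * d_pred[k]) / b
            for k in range(len(d_pred) - 1)]


a, b = 0.8, 0.5                      # assumed model parameters
d_pred = [0.0, 1.0, 1.5, 1.0, 0.0]   # hypothetical predicted fluctuation d
rf = inverse_model(d_pred, a, b)     # feedforward compensation value
rb = [0.2] * len(rf)                 # hypothetical feedback manipulated variable
r = [rb_k - rf_k for rb_k, rf_k in zip(rb, rf)]  # r = rb - rf

# Because the assumed model is linear, the response to r plus the fluctuation
# equals the response to rb alone, i.e. the fluctuation is cancelled.
compensated = [qw_k + d_k for qw_k, d_k in zip(forward_model(r, a, b), d_pred)]
undisturbed = forward_model(rb, a, b)
```

Under these assumptions `compensated` and `undisturbed` agree up to rounding, which is the effect the subtraction of rf from rb is meant to achieve.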

[Configuration 4]
A program for generating teacher data for machine learning for a prediction model (Mp) that predicts a specific value required for generating a feedforward compensation value (rf) of a feedback manipulated variable (rb) for a control target (200) subjected to a disturbance (ds),
The feedback manipulated variable (rb) is determined by feedback control of the control device (100) of the controlled object (200) based on an error between a target value (qr) and a controlled variable (q) of the controlled object (200) so that the controlled variable (q) approaches the target value (qr);
A value obtained by subtracting the feedforward compensation value (rf) from the feedback manipulated variable (rb) is output from the control device (100) to the controlled object (200) as a manipulated variable (r);
The controlled object (200) is represented by a control element (210) represented as a mathematical model (131) that receives the manipulated variable (r), a disturbance element (220) that receives the disturbance (ds), and a dead time element (230),
The control element (210) receives the manipulated variable (r) and outputs a primary controlled variable (qw);
The disturbance element (220) receives the disturbance (ds) and outputs a fluctuation amount (d);
The dead time element (230) delays the sum of the primary controlled variable (qw) and the fluctuation variable (d) from being output to the outside of the controlled object (200);
The prediction model (Mp) predicts the fluctuation amount (d) as the specific value from a measurement value (do) of the disturbance (ds),
The program, when executed by a processor,
Acquire the measured value (do), the manipulated variable (r), and the controlled variable (q);
The time of the control amount (q) is returned by a specific time (Lh),
generating a first estimate (qwh) of the primary controlled variable (qw) from the manipulated variable (r) using the mathematical model (131);
generating a second estimate (dh) of the fluctuation amount (d) by subtracting the first estimate (qwh) at the time of the controlled variable (q) from the controlled variable (q);
A program for generating teacher data that outputs teacher data (Dt) in which the second estimated value (dh) at the time of the measurement value (do) corresponds to the measurement value (do) as a correct answer.

[Configuration 5]
A method for generating teacher data for machine learning for a prediction model (Mp) that predicts a specific value required for generating a feedforward compensation value (rf) of a feedback manipulated variable (rb) for a control target (200) subjected to a disturbance (ds), comprising:
The feedback manipulated variable (rb) is determined by feedback control of the control device (100) of the controlled object (200) based on an error between a target value (qr) and a controlled variable (q) of the controlled object (200) so that the controlled variable (q) approaches the target value (qr);
A value obtained by subtracting the feedforward compensation value (rf) from the feedback manipulated variable (rb) is output from the control device (100) to the controlled object (200) as a manipulated variable (r);
The controlled object (200) is represented by a control element (210) represented as a mathematical model (131) that receives the manipulated variable (r), a disturbance element (220) that receives the disturbance (ds), and a dead time element (230),
The control element (210) receives the manipulated variable (r) and outputs a primary controlled variable (qw);
The disturbance element (220) receives the disturbance (ds) and outputs a fluctuation amount (d);
The dead time element (230) delays the sum of the primary controlled variable (qw) and the fluctuation variable (d) from being output to the outside of the controlled object (200);
The prediction model (Mp) predicts the fluctuation amount (d) as the specific value from a measurement value (do) of the disturbance (ds),
The method comprises:
Obtaining the measured value (do), the manipulated variable (r), and the controlled variable (q);
generating a first estimate (qwh) of the primary controlled variable (qw) from the manipulated variable (r) using a mathematical model (131) of the control element (210);
generating a second estimate (dh) of the fluctuation amount (d) based on the controlled variable (q) and the first estimate (qwh) at a time that is a specific time (Lh) back from the time when the controlled variable (q) is acquired;
A method for generating teacher data, comprising: outputting teacher data (Dt) in which the measurement value (do) corresponds to the second estimated value (dh) at the time when the measurement value (do) was obtained as a correct answer.

The embodiments disclosed herein are intended to be combined as appropriate within the scope of compatibility. The embodiments disclosed herein should be considered in all respects as illustrative and not restrictive. The scope of the present invention is defined by the claims, not the above description, and is intended to include all modifications within the meaning and scope of the claims.

2, 3 Control system, 11 Feedback control device, 12 Feedforward compensation device, 13 Learning data generation device, 14 Storage device, 17 Learning device, 20 Field network, 32 Upper network, 100, 100A, 300 Control device, 110 Feedback control unit, 120 Feedforward compensation unit, 121 Inverse model, 122 Local bus, 130 Learning data generation unit, 131 Mathematical model, 132 Time converter, 133, 150, 160 Subtractor, 134 Output generator, 140 Storage unit, 170 Learning unit, 200 Control target, 200C Field device, 210 Control element, 220 Disturbance element, 222 Servo motor, 230 Dead time element, 240 Adder, 302 Processor, 304 Main memory, 306 Upper network controller, 308 Field network controller, 312 Memory card interface, 314 Memory card, 316 Local bus controller, 318 Processor bus, 360 Storage, 370 USB controller, 380 Functional unit, 400 Support device, 500 Display device, 600 Server device, 700 Gateway, 900 Internet, Dt teacher data, L dead time, Lh estimated dead time, Mp prediction model, Pc control program, d fluctuation amount, dh estimated fluctuation amount, do measured value, ds disturbance, eq error, q controlled variable, qo secondary controlled variable, qoh estimated secondary controlled variable, qr target value, qw primary controlled variable, qwh estimated primary controlled variable, r manipulated variable, rb feedback manipulated variable, rf feedforward compensation value.

Claims

An apparatus for machine learning for a predictive model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulated variable for a control target subjected to a disturbance, comprising:
the feedback manipulated variable is determined by feedback control of a control device of the controlled object based on an error between a target value and a controlled variable of the controlled object so that the controlled variable approaches the target value;
a value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable;
the controlled object is represented by a control element represented as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element;
The control element receives the manipulated variable and outputs a primary controlled variable;
The disturbance element outputs a fluctuation amount in response to the disturbance,
the dead time element delays the sum of the primary controlled variable and the fluctuation variable from being output to an outside of the controlled object;
the prediction model predicts the amount of fluctuation as the specific value from the measured value of the disturbance;
The device includes a learning data generation unit,
The learning data generation unit
Acquiring the measured value, the manipulated variable, and the controlled variable;
generating a first estimate of the primary controlled variable from the manipulated variable using the mathematical model;
generating a second estimate of the amount of variation based on the controlled variable and the first estimate at a time that is a specific time behind the time at which the controlled variable was acquired;
and outputting teacher data in which the second estimate at the time when the measurement value was acquired corresponds to the measurement value as a correct answer.
A learning unit that performs the machine learning on the prediction model using the teacher data,
The device according to claim 1 , wherein the learning unit approximates the relationship between the measurement value and the second estimated value, as represented by the prediction model, as a function having the second estimated value as a response variable and the measurement value as an explanatory variable.
The device according to claim 1 or 2, further comprising a feedforward compensation unit that generates the feedforward compensation value from the amount of variation predicted by the prediction model using an inverse model of the mathematical model.
A program for generating teacher data for machine learning for a prediction model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulated variable for a control target subjected to a disturbance, comprising:
the feedback manipulated variable is determined by feedback control of a control device of the controlled object based on an error between a target value and a controlled variable of the controlled object so that the controlled variable approaches the target value;
a value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable;
the controlled object is represented by a control element represented as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element;
The control element receives the manipulated variable and outputs a primary controlled variable;
The disturbance element outputs a fluctuation amount in response to the disturbance,
the dead time element delays the sum of the primary controlled variable and the fluctuation variable from being output to an outside of the controlled object;
the prediction model predicts the amount of fluctuation as the specific value from the measured value of the disturbance;
The program for generating teacher data is executed by a processor,
Acquiring the measured value, the manipulated variable, and the controlled variable;
generating a first estimate of the primary controlled variable from the manipulated variable using the mathematical model;
generating a second estimate of the amount of variation based on the controlled variable and the first estimate at a time that is a specific time behind the time at which the controlled variable was acquired;
a teacher data generating program that outputs teacher data in which the measurement value corresponds to the second estimated value at the time when the measurement value was acquired as a correct answer.
A method for generating training data for machine learning for a prediction model that predicts a specific value required for generating a feedforward compensation value of a feedback manipulated variable for a control target subjected to a disturbance, comprising:
the feedback manipulated variable is determined by feedback control of a control device of the controlled object based on an error between a target value and a controlled variable of the controlled object so that the controlled variable approaches the target value;
a value obtained by subtracting the feedforward compensation value from the feedback manipulated variable is output from the control device to the controlled object as a manipulated variable;
the controlled object is represented by a control element represented as a mathematical model that receives the manipulated variable, a disturbance element that receives the disturbance, and a dead time element;
The control element receives the manipulated variable and outputs a primary controlled variable;
The disturbance element outputs a fluctuation amount in response to the disturbance,
the dead time element delays the sum of the primary controlled variable and the fluctuation variable from being output to an outside of the controlled object;
the prediction model predicts the amount of fluctuation as the specific value from the measured value of the disturbance;
The method for generating teacher data includes:
Obtaining the measured values, the manipulated variables, and the controlled variables;
generating a first estimate of the primary controlled variable from the manipulated variable using the mathematical model;
generating a second estimate of the amount of variation based on the controlled variable and the first estimate at a time that is a specific time behind the time at which the controlled variable was obtained;
and outputting teacher data in which the second estimate value at the time when the measurement value was acquired corresponds to the measurement value as a correct answer.