
WO2022009419A1 - Learning device, utilization device, program, learning method, and utilization method - Google Patents


Info

Publication number
WO2022009419A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
subject
target
thermal image
visible
Prior art date
Application number
PCT/JP2020/027027
Other languages
French (fr)
Japanese (ja)
Inventor
康平 栗原 (Kohei Kurihara)
大祐 鈴木 (Daisuke Suzuki)
Original Assignee
三菱電機株式会社 (Mitsubishi Electric Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 三菱電機株式会社 (Mitsubishi Electric Corporation)
Priority to PCT/JP2020/027027 (WO2022009419A1)
Priority to JP2020552066A (JP6797344B1)
Publication of WO2022009419A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis

Definitions

  • This disclosure relates to learning devices, utilization devices, programs, learning methods, and utilization methods.
  • A general thermal infrared solid-state image sensor (hereinafter referred to as a thermal image sensor) visualizes incident infrared rays emitted by a subject; the difference in temperature rise caused by absorbing the infrared rays becomes the shading of the image. The infrared rays emitted by the subject are focused by a lens and imaged on the image sensor.
  • A thermal image sensor, which can acquire thermal information, can acquire information that cannot be acquired by a visible camera. However, with an inexpensive small sensor, image quality such as resolution, contrast, contour sharpness, or SN ratio is low, and a thermal image sensor formed with a large sensor is expensive.
  • There is a technique for generating a trained model by inputting a visible image or a distance image together with posture information (the correct answer) of a subject, and estimating posture information from a visible image or a distance image using the generated trained model (see, for example, Patent Document 1).
  • Patent Document 1 describes a posture estimation device that estimates a posture from a visible image or a distance image.
  • However, a thermal image has a problem in that image quality such as resolution or SN ratio is lower than that of a visible image or a distance image, so posture estimation is not easy.
  • Accordingly, one or more aspects of the present disclosure are intended to enable highly accurate estimation of the posture of a subject in a thermal image.
  • A learning device according to one aspect of the present disclosure is characterized by including a data acquisition unit that acquires learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject, and a model generation unit that generates, by learning inference from the thermal image to the visible image using the learning data, a trained model for inferring the visible image from the thermal image.
  • A learning device according to another aspect of the present disclosure is characterized by including a data acquisition unit that acquires learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and posture information indicating the posture of the subject, and a model generation unit that generates, by learning inference from the thermal image to the visible image and inference from the visible image to the posture using the learning data, a trained model for inferring the posture from the thermal image.
  • A learning device according to yet another aspect of the present disclosure is characterized by including a data acquisition unit that acquires learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject, a contour extraction unit that extracts a contour image showing the contour of the subject from the thermal image, and a model generation unit that generates, by learning inference from the combination of the thermal image and the contour image to the visible image, a trained model for inferring the visible image from the combination of the thermal image and the contour image.
  • A utilization device according to one aspect of the present disclosure is characterized by including a storage unit that stores a trained model for inferring a visible image from a thermal image, the trained model being generated by learning inference from the thermal image to the visible image using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject; a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; an inference unit that infers, using the trained model, a target visible image, which is a visible image of the target subject, from the target thermal image; and a posture estimation unit that estimates the posture of the target subject from the target visible image.
  • A utilization device according to another aspect of the present disclosure is characterized by including a storage unit that stores a trained model for inferring a posture from a thermal image, the trained model being generated by learning inference from the thermal image to the visible image and inference from the visible image to the posture using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and posture information indicating the posture of the subject; a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; and an inference unit that infers, using the trained model, the posture of the target subject from the target thermal image.
  • A utilization device according to yet another aspect of the present disclosure is characterized by including a storage unit that stores a trained model for inferring a visible image from a combination of a thermal image and a contour image, the trained model being generated by learning inference from the combination of the thermal image and the contour image to the visible image using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and a contour image showing the contour of the subject extracted from the thermal image; a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; a contour extraction unit that extracts, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject; an inference unit that infers, using the trained model, a target visible image, which is a visible image of the target subject, from the combination of the target thermal image and the target contour image; and a posture estimation unit that estimates the posture of the target subject from the target visible image.
  • A program according to one aspect of the present disclosure is characterized by causing a computer to function as a data acquisition unit that acquires learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject, and as a model generation unit that generates, by learning inference from the thermal image to the visible image using the learning data, a trained model for inferring the visible image from the thermal image.
  • A program according to another aspect of the present disclosure is characterized by causing a computer to function as a data acquisition unit that acquires learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and posture information indicating the posture of the subject, and as a model generation unit that generates, by learning inference from the thermal image to the visible image and inference from the visible image to the posture using the learning data, a trained model for inferring the posture from the thermal image.
  • A program according to yet another aspect of the present disclosure is characterized by causing a computer to function as a data acquisition unit that acquires learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject, as a contour extraction unit that extracts a contour image showing the contour of the subject from the thermal image, and as a model generation unit that generates, by learning inference from the combination of the thermal image and the contour image to the visible image, a trained model for inferring the visible image from the combination of the thermal image and the contour image.
  • A program according to yet another aspect of the present disclosure is characterized by causing a computer to function as a storage unit that stores a trained model for inferring a visible image from a thermal image, the trained model being generated by learning inference from the thermal image to the visible image using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject; as a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; as an inference unit that infers, using the trained model, a target visible image, which is a visible image of the target subject, from the target thermal image; and as a posture estimation unit that estimates the posture of the target subject from the target visible image.
  • A program according to yet another aspect of the present disclosure is characterized by causing a computer to function as a storage unit that stores a trained model for inferring a posture from a thermal image, the trained model being generated by learning inference from the thermal image to the visible image and inference from the visible image to the posture using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and posture information indicating the posture of the subject; as a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; and as an inference unit that infers, using the trained model, the posture of the target subject from the target thermal image.
  • A program according to yet another aspect of the present disclosure is characterized by causing a computer to function as a storage unit that stores a trained model for inferring a visible image from a combination of a thermal image and a contour image, the trained model being generated by learning inference from the combination of the thermal image and the contour image to the visible image using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and contour image data showing a contour image showing the contour of the subject extracted from the thermal image; as a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; as a contour extraction unit that extracts, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject; as an inference unit that infers, using the trained model, a target visible image, which is a visible image of the target subject, from the combination of the target thermal image and the target contour image; and as a posture estimation unit that estimates the posture of the target subject from the target visible image.
  • A learning method according to one aspect of the present disclosure is characterized by acquiring learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject, and generating, by learning inference from the thermal image to the visible image using the learning data, a trained model for inferring the visible image from the thermal image.
  • A learning method according to another aspect of the present disclosure is characterized by acquiring learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and posture information indicating the posture of the subject, and generating, by learning inference from the thermal image to the visible image and inference from the visible image to the posture using the learning data, a trained model for inferring the posture from the thermal image.
  • A learning method according to yet another aspect of the present disclosure is characterized by acquiring learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject, extracting a contour image showing the contour of the subject from the thermal image, and generating, by learning inference from the combination of the thermal image and the contour image to the visible image, a trained model for inferring the visible image from the combination of the thermal image and the contour image.
  • A utilization method according to one aspect of the present disclosure is characterized by acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject; inferring, from the target thermal image, a target visible image, which is a visible image of the target subject, using a trained model for inferring a visible image from a thermal image, the trained model being generated by learning inference from the thermal image to the visible image using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject; and estimating the posture of the target subject from the target visible image.
  • A utilization method according to another aspect of the present disclosure is characterized by acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject, and inferring the posture of the target subject from the target thermal image using a trained model for inferring a posture from a thermal image, the trained model being generated by learning inference from the thermal image to the visible image and inference from the visible image to the posture using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and posture information indicating the posture of the subject.
  • A utilization method according to yet another aspect of the present disclosure is characterized by acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject; extracting, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject; inferring, from the combination of the target thermal image and the target contour image, a target visible image, which is a visible image of the target subject, using a trained model for inferring a visible image from a combination of a thermal image and a contour image, the trained model being generated by learning inference from the combination to the visible image using learning data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject; and estimating the posture of the target subject from the target visible image.
  • According to the present disclosure, the posture of a subject in a thermal image can be estimated with high accuracy.
  • FIG. 1 is a block diagram schematically showing the configuration of the posture estimation system according to Embodiments 1 and 2. FIG. 2 is a block diagram schematically showing the configuration of the learning device in Embodiments 1 and 2. FIG. 3 is a schematic diagram showing an example of a three-layer neural network. FIG. 4 is a schematic diagram showing an example of the structure of the trained model of the image conversion process that converts a thermal image into a visible image in Embodiment 1. FIG. 5 is a block diagram schematically showing the configuration of a computer. FIG. 6 is a flowchart showing the process by which the learning device learns. FIG. 7 is a block diagram schematically showing the configuration of the posture estimation device in Embodiment 1. Further figures show: the configuration of the posture estimation device according to Embodiment 2; the configuration of the learning device in Embodiment 3; an example of the structure of the trained model of the image conversion process that converts a thermal image and a contour image into a visible image in Embodiment 3; and the configuration of the posture estimation device according to Embodiment 3.
  • FIG. 1 is a block diagram schematically showing the configuration of the posture estimation system 100 according to the first embodiment.
  • the posture estimation system 100 includes a learning device 110 that functions as a model generation device, and a posture estimation device 130 that functions as a utilization device.
  • the processing method performed by the posture estimation device 130 is a utilization method.
  • the posture estimation device 130 estimates the posture using the trained model learned by the learning device 110.
  • FIG. 2 is a block diagram schematically showing the configuration of the learning device 110.
  • the learning device 110 includes a learning side input unit 111, a learning side data acquisition unit 112, a model generation unit 113, a learning side learned model storage unit 114, and a learning side communication unit 115.
  • the learning side input unit 111 is an input unit that accepts input of learning data.
  • the input learning data is given to the learning side data acquisition unit 112.
  • The learning data is teacher data showing a combination of a thermal image and a visible image serving as the correct answer to be inferred from that thermal image.
  • the thermal image is acquired by imaging the temperature distribution of the subject by using the infrared rays radiated from the subject. Further, the visible image is acquired by imaging the subject by using the visible light reflected from the subject. In the visible image, the appearance of the subject is imaged.
  • the learning side data acquisition unit 112 is a data acquisition unit that acquires learning data via the learning side input unit 111.
  • the acquired learning data is given to the model generation unit 113.
  • The model generation unit 113 learns the visible image corresponding to the thermal image based on the learning data given from the learning side data acquisition unit 112. In other words, the model generation unit 113 generates a trained model for inferring the optimum visible image corresponding to a thermal image by learning the combinations of thermal images and visible images shown in the learning data. Specifically, the model generation unit 113 generates a trained model for inferring a visible image from a thermal image by learning inference from the thermal image to the visible image using the learning data. The model generation unit 113 then stores the generated trained model in the learning side trained model storage unit 114 as the learning side trained model.
  • As the learning algorithm used by the model generation unit 113, known algorithms such as supervised learning, unsupervised learning, and reinforcement learning can be used. As an example, the case where a neural network is applied is described here.
  • When supervised learning is used, the thermal image and the visible image shown in the learning data need to be paired data containing the same subject; with other learning methods, the thermal image and the visible image do not have to contain the same subject.
  • the model generation unit 113 learns a visible image corresponding to a thermal image by so-called supervised learning according to, for example, a neural network model.
  • Supervised learning refers to a method in which sets of input data and result (label) data are given to a learning device as learning data, so that the device learns the features of the learning data and infers the result from the input.
  • a neural network is composed of an input layer consisting of a plurality of neurons, an intermediate layer (hidden layer) consisting of a plurality of neurons, and an output layer consisting of a plurality of neurons.
  • the intermediate layer may be one layer or two or more layers.
  • FIG. 3 is a schematic diagram showing an example of a three-layer neural network.
  • In the three-layer neural network shown in FIG. 3, when input values are input to the input layers X1 to X3, they are multiplied by the first weights w11 to w16 (hereinafter also referred to as the first weight W1), and the calculated values, which are the values obtained by multiplying the input values by the first weights w11 to w16, are input to the intermediate layers Y1 and Y2. The calculated values are then multiplied by the second weights w21 to w26 (hereinafter also referred to as the second weight W2), and the output values, which are the values obtained by multiplying the calculated values by the second weights w21 to w26, are output from the output layers Z1 to Z3. These output values vary depending on the value of the first weight W1 and the value of the second weight W2.
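The forward pass just described can be sketched as follows. The layer sizes (three inputs X1 to X3, two intermediate neurons Y1 and Y2, three outputs Z1 to Z3) follow the description of FIG. 3, but the weight values are arbitrary placeholders and the linear (identity) activation is an illustrative assumption.

```python
# Illustrative forward pass for the three-layer network of FIG. 3.
# Weight values are placeholders, not taken from the patent.

def forward(x, w1, w2):
    """x: 3 input values; w1: 3x2 first weights (w11..w16); w2: 2x3 second weights (w21..w26)."""
    # Intermediate layer: each Yj sums the inputs multiplied by the first weights.
    y = [sum(x[i] * w1[i][j] for i in range(3)) for j in range(2)]
    # Output layer: each Zk sums the intermediate values multiplied by the second weights.
    z = [sum(y[j] * w2[j][k] for j in range(2)) for k in range(3)]
    return z

w1 = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]   # first weight W1 (w11..w16)
w2 = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]     # second weight W2 (w21..w26)
print(forward([1.0, 0.5, -0.5], w1, w2))
```

Changing any entry of `w1` or `w2` changes the output values, which is exactly the property the training step exploits.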
  • In the present embodiment, the neural network learns a trained model for inferring the optimum visible image for a thermal image by so-called supervised learning, according to learning data created based on the combinations of thermal images and visible images represented by the learning data acquired by the learning side data acquisition unit 112. That is, the neural network learns the trained model by inputting the thermal image to the input layer and adjusting the first weight W1 and the second weight W2 so that the result output from the output layer approaches the visible image serving as the correct answer.
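A minimal sketch of this weight adjustment is the gradient-descent loop below, which nudges a single weight so that the output approaches a target value. The scalar setup, learning rate, and squared-error loss are illustrative stand-ins for the image-to-image training described above, not the patent's actual procedure.

```python
# Toy weight adjustment: gradient descent on a squared error drives the
# output toward the correct answer, analogous to adjusting W1 and W2.

def train(x, target, w=0.0, lr=0.1, steps=100):
    for _ in range(steps):
        out = w * x                     # forward pass
        grad = 2 * (out - target) * x   # derivative of (out - target)^2 w.r.t. w
        w -= lr * grad                  # move the weight toward the correct answer
    return w

w = train(x=2.0, target=3.0)
print(w, w * 2.0)  # the output w*x approaches the target 3.0
```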
  • FIG. 4 is a schematic diagram showing an example of the structure of a trained model of an image conversion process for converting a thermal image into a visible image.
  • As shown in FIG. 4, the layers of the decoder portion and the layers of the encoder portion have a symmetrical structure and are connected by skip connections, forming a U-Net structure.
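The U-Net data flow, a symmetric encoder/decoder whose matching stages are joined by skip connections, can be illustrated roughly as follows. Real U-Nets use learned convolutions; this sketch substitutes average pooling and nearest-neighbour upsampling so that only the skip-connection data flow is shown, and all shapes and values are invented for illustration.

```python
import numpy as np

# Toy sketch of the U-Net idea in FIG. 4: the encoder output is upsampled
# in the decoder, and the matching encoder features are concatenated in
# via a skip connection.

def encode(x):
    # 2x2 average pooling halves the spatial resolution (a stand-in encoder stage).
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def decode(z, skip):
    # Nearest-neighbour upsampling restores the resolution, then the skip
    # connection concatenates the matching encoder features (stacked here).
    up = z.repeat(2, axis=0).repeat(2, axis=1)
    return np.concatenate([up, skip], axis=0)

x = np.arange(16, dtype=float).reshape(4, 4)   # stand-in "thermal image"
z = encode(x)                                  # bottleneck features, 2x2
y = decode(z, skip=x)                          # decoder input with skip, 8x4
print(z.shape, y.shape)
```

The skip connection is what lets the decoder recover fine spatial detail that the pooling in the encoder would otherwise discard.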
  • the learning side trained model storage unit 114 stores the learning side trained model which is the trained model given by the model generation unit 113.
  • The learning side communication unit 115 transmits the learning side trained model stored in the learning side trained model storage unit 114 to the posture estimation device 130.
  • the learning device 110 described above can be realized by a computer 160 as shown in FIG.
  • FIG. 5 is a block diagram schematically showing the configuration of the computer 160.
  • the computer 160 includes a communication device 161, an auxiliary storage device 162, a memory 163, and a processor 164.
  • the communication device 161 communicates data via a network, for example.
  • the auxiliary storage device 162 stores data and programs necessary for processing in the computer 160.
  • the memory 163 temporarily stores programs and data and provides a work area for the processor 164.
  • the processor 164 reads the program stored in the auxiliary storage device 162 into the memory 163, and executes the program to execute the processing in the computer 160.
  • the learning side input unit 111 and the learning side communication unit 115 described above can be realized by the communication device 161.
  • The learning side trained model storage unit 114 can be realized by the auxiliary storage device 162.
  • the learning side data acquisition unit 112 and the model generation unit 113 can be realized by the processor 164 executing the program read into the memory 163.
  • a program may be provided through a network, or may be recorded and provided on a recording medium. That is, such a program may be provided, for example, as a program product.
  • FIG. 6 is a flowchart showing a process of learning by the learning device 110.
  • the learning side data acquisition unit 112 acquires learning data via the learning side input unit 111 (S10).
  • The thermal image data, which is the image data of the thermal image, and the visible image data, which is the image data of the visible image, are acquired as the learning data. As long as each piece of thermal image data can be associated with the visible image data used as its correct answer, the two may be acquired at different timings.
  • the acquired learning data is given to the model generation unit 113.
  • Next, the model generation unit 113 generates a trained model by learning, through so-called supervised learning, the visible image that is the output corresponding to the thermal image, based on the combinations of thermal images and visible images shown in the learning data (S11). The learning side trained model storage unit 114 then stores the generated trained model (S12), and the learning side communication unit 115 transmits the trained model to the posture estimation device 130.
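The learning flow of S10 to S12 might be sketched as follows. The class and method names are hypothetical, and a trivial lookup table stands in for the neural network training described above.

```python
# Hypothetical sketch of the learning device's flow: acquire learning data
# (S10), generate a trained model from it (S11), and store the result (S12).
# Names and the stand-in "model" are illustrative, not from the patent.

class LearningDevice:
    def __init__(self):
        self.model_store = {}

    def acquire(self, pairs):
        # S10: learning data as (thermal, visible) pairs.
        self.learning_data = pairs
        return self.learning_data

    def generate_model(self):
        # S11: a lookup from thermal to visible samples stands in for
        # the supervised neural-network training.
        return {thermal: visible for thermal, visible in self.learning_data}

    def store(self, model):
        # S12: store the generated trained model.
        self.model_store["trained"] = model

device = LearningDevice()
device.acquire([("t1", "v1"), ("t2", "v2")])
device.store(device.generate_model())
print(device.model_store["trained"]["t1"])
```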
  • FIG. 7 is a block diagram schematically showing the configuration of the posture estimation device 130.
  • the posture estimation device 130 includes an inference device 140 and a posture estimation execution device 150 that functions as a posture estimation unit.
  • The inference device 140 infers a visible image from a thermal image by using the trained model given by the learning device 110 as the inference side trained model.
  • the inference device 140 includes an inference side communication unit 141, an inference side learned model storage unit 142, an inference side input unit 143, an inference side data acquisition unit 144, and an inference unit 145.
  • the inference side communication unit 141 receives the trained model from the learning device 110, and stores the trained model as the inference side trained model in the inference side trained model storage unit 142.
  • the inference side trained model storage unit 142 is a storage unit that stores the inference side trained model.
  • the inference side input unit 143 is an input unit that accepts input of thermal image data indicating a thermal image of a subject.
  • the thermal image data input here is also referred to as target thermal image data.
  • the thermal image shown by the target thermal image data is also referred to as a target thermal image, and the subject included in the target thermal image, which is the target for estimating the posture, is also referred to as the target subject.
  • the inference side data acquisition unit 144 is a data acquisition unit that acquires target thermal image data via the inference side input unit 143. The acquired target thermal image data is given to the inference unit 145.
  • the inference unit 145 infers a visible image of the target subject from the thermal image shown by the target thermal image data by using the inference side learned model stored in the inference side learned model storage unit 142.
  • The inference unit 145 can input the thermal image indicated by the target thermal image data into the inference side trained model and acquire the visible image inferred from that thermal image.
  • the inference unit 145 generates visible image data indicating the inferred visible image, and gives the visible image data to the posture estimation execution device 150.
  • the visible image data generated here is also referred to as target visible image data.
  • The visible image shown by the target visible image data, in other words, the inferred visible image, is also referred to as the target visible image.
  • the posture estimation execution device 150 estimates the posture of the subject existing in the visible image from the visible image indicated by the target visible image data.
  • As a method of estimating the posture, there is a method in which a large number of correspondences between visible images and human postures (for example, the positional relationships of body parts) are learned in advance, and when a visible image is input, the posture of the person corresponding to that visible image is determined based on the learning result.
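This pre-learned-correspondence approach can be illustrated with a nearest-neighbour lookup over learned (feature, posture) pairs. The feature vectors and posture labels below are invented for illustration and do not come from the patent.

```python
# Sketch of posture estimation from pre-learned correspondences: an input
# image's features are matched to the closest learned example, and that
# example's posture is returned. All data here is illustrative.

def estimate_posture(features, learned):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Nearest neighbour over the pre-learned (feature, posture) pairs.
    return min(learned, key=lambda pair: dist(features, pair[0]))[1]

learned = [
    ((0.0, 0.0), "standing"),
    ((1.0, 0.0), "sitting"),
    ((0.0, 1.0), "lying"),
]
print(estimate_posture((0.1, 0.9), learned))
```

Practical estimators learn richer correspondences (for example, keypoint positions of body parts) rather than a raw nearest-neighbour table, but the determine-from-learned-correspondences structure is the same.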
  • the posture estimation device 130 described above can also be realized by a computer 160 as shown in FIG.
  • the inference side communication unit 141 and the inference side input unit 143 can be realized by the communication device 161.
  • the inference side learned model storage unit 142 can be realized by the auxiliary storage device 162.
  • the inference side data acquisition unit 144 and the inference unit 145 can be realized by the processor 164 executing the program read into the memory 163.
  • a program may be provided through a network, or may be recorded and provided on a recording medium. That is, such a program may be provided, for example, as a program product.
  • FIG. 8 is a flowchart showing a process in which the posture estimation device 130 infers a visible image corresponding to a thermal image and estimates a posture from the visible image.
  • the inference side data acquisition unit 144 acquires the target thermal image data showing the thermal image via the inference side input unit 143 (S20).
  • the acquired target thermal image data is given to the inference unit 145.
  • The inference unit 145 inputs the thermal image shown by the target thermal image data into the inference side trained model stored in the inference side trained model storage unit 142, and obtains a visible image corresponding to the thermal image (S21).
  • The inference unit 145 generates target visible image data indicating the visible image corresponding to the thermal image obtained by the inference side trained model, and gives the target visible image data to the posture estimation execution device 150 (S22).
  • The posture estimation execution device 150 estimates the posture of the subject in the visible image indicated by the target visible image data (S23). Based on the posture estimated in this way, it is possible, for example, to detect abnormal behavior of the subject reflected in the thermal image.
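The inference flow of S20 to S23 might be sketched as follows; the function names and the stand-in model and estimator are hypothetical placeholders for the inference side trained model and the posture estimation execution device.

```python
# Hypothetical sketch of the posture estimation device's flow: a target
# thermal image is converted to a visible image with the trained model
# (S21), and the posture is estimated from that visible image (S23).

def infer_and_estimate(thermal, trained_model, posture_estimator):
    visible = trained_model(thermal)    # S21: thermal -> visible inference
    return posture_estimator(visible)   # S23: posture from the visible image

# Stand-in model and estimator for demonstration only.
trained_model = lambda t: f"visible({t})"
posture_estimator = lambda v: "standing" if "person" in v else "unknown"
print(infer_and_estimate("person-thermal", trained_model, posture_estimator))
```

The point of this pipeline is that `posture_estimator` only ever sees visible images, which is why an existing visible-image posture estimator can be reused unchanged.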
  • As described above, according to Embodiment 1, the thermal image output from a thermal image sensor or the like is converted into a visible image, and the posture of the subject in the thermal image can be estimated using the posture estimation execution device 150, which is a trained posture estimator for visible images. It is therefore possible to estimate the posture using an existing trained posture estimator for visible images.
  • In the first embodiment, the case where supervised learning is applied to the learning algorithm used by the model generation unit 113 has been described, but the first embodiment is not limited to such an example. As the learning algorithm, reinforcement learning, unsupervised learning, semi-supervised learning, or the like can be used in addition to supervised learning.
  • The model generation unit 113 may learn the visible image corresponding to the thermal image according to learning data created for a plurality of posture estimation devices including the posture estimation device 130.
  • The model generation unit 113 may acquire learning data from a plurality of posture estimation devices used in the same area, or may collect learning data from a plurality of posture estimation devices operating independently in different areas and learn the visible image corresponding to the thermal image by using that data.
  • The model generation unit 113 can also add or remove, partway through, a posture estimation device from the targets from which learning data is collected. Further, the model generation unit 113 may apply a trained model that has learned the visible image corresponding to the thermal image for one posture estimation device to another posture estimation device, and retrain it on the visible image corresponding to the thermal image for that other posture estimation device, thereby updating the trained model.
  • The model generation unit 113 may also execute machine learning according to other known methods such as genetic programming, functional logic programming, or a support vector machine.
  • The learning device 110 and the inference device 140, which are used to learn the visible image corresponding to the thermal image of the posture estimation system 100, may be connected to the posture estimation execution device 150 via a network, for example. Further, the learning device 110, the inference device 140, or the posture estimation execution device 150 may exist on a cloud server.
  • In the above description, the learning device 110 and the posture estimation device 130 are separate devices; however, the learning device 110 may be provided in the posture estimation device 130. In that case, the learning side communication unit 115 and the inference side communication unit 141 become unnecessary, and the learning side learned model storage unit 114 and the inference side learned model storage unit 142 can be integrated as a single learned model storage unit.
  • In the above description, the posture estimation device 130 infers a visible image corresponding to a thermal image by using the trained model generated by the learning device 110; however, the posture estimation device 130 may acquire a trained model from the outside, such as from another system, and infer a visible image corresponding to a thermal image based on that trained model.
  • In the second embodiment, the posture estimation system 200 includes a learning device 210 and a posture estimation device 230.
  • The learning device 210 in the second embodiment includes a learning side input unit 111, a learning side data acquisition unit 212, a model generation unit 213, a learning side learned model storage unit 114, and a learning side communication unit 115.
  • The learning side input unit 111, the learning side learned model storage unit 114, and the learning side communication unit 115 of the learning device 210 according to the second embodiment are the same as the learning side input unit 111, the learning side learned model storage unit 114, and the learning side communication unit 115 of the learning device 110 according to the first embodiment.
  • The learning side data acquisition unit 212 acquires learning data via the learning side input unit 111.
  • The learning data acquired in the second embodiment includes thermal image data showing a thermal image, visible image data showing the visible image that is the correct answer corresponding to the thermal image, and posture information indicating the posture of the subject as the correct answer corresponding to the visible image.
  • The acquired learning data is given to the model generation unit 213.
  • The model generation unit 213 learns the visible image corresponding to the thermal image and the posture corresponding to the visible image based on the learning data given from the learning side data acquisition unit 212. In other words, the model generation unit 213 generates a trained model for inferring the optimum posture corresponding to the thermal image by learning the combination of the thermal image and the visible image shown in the training data, and the combination of the visible image and the posture. Specifically, the model generation unit 213 generates a trained model for inferring the posture from the thermal image by learning, using the learning data, the inference from the thermal image to the visible image and the inference from the visible image to the posture. The model generation unit 213 then stores the generated trained model in the learning side learned model storage unit 114 as the learning side trained model.
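The second embodiment's trained model can be thought of as a composition of two learned stages, thermal-to-visible and visible-to-posture, exposed to the caller as a single thermal-to-posture mapping. The toy sketch below illustrates only this composition idea; both stages are hypothetical stand-ins, not the patent's network.

```python
# Toy sketch of the second embodiment's idea: the trained model maps a
# thermal image to a posture by internally chaining thermal->visible and
# visible->posture inference, so only the posture label is output in the
# utilization phase. Both stages are illustrative stand-ins.

def thermal_to_visible(thermal_image):
    """First learned stage (stub): brighten the thermal image."""
    return [[px + 100 for px in row] for row in thermal_image]

def visible_to_posture(visible_image):
    """Second learned stage (stub): classify by mean brightness."""
    flat = [px for row in visible_image for px in row]
    mean = sum(flat) / len(flat)
    return "standing" if mean > 120 else "lying"

def trained_model(thermal_image):
    """End-to-end model: thermal image in, posture label out; the
    intermediate visible image is never exposed to the caller."""
    return visible_to_posture(thermal_to_visible(thermal_image))

posture = trained_model([[30, 31], [32, 33]])
```

Because the intermediate visible image stays inside `trained_model`, the utilization phase can skip generating and outputting it, which is the basis of the network-size and computation savings mentioned below.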
  • FIG. 9 is a block diagram schematically showing the configuration of the posture estimation device 230 according to the second embodiment.
  • The posture estimation device 230 includes an inference side communication unit 141, an inference side learned model storage unit 142, an inference side input unit 143, an inference side data acquisition unit 144, and an inference unit 245.
  • The inference side communication unit 141, the inference side learned model storage unit 142, the inference side input unit 143, and the inference side data acquisition unit 144 of the posture estimation device 230 according to the second embodiment are the same as the inference side communication unit 141, the inference side learned model storage unit 142, the inference side input unit 143, and the inference side data acquisition unit 144 of the posture estimation device 130 in the first embodiment.
  • The inference unit 245 infers a visible image from the thermal image indicated by the target thermal image data by using the inference side learned model stored in the inference side trained model storage unit 142, and infers the posture from the visible image. In other words, the inference unit 245 estimates the posture of the subject in the thermal image, inferred from the thermal image, by inputting the thermal image indicated by the target thermal image data into the inference side trained model.
  • The posture estimation device 230 described above can also be realized by a computer 160 as shown in FIG.
  • The inference side communication unit 141 and the inference side input unit 143 can be realized by the communication device 161.
  • The inference side learned model storage unit 142 can be realized by the auxiliary storage device 162.
  • The inference side data acquisition unit 144 and the inference unit 245 can be realized by the processor 164 executing a program read into the memory 163.
  • Such a program may be provided through a network, or may be recorded and provided on a recording medium; that is, it may be provided as a program product.
  • According to the posture estimation system 200, the posture of the subject can be estimated directly from the thermal image output from a thermal image sensor or the like. By inputting the visible image and the posture as teacher data at the time of learning, the work of annotating posture information on thermal images can be avoided.
  • Further, since no visible image is generated and output in the utilization phase, the scale of the network can be suppressed and the amount of calculation can be reduced.
  • In the third embodiment, the posture estimation system 300 includes a learning device 310 and a posture estimation device 330.
  • FIG. 10 is a block diagram schematically showing the configuration of the learning device 310.
  • The learning device 310 includes a learning side input unit 111, a learning side data acquisition unit 312, a model generation unit 313, a learning side learned model storage unit 114, a learning side communication unit 115, and a learning side contour extraction unit 316.
  • The learning side input unit 111, the learning side learned model storage unit 114, and the learning side communication unit 115 of the learning device 310 according to the third embodiment are the same as the learning side input unit 111, the learning side learned model storage unit 114, and the learning side communication unit 115 of the learning device 110 according to the first embodiment.
  • The learning side data acquisition unit 312 acquires learning data via the learning side input unit 111. The acquired learning data is given to the model generation unit 313. Further, the learning side data acquisition unit 312 gives the thermal image data indicating the thermal image included in the acquired learning data to the learning side contour extraction unit 316 as learning side thermal image data.
  • The learning side contour extraction unit 316 is a contour extraction unit that extracts a contour image showing the contour of the subject from the thermal image indicated by the learning side thermal image data.
  • As the extraction method, there are, for example, a method using an edge detection process such as the Canny method or the Sobel method, and a method combining a binarization process with edge detection.
  • In the edge detection process, the edges of the subject are detected. In the combination of the binarization process and edge detection, the edge detection process may be performed after the binarization process is applied to the thermal image. The learning side contour extraction unit 316 then gives contour image data indicating the extracted contour image to the model generation unit 313 as learning side contour image data.
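The "binarization followed by edge detection" variant described above can be sketched with elementary operations. The threshold value and the tiny thermal image below are made-up illustration data, and the 4-neighbour edge test is a deliberately simple stand-in for a Canny or Sobel detector.

```python
# Minimal sketch of the "binarization + edge detection" contour
# extraction. The threshold and the 5x5 "thermal image" are illustrative
# only; a real implementation would use a proper edge detector
# (e.g. Canny or Sobel) on the binarized image.

def binarize(image, threshold):
    """Mark pixels at or above the threshold as foreground (1)."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]

def extract_contour(image, threshold):
    """Binarize, then keep foreground pixels that touch the background
    (a simple 4-neighbour edge test) as the contour."""
    mask = binarize(image, threshold)
    h, w = len(mask), len(mask[0])
    contour = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            # A foreground pixel is on the contour if any 4-neighbour
            # is background or lies outside the image.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    contour[y][x] = 1
                    break
    return contour

# A warm blob (the "subject") on a cooler background.
thermal = [
    [20, 20, 20, 20, 20],
    [20, 36, 36, 36, 20],
    [20, 36, 36, 36, 20],
    [20, 36, 36, 36, 20],
    [20, 20, 20, 20, 20],
]
contour = extract_contour(thermal, threshold=30)
```

Only the boundary pixels of the warm blob survive in `contour`; the blob's interior pixel and the background are zero, which is exactly the outline information the model generation unit 313 receives as learning side contour image data.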
  • The model generation unit 313 learns the visible image corresponding to the thermal image based on the learning data given by the learning side data acquisition unit 312 and the learning side contour image data given by the learning side contour extraction unit 316. In other words, the model generation unit 313 generates a trained model for inferring the optimum visible image corresponding to the combination of the thermal image and the contour image by learning the combination of the thermal image shown by the training data, the contour image shown by the learning side contour image data, and the visible image shown by the training data. Specifically, the model generation unit 313 generates a trained model for inferring a visible image from a combination of a thermal image and a contour image by learning the inference from the combination of the thermal image and the contour image to the visible image. The model generation unit 313 then stores the generated trained model in the learning side learned model storage unit 114 as the learning side trained model.
  • FIG. 11 is a schematic diagram showing an example of the structure of a trained model for an image conversion process that converts a thermal image and a contour image into a visible image in the third embodiment.
  • The trained model shown in FIG. 11 has a U-Net structure in which the layers of the decoder portion and the layers of the encoder portion are symmetrical and are connected by skip connections.
  • The decoder portion comprises two parallel paths: one path decodes the thermal image and the other decodes the contour image.
  • The two vectors obtained at the center layer of the model are concatenated, and the concatenated information is input to the encoder portion.
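The dual-path structure of FIG. 11 can be sketched numerically: two separate paths each reduce their input image to a feature vector, the vectors are concatenated at the center layer, and the remaining half of the model maps the combined vector back to an image-sized output. The sizes and "weights" (plain averaging and copying) below are illustrative only, not the patent's network; a real implementation would use a deep-learning framework with convolutional layers and skip connections.

```python
# Schematic sketch of the two-path bottleneck in FIG. 11: the thermal
# image and the contour image are reduced to feature vectors by separate
# paths, concatenated at the center layer, and the combined vector is
# mapped back to an image-sized output. All operations are illustrative
# stand-ins for the model's learned layers.

def encode_path(image):
    """Reduce an image to a per-row feature vector (stand-in for one
    downsampling path of the model)."""
    return [sum(row) / len(row) for row in image]

def bottleneck_concat(thermal_feat, contour_feat):
    """Concatenate the two feature vectors at the center layer."""
    return thermal_feat + contour_feat

def decode(features, height, width):
    """Expand the concatenated features back to an image-sized grid
    (stand-in for the upsampling half of the model)."""
    mean = sum(features) / len(features)
    return [[mean] * width for _ in range(height)]

thermal = [[1.0, 2.0], [3.0, 4.0]]
contour = [[0.0, 1.0], [1.0, 0.0]]

z = bottleneck_concat(encode_path(thermal), encode_path(contour))
output = decode(z, height=2, width=2)
```

The point illustrated is only the data flow: both inputs contribute to the single bottleneck vector `z`, so the generated output is conditioned on the contour information as well as on the thermal information.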
  • The learning device 310 described above can also be realized by a computer 160 as shown in FIG.
  • The learning side data acquisition unit 312, the model generation unit 313, and the learning side contour extraction unit 316 can be realized by the processor 164 executing a program read into the memory 163.
  • Such a program may be provided through a network, or may be recorded and provided on a recording medium; that is, it may be provided as a program product.
  • FIG. 12 is a block diagram schematically showing the configuration of the posture estimation device 330.
  • The posture estimation device 330 includes an inference device 340 and the posture estimation execution device 150.
  • The posture estimation execution device 150 of the posture estimation device 330 in the third embodiment is the same as the posture estimation execution device 150 in the first embodiment.
  • The inference device 340 includes an inference side communication unit 141, an inference side learned model storage unit 142, an inference side input unit 143, an inference side data acquisition unit 344, an inference unit 345, and an inference side contour extraction unit 346.
  • The inference side communication unit 141, the inference side learned model storage unit 142, and the inference side input unit 143 of the inference device 340 in the third embodiment are the same as the inference side communication unit 141, the inference side learned model storage unit 142, and the inference side input unit 143 of the inference device 140 in the first embodiment.
  • The inference side data acquisition unit 344 acquires the target thermal image data via the inference side input unit 143, and gives the acquired target thermal image data to the inference unit 345 and the inference side contour extraction unit 346.
  • The inference side contour extraction unit 346 is a contour extraction unit that extracts a contour image from the thermal image indicated by the target thermal image data. The extraction method is the same as that of the learning side contour extraction unit 316.
  • The inference side contour extraction unit 346 gives contour image data indicating the extracted contour image to the inference unit 345 as inference side contour image data.
  • The contour image extracted here is also referred to as a target contour image, and the inference side contour image data is also referred to as target contour image data.
  • The inference unit 345 infers a visible image from the combination of the thermal image indicated by the target thermal image data and the contour image indicated by the inference side contour image data, using the inference side trained model stored in the inference side trained model storage unit 142.
  • In other words, by inputting the thermal image indicated by the target thermal image data and the contour image indicated by the inference side contour image data into the inference side trained model, the inference unit 345 can acquire the visible image corresponding to the thermal image, inferred from the thermal image.
  • The inference unit 345 generates visible image data indicating the inferred visible image, and gives the visible image data to the posture estimation execution device 150.
  • The visible image data generated here is also referred to as target visible image data, and the visible image indicated by the target visible image data is also referred to as a target visible image.
  • The posture estimation device 330 described above can also be realized by a computer 160 as shown in FIG.
  • The inference side data acquisition unit 344, the inference unit 345, and the inference side contour extraction unit 346 can be realized by the processor 164 executing a program read into the memory 163.
  • Such a program may be provided through a network, or may be recorded and provided on a recording medium; that is, it may be provided as a program product.
  • In general, a thermal image has ambiguous contour information, so a visible image generated from it using a trained model also has ambiguous contours. Since contour information is important for posture estimation, the accuracy of posture estimation is reduced for images with ambiguous contours.
  • In the posture estimation system 300 according to the third embodiment, by inputting the thermal image and the contour image into the trained model simultaneously, a visible image whose contours are not ambiguous can be generated. As a result, the posture estimation accuracy from the generated visible image can be improved compared with inputting the thermal image alone into the trained model.
  • 100, 200, 300 posture estimation system; 110, 210, 310 learning device; 111 learning side input unit; 112, 212, 312 learning side data acquisition unit; 113, 213, 313 model generation unit; 114 learning side learned model storage unit; 115 learning side communication unit; 316 learning side contour extraction unit; 130, 230, 330 posture estimation device; 140, 340 inference device; 141 inference side communication unit; 142 inference side learned model storage unit; 143 inference side input unit; 144, 344 inference side data acquisition unit; 145, 245, 345 inference unit; 346 inference side contour extraction unit; 150 posture estimation execution device.


Abstract

This learning device is characterized by comprising: a learning-side data acquisition unit (112) that acquires learning data including a thermal image, which images the temperature distribution of a subject using infrared rays emitted from the subject, and a visible image, which images the subject using visible light reflected from the subject; and a model generation unit (113) that generates a trained model for inferring the visible image from the thermal image by learning the inference from the thermal image to the visible image using the learning data.

Description

Learning device, utilization device, program, learning method, and utilization method

The present disclosure relates to learning devices, utilization devices, programs, learning methods, and utilization methods.

A general thermal infrared solid-state image sensor (hereinafter referred to as a thermal image sensor) visualizes the incident infrared rays emitted by a subject: differences in the temperature rise caused by absorbing the infrared rays become the shades of the image. The infrared rays emitted by the subject are focused by a lens and imaged on the image sensor.

While a thermal image sensor, which can acquire thermal information, can obtain information that a visible camera cannot, an inexpensive small sensor, for example, yields low image resolution, contrast, contour sharpness, and SN ratio. Further, a thermal image sensor formed with a large sensor is expensive.

On the other hand, in fields such as in-home monitoring, smart buildings, and crime prevention, there are services that identify human behavior or posture and detect abnormal behavior. Human postures include standing (standing position), sitting (sitting position), and lying down (lying position). From the viewpoint of privacy protection, thermal image sensors have a lower barrier to introduction than visible cameras, which is advantageous.

Here, there is a technique for generating a trained model by using a visible image or a distance image and the posture information (correct answer) of a subject as inputs, and estimating posture information from a visible image or a distance image by using the generated trained model (see, for example, Patent Document 1).

Japanese Unexamined Patent Publication No. 2017-97577 (page 5)

Patent Document 1 describes a posture estimation device that estimates a posture from a visible image or a distance image. With this posture estimation device, there is the problem that a thermal image has lower image quality, such as resolution or SN ratio, than a visible image or a distance image, so posture estimation is not easy.

Therefore, one or more aspects of the present disclosure aim to enable the posture of a subject in a thermal image to be estimated with high accuracy.
A learning device according to one aspect of the present disclosure is characterized by comprising: a data acquisition unit that acquires learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject; and a model generation unit that generates a trained model for inferring the visible image from the thermal image by learning, using the learning data, the inference from the thermal image to the visible image.

A learning device according to one aspect of the present disclosure is characterized by comprising: a data acquisition unit that acquires learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, a visible image, which images the subject by using visible light reflected from the subject, and posture information indicating the posture of the subject; and a model generation unit that generates a trained model for inferring the posture from the thermal image by learning, using the learning data, the inference from the thermal image to the visible image and the inference from the visible image to the posture.

A learning device according to one aspect of the present disclosure is characterized by comprising: a data acquisition unit that acquires learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject; a contour extraction unit that extracts a contour image showing the contour of the subject from the thermal image; and a model generation unit that generates a trained model for inferring the visible image from the combination of the thermal image and the contour image by learning the inference from the combination of the thermal image and the contour image to the visible image.
A utilization device according to one aspect of the present disclosure is characterized by comprising: a storage unit that stores a trained model for inferring a visible image from a thermal image, the trained model having been generated by learning the inference from the thermal image to the visible image by using learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject; a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; an inference unit that infers, by using the trained model, a target visible image, which is a visible image of the target subject, from the target thermal image; and a posture estimation unit that estimates the posture of the target subject from the target visible image.

A utilization device according to one aspect of the present disclosure is characterized by comprising: a storage unit that stores a trained model for inferring a posture from a thermal image, the trained model having been generated by learning the inference from the thermal image to a visible image and the inference from the visible image to the posture by using learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, a visible image, which images the subject by using visible light reflected from the subject, and posture information indicating the posture of the subject; a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; and an inference unit that infers, by using the trained model, the posture of the target subject from the target thermal image.

A utilization device according to one aspect of the present disclosure is characterized by comprising: a storage unit that stores a trained model for inferring a visible image from the combination of a thermal image and a contour image, the trained model having been generated by learning the inference from the combination of the thermal image and the contour image to the visible image by using learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject, together with contour image data indicating a contour image, extracted from the thermal image, that shows the contour of the subject; a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; a contour extraction unit that extracts, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject; an inference unit that infers, by using the trained model, a target visible image, which is a visible image of the target subject, from the combination of the target thermal image and the target contour image; and a posture estimation unit that estimates the posture of the target subject from the target visible image.
 本開示の一態様に係るプログラムは、コンピュータを、被写体から放射される赤外線を利用することで、前記被写体の温度分布を画像化した熱画像と、前記被写体から反射される可視光を利用することで、前記被写体を画像化した可視画像とを含む学習用データを取得するデータ取得部、及び、前記学習用データを用いて前記熱画像から前記可視画像への推論を学習することで、前記熱画像から前記可視画像を推論するための学習済モデルを生成するモデル生成部、として機能させることを特徴とする。 The program according to one aspect of the present disclosure uses a computer to use an infrared image emitted from a subject to image a thermal image of the temperature distribution of the subject and visible light reflected from the subject. The heat is obtained by learning the inference from the thermal image to the visible image using the data acquisition unit that acquires the learning data including the visible image of the subject and the learning data. It is characterized in that it functions as a model generation unit that generates a trained model for inferring the visible image from an image.
 本開示の一態様に係るプログラムは、コンピュータを、被写体から放射される赤外線を利用することで、前記被写体の温度分布を画像化した熱画像と、前記被写体から反射される可視光を利用することで、前記被写体を画像化した可視画像と、前記被写体の姿勢を示す姿勢情報とを含む学習用データを取得するデータ取得部、及び、前記学習用データを用いて、前記熱画像から前記可視画像への推論及び前記可視画像から前記姿勢への推論を学習することで、前記熱画像から前記姿勢を推論するための学習済モデルを生成するモデル生成部、として機能させることを特徴とする。 The program according to one aspect of the present disclosure uses a computer to use an infrared image emitted from a subject to image a thermal image of the temperature distribution of the subject and visible light reflected from the subject. A data acquisition unit that acquires learning data including a visible image of the subject and posture information indicating the posture of the subject, and the visible image from the thermal image using the learning data. It is characterized in that it functions as a model generation unit that generates a trained model for inferring the attitude from the thermal image by learning the inference to the image and the inference from the visible image to the attitude.
 本開示の一態様に係るプログラムは、コンピュータを、被写体から放射される赤外線を利用することで、前記被写体の温度分布を画像化した熱画像と、前記被写体から反射される可視光を利用することで、前記被写体を画像化した可視画像とを含む学習用データを取得するデータ取得部、前記熱画像から前記被写体の輪郭を示す輪郭画像を抽出する輪郭抽出部、及び、前記熱画像及び前記輪郭画像の組み合わせから前記可視画像への推論を学習することで、前記熱画像及び前記輪郭画像の組み合わせから前記可視画像を推論するための学習済モデルを生成するモデル生成部、として機能させることを特徴とする。 The program according to one aspect of the present disclosure uses a computer to use an infrared image emitted from a subject to use a thermal image that images the temperature distribution of the subject and visible light reflected from the subject. A data acquisition unit that acquires learning data including a visible image of the subject, a contour extraction unit that extracts a contour image showing the contour of the subject from the thermal image, and the thermal image and the contour. By learning the inference from the combination of images to the visible image, it is characterized by functioning as a model generation unit that generates a trained model for inferring the visible image from the combination of the thermal image and the contour image. And.
 本開示の一態様に係るプログラムは、コンピュータを、被写体から放射される赤外線を利用することで、前記被写体の温度分布を画像化した熱画像と、前記被写体から反射される可視光を利用することで、前記被写体を画像化した可視画像とを含む学習用データを用いて、前記熱画像から前記可視画像への推論を学習することで生成された、前記熱画像から前記可視画像を推論するための学習済モデルを記憶する記憶部、対象となる被写体である対象被写体の熱画像である対象熱画像を示す対象熱画像データを取得するデータ取得部、前記学習済モデルを用いて、前記対象熱画像から、前記対象被写体の可視画像である対象可視画像を推論する推論部、及び、前記対象可視画像から、前記対象被写体の姿勢を推定する姿勢推定部、として機能させることを特徴とする。 The program according to one aspect of the present disclosure uses a computer to use a thermal image that images the temperature distribution of the subject and visible light reflected from the subject by using infrared rays emitted from the subject. In order to infer the visible image from the thermal image generated by learning the inference from the thermal image to the visible image using the learning data including the visible image in which the subject is imaged. A storage unit that stores the trained model, a data acquisition unit that acquires target thermal image data indicating a target thermal image that is a thermal image of the target subject that is the target subject, and the target heat using the trained model. It is characterized by functioning as an inference unit that infers a target visible image that is a visible image of the target subject from an image, and a posture estimation unit that estimates the posture of the target subject from the target visible image.
 A program according to one aspect of the present disclosure causes a computer to function as: a storage unit that stores a trained model for inferring a posture from a thermal image, the trained model being generated by learning the inference from the thermal image to a visible image and the inference from the visible image to the posture, using learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, a visible image, which images the subject by using visible light reflected from the subject, and posture information indicating the posture of the subject; a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; and an inference unit that uses the trained model to infer the posture of the target subject from the target thermal image.
 A program according to one aspect of the present disclosure causes a computer to function as: a storage unit that stores a trained model for inferring a visible image from a combination of a thermal image and a contour image, the trained model being generated by learning the inference from the combination of the thermal image and the contour image to the visible image, using learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject, as well as contour image data indicating a contour image showing the contour of the subject extracted from the thermal image; a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; a contour extraction unit that extracts, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject; an inference unit that uses the trained model to infer, from the combination of the target thermal image and the target contour image, a target visible image, which is a visible image of the target subject; and a posture estimation unit that estimates the posture of the target subject from the target visible image.
 A learning method according to one aspect of the present disclosure includes: acquiring learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject; and generating a trained model for inferring the visible image from the thermal image by learning the inference from the thermal image to the visible image using the learning data.
 A learning method according to one aspect of the present disclosure includes: acquiring learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, a visible image, which images the subject by using visible light reflected from the subject, and posture information indicating the posture of the subject; and generating a trained model for inferring the posture from the thermal image by learning, using the learning data, the inference from the thermal image to the visible image and the inference from the visible image to the posture.
 A learning method according to one aspect of the present disclosure includes: acquiring learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject; extracting, from the thermal image, a contour image showing the contour of the subject; and generating a trained model for inferring the visible image from the combination of the thermal image and the contour image by learning the inference from the combination of the thermal image and the contour image to the visible image.
 A utilization method according to one aspect of the present disclosure includes: acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject; inferring, from the target thermal image, a target visible image, which is a visible image of the target subject, by using a trained model for inferring a visible image from a thermal image, the trained model being generated by learning the inference from the thermal image to the visible image using learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject; and estimating the posture of the target subject from the target visible image.
 A utilization method according to one aspect of the present disclosure includes: acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject; and inferring the posture of the target subject from the target thermal image by using a trained model for inferring a posture from a thermal image, the trained model being generated by learning the inference from the thermal image to a visible image and the inference from the visible image to the posture, using learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, a visible image, which images the subject by using visible light reflected from the subject, and posture information indicating the posture of the subject.
 A utilization method according to one aspect of the present disclosure includes: acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject; extracting, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject; inferring, from the combination of the target thermal image and the target contour image, a target visible image, which is a visible image of the target subject, by using a trained model for inferring a visible image from a combination of a thermal image and a contour image, the trained model being generated by learning the inference from the combination of the thermal image and the contour image to the visible image, using learning data including a thermal image, which images the temperature distribution of a subject by using infrared rays emitted from the subject, and a visible image, which images the subject by using visible light reflected from the subject, as well as contour image data indicating a contour image showing the contour of the subject extracted from the thermal image; and estimating the posture of the target subject from the target visible image.
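 Several of the aspects above extract a contour image showing the contour of the subject from the thermal image. The disclosure does not specify a particular extraction method; as one hedged sketch (the gradient-magnitude operator, the toy image, and the use of NumPy are assumptions for illustration only), a contour image can be derived from a thermal image as follows:

```python
import numpy as np

def extract_contour(thermal_image):
    """Return a contour image: the per-pixel gradient magnitude of the
    thermal image, which is large at the edges of the subject's
    temperature distribution and small elsewhere."""
    gy, gx = np.gradient(thermal_image.astype(float))
    return np.hypot(gx, gy)

# A toy 4x4 thermal image: a warm 2x2 "subject" on a cold background.
thermal = np.zeros((4, 4))
thermal[1:3, 1:3] = 1.0
contour = extract_contour(thermal)

# The contour response is nonzero only around the warm region's boundary.
print(contour.round(2))
```

The same function could be applied both when preparing the contour images used during learning and when extracting the target contour image at inference time.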
 According to one or more aspects of the present disclosure, the posture of a subject in a thermal image can be estimated with high accuracy.
FIG. 1 is a block diagram schematically showing the configuration of a posture estimation system according to the first and second embodiments.
FIG. 2 is a block diagram schematically showing the configuration of a learning device according to the first and second embodiments.
FIG. 3 is a schematic diagram showing an example of a three-layer neural network.
FIG. 4 is a schematic diagram showing an example of the structure of a trained model for image conversion processing that converts a thermal image into a visible image in the first embodiment.
FIG. 5 is a block diagram schematically showing the configuration of a computer.
FIG. 6 is a flowchart showing a process in which the learning device learns.
FIG. 7 is a block diagram schematically showing the configuration of a posture estimation device according to the first embodiment.
FIG. 8 is a flowchart showing a process in which the posture estimation device infers a visible image corresponding to a thermal image and estimates a posture from the visible image.
FIG. 9 is a block diagram schematically showing the configuration of a posture estimation device according to the second embodiment.
FIG. 10 is a block diagram schematically showing the configuration of a learning device according to the third embodiment.
FIG. 11 is a schematic diagram showing an example of the structure of a trained model for image conversion processing that converts a thermal image and a contour image into a visible image in the third embodiment.
FIG. 12 is a block diagram schematically showing the configuration of a posture estimation device according to the third embodiment.
Embodiment 1.
 FIG. 1 is a block diagram schematically showing the configuration of a posture estimation system 100 according to the first embodiment.
 The posture estimation system 100 includes a learning device 110 that functions as a model generation device, and a posture estimation device 130 that functions as a utilization device. The processing method performed by the posture estimation device 130 is a utilization method.
 In the posture estimation system 100, the posture estimation device 130 estimates a posture by using a trained model trained by the learning device 110.
 FIG. 2 is a block diagram schematically showing the configuration of the learning device 110.
 The learning device 110 includes a learning-side input unit 111, a learning-side data acquisition unit 112, a model generation unit 113, a learning-side trained model storage unit 114, and a learning-side communication unit 115.
 The learning-side input unit 111 is an input unit that accepts input of learning data. The input learning data is given to the learning-side data acquisition unit 112.
 Here, the learning data is teacher data indicating combinations of a thermal image and, as the correct answer to be inferred from that thermal image, a visible image.
 The thermal image is acquired by imaging the temperature distribution of the subject by using infrared rays emitted from the subject.
 The visible image is acquired by imaging the subject by using visible light reflected from the subject. In the visible image, the appearance of the subject is imaged.
 The learning-side data acquisition unit 112 is a data acquisition unit that acquires the learning data via the learning-side input unit 111. The acquired learning data is given to the model generation unit 113.
 The model generation unit 113 learns the visible image corresponding to the thermal image on the basis of the learning data given from the learning-side data acquisition unit 112. In other words, the model generation unit 113 generates a trained model for inferring the optimum visible image corresponding to the thermal image by learning the combinations of thermal images and visible images indicated by the learning data. Specifically, the model generation unit 113 generates a trained model for inferring a visible image from a thermal image by learning the inference from the thermal image to the visible image using the learning data.
 The model generation unit 113 then stores the generated trained model in the learning-side trained model storage unit 114 as a learning-side trained model.
 As the learning algorithm used by the model generation unit 113, a known algorithm such as supervised learning, unsupervised learning, or reinforcement learning can be used. As an example, a case where a neural network is applied will be described here.
 In the case of supervised learning, the thermal image and the visible image indicated by the learning data need to be a paired set of data capturing the same subject. In the case of unsupervised learning, the thermal image and the visible image do not need to capture the same subject.
 The model generation unit 113 learns the visible image corresponding to the thermal image by so-called supervised learning, for example, in accordance with a neural network model.
 Here, supervised learning is a technique in which sets of input and result (label) data are given to a learning device as learning data, the features present in the learning data are learned, and the result is inferred from the input.
 A neural network is composed of an input layer consisting of a plurality of neurons, an intermediate layer (hidden layer) consisting of a plurality of neurons, and an output layer consisting of a plurality of neurons. There may be one intermediate layer, or two or more.
 FIG. 3 is a schematic diagram showing an example of a three-layer neural network.
 As shown in FIG. 3, in a three-layer neural network, when a plurality of input values are input to the input layer X1 to X3, the input values are multiplied by first weights w11 to w16 (hereinafter also referred to as the first weights W1). The calculated values, which are the values obtained by multiplying the input values by the first weights w11 to w16, are input to the intermediate layer Y1, Y2. The calculated values are multiplied by second weights w21 to w26 (hereinafter also referred to as the second weights W2), and the output values, which are the values obtained by multiplying the calculated values by the second weights w21 to w26, are output from the output layer Z1 to Z3. These output values vary depending on the values of the first weights W1 and the second weights W2.
 In the present embodiment, the neural network learns a trained model for inferring the optimum visible image corresponding to a thermal image by so-called supervised learning, in accordance with learning data created on the basis of the combinations of thermal images and visible images indicated by the learning data acquired by the learning-side data acquisition unit 112.
 That is, the neural network learns the trained model by inputting the thermal image to the input layer and adjusting the first weights W1 and the second weights W2 so that the result output from the output layer approaches the visible image serving as the correct answer.
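 The two-stage weighting described above can be sketched as a pair of matrix multiplications. This is an illustration only; the concrete weight values, the array shapes, and the use of NumPy are assumptions and not part of the disclosure:

```python
import numpy as np

# First weights W1 (w11..w16): map the 3 input neurons X1-X3
# to the 2 intermediate neurons Y1, Y2.
W1 = np.array([[0.1, 0.2, 0.3],
               [0.4, 0.5, 0.6]])          # shape (2, 3)
# Second weights W2 (w21..w26): map the 2 intermediate neurons
# to the 3 output neurons Z1-Z3.
W2 = np.array([[0.1, 0.2],
               [0.3, 0.4],
               [0.5, 0.6]])               # shape (3, 2)

x = np.array([1.0, 2.0, 3.0])             # input values at X1-X3
y = W1 @ x                                # calculated values at Y1, Y2
z = W2 @ y                                # output values at Z1-Z3
print(y.shape, z.shape)
```

Training then amounts to adjusting W1 and W2 so that z approaches the correct answer for each input.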
 FIG. 4 is a schematic diagram showing an example of the structure of a trained model for the image conversion processing that converts a thermal image into a visible image.
 In the trained model shown in FIG. 4, the layers of the encoder portion and the layers of the decoder portion form a symmetric structure, and the model has a U-Net structure in which they are connected by skip connections.
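 The skip-connection idea of the U-Net structure can be illustrated in a deliberately reduced one-dimensional form: an encoder that downsamples, a decoder that upsamples, and a skip connection that concatenates an encoder feature into the decoder path. The pooling, upsampling, and fusion choices below are assumptions for illustration and omit the convolution layers of a real U-Net:

```python
import numpy as np

def downsample(x):
    """Encoder step: halve the resolution by average pooling."""
    return x.reshape(-1, 2).mean(axis=1)

def upsample(x):
    """Decoder step: double the resolution by nearest-neighbour repetition."""
    return np.repeat(x, 2)

def unet_like(x):
    # Encoder path.
    e1 = x                            # full-resolution feature
    e2 = downsample(e1)               # bottleneck at half resolution
    # Decoder path with a skip connection: the encoder feature e1 is
    # concatenated with the upsampled bottleneck, mirroring the
    # symmetric encoder/decoder layers connected in FIG. 4.
    d1 = upsample(e2)
    skip = np.concatenate([d1, e1])   # skip connection, encoder -> decoder
    # Placeholder fusion of the two concatenated halves.
    n = len(e1)
    return (skip[:n] + skip[n:]) / 2.0

thermal = np.array([0.0, 1.0, 2.0, 3.0])
visible_estimate = unet_like(thermal)
print(visible_estimate)
```

The skip connection lets fine detail from the encoder bypass the bottleneck, which is why the output still tracks the input sample by sample.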
 Returning to FIG. 2, the learning-side trained model storage unit 114 stores the learning-side trained model, which is the trained model given from the model generation unit 113.
 The learning-side communication unit 115 sends the learning-side trained model stored in the learning-side trained model storage unit 114 to the posture estimation device 130.
 The learning device 110 described above can be realized by a computer 160 as shown in FIG. 5.
 FIG. 5 is a block diagram schematically showing the configuration of the computer 160.
 The computer 160 includes a communication device 161, an auxiliary storage device 162, a memory 163, and a processor 164.
 The communication device 161 communicates data, for example, via a network.
 The auxiliary storage device 162 stores the data and programs necessary for processing in the computer 160.
 The memory 163 temporarily stores programs and data, and provides a work area for the processor 164.
 The processor 164 reads a program stored in the auxiliary storage device 162 into the memory 163 and executes the program, thereby executing the processing in the computer 160.
 The learning-side input unit 111 and the learning-side communication unit 115 described above can be realized by the communication device 161.
 The learning-side trained model storage unit 114 can be realized by the auxiliary storage device 162.
 The learning-side data acquisition unit 112 and the model generation unit 113 can be realized by the processor 164 executing a program read into the memory 163. Such a program may be provided through a network, or may be recorded on a recording medium and provided. That is, such a program may be provided, for example, as a program product.
 FIG. 6 is a flowchart showing a process in which the learning device 110 learns.
 First, the learning-side data acquisition unit 112 acquires the learning data via the learning-side input unit 111 (S10). Here, it is assumed that the thermal image data, which is the image data of the thermal image, and the visible image data, which is the image data of the visible image, both used as the learning data, are acquired at the same time; however, the first embodiment is not limited to such an example. As long as the thermal image data can be associated with the visible image data used as its correct answer, the two may be acquired at different times. The acquired learning data is given to the model generation unit 113.
 Next, the model generation unit 113 learns the visible image, which is the output corresponding to the thermal image, by so-called supervised learning on the basis of the combinations of thermal images and visible images indicated by the learning data, and generates a trained model (S11).
 Next, the learning-side trained model storage unit 114 stores the generated trained model (S12). The learning-side communication unit 115 then transmits the trained model to the posture estimation device 130.
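 The flow of steps S10 to S12 can be sketched with a deliberately simplified stand-in for the neural network: a single linear layer trained by gradient descent. The data shapes, the learning rate, and the use of NumPy are illustrative assumptions and not part of the disclosure:

```python
import numpy as np

rng = np.random.default_rng(0)

# S10: acquire learning data - paired (thermal, visible) feature vectors
# capturing the same subjects, as supervised learning requires.
true_map = np.array([[2.0, 0.0], [0.0, -1.0]])   # unknown mapping to recover
thermal_data = rng.normal(size=(100, 2))         # stand-in thermal features
visible_data = thermal_data @ true_map.T         # stand-in correct answers

# S11: learn the inference from thermal image to visible image by
# adjusting the weights so the output approaches the correct answer.
W = np.zeros((2, 2))
lr = 0.1
for _ in range(200):
    pred = thermal_data @ W.T
    grad = (pred - visible_data).T @ thermal_data / len(thermal_data)
    W -= lr * grad

# S12: store the generated trained model (here, just the weight matrix).
trained_model = {"weights": W}
print(np.round(W, 3))
```

After training, W approximates the underlying thermal-to-visible mapping, which is the role the U-Net weights play in the actual system.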
 FIG. 7 is a block diagram schematically showing the configuration of the posture estimation device 130.
 The posture estimation device 130 includes an inference device 140 and a posture estimation execution device 150 that functions as a posture estimation unit.
 The inference device 140 infers a visible image from a thermal image by using the trained model given from the learning device 110 as an inference-side trained model.
 The inference device 140 includes an inference-side communication unit 141, an inference-side trained model storage unit 142, an inference-side input unit 143, an inference-side data acquisition unit 144, and an inference unit 145.
 The inference-side communication unit 141 receives the trained model from the learning device 110, and stores the trained model in the inference-side trained model storage unit 142 as the inference-side trained model.
 The inference-side trained model storage unit 142 is a storage unit that stores the inference-side trained model.
 The inference-side input unit 143 is an input unit that accepts input of thermal image data indicating a thermal image of a subject. The thermal image data input here is also referred to as target thermal image data. The thermal image indicated by the target thermal image data is also referred to as a target thermal image, and the subject included in the target thermal image, whose posture is to be estimated, is also referred to as a target subject.
 The inference-side data acquisition unit 144 is a data acquisition unit that acquires the target thermal image data via the inference-side input unit 143. The acquired target thermal image data is given to the inference unit 145.
 The inference unit 145 infers a visible image of the target subject from the thermal image indicated by the target thermal image data, by using the inference-side trained model stored in the inference-side trained model storage unit 142. In other words, by inputting the thermal image indicated by the target thermal image data into the inference-side trained model, the inference unit 145 can obtain the visible image corresponding to that thermal image, inferred from it. The inference unit 145 then generates visible image data indicating the inferred visible image, and gives the visible image data to the posture estimation execution device 150. The visible image data generated here is also referred to as target visible image data. The visible image indicated by the target visible image data, in other words, the inferred visible image, is also referred to as a target visible image.
 The posture estimation execution device 150 estimates, from the visible image indicated by the target visible image data, the posture of the subject present in that visible image. One method of estimating the posture is to learn in advance a large number of correspondences between visible images and human postures (for example, the positional relationships of body parts), and, when a visible image is input, to determine the human posture corresponding to that visible image on the basis of the learning result.
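 The lookup-style approach just described can be sketched as a nearest-neighbour search over previously learned (visible-image feature, posture) correspondences. The feature vectors, the posture labels, and the use of NumPy are illustrative assumptions, not the disclosed implementation:

```python
import numpy as np

# Correspondences learned in advance: visible-image feature vectors
# paired with posture labels (e.g. derived from body-part positions).
known_features = np.array([[0.0, 0.0],
                           [1.0, 0.0],
                           [0.0, 1.0]])
known_postures = ["standing", "sitting", "lying"]

def estimate_posture(visible_feature):
    """Return the posture whose learned visible-image feature is closest
    to the input feature."""
    dists = np.linalg.norm(known_features - visible_feature, axis=1)
    return known_postures[int(np.argmin(dists))]

print(estimate_posture(np.array([0.9, 0.1])))
```

A production pose estimator for visible images would of course use a far richer model, but the input/output contract (visible image in, posture out) is the same.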
 The posture estimation device 130 described above can also be realized by a computer 160 as shown in FIG. 5.
 For example, the inference-side communication unit 141 and the inference-side input unit 143 can be realized by the communication device 161.
 The inference-side trained model storage unit 142 can be realized by the auxiliary storage device 162.
 The inference-side data acquisition unit 144 and the inference unit 145 can be realized by the processor 164 executing a program read into the memory 163. Such a program may be provided through a network, or may be recorded on a recording medium and provided. That is, such a program may be provided, for example, as a program product.
 FIG. 8 is a flowchart showing a process in which the posture estimation device 130 infers a visible image corresponding to a thermal image and estimates a posture from the visible image.
 First, the inference-side data acquisition unit 144 acquires target thermal image data indicating a thermal image via the inference-side input unit 143 (S20). The acquired target thermal image data is given to the inference unit 145.
 Next, the inference unit 145 inputs the thermal image indicated by the target thermal image data into the inference-side trained model stored in the inference-side trained model storage unit 142, and obtains the visible image corresponding to that thermal image (S21).
 Next, the inference unit 145 generates target visible image data indicating the visible image corresponding to the thermal image obtained by the inference-side trained model, and gives the target visible image data to the posture estimation execution device 150 (S22).
 Next, the posture estimation execution device 150 estimates the posture of the subject in the visible image indicated by the target visible image data (S23). On the basis of the posture estimated in this way, it is possible, for example, to detect abnormal behavior of the subject appearing in the thermal image.
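 The flow of steps S20 to S23 can be sketched end to end as follows. The linear stand-in for the inference-side trained model and the toy posture rule are assumptions for illustration only:

```python
import numpy as np

# Inference-side trained model (stand-in): a weight matrix assumed to
# have been learned to map thermal features to visible features.
inference_side_model = np.array([[2.0, 0.0], [0.0, -1.0]])

def infer_visible(thermal_feature):
    """S21: infer the visible image (feature) from the thermal image."""
    return inference_side_model @ thermal_feature

def estimate_posture(visible_feature):
    """S23: placeholder posture estimator for visible images."""
    return "upright" if visible_feature[0] >= 0 else "fallen"

# S20: acquire the target thermal image data.
target_thermal = np.array([0.5, 1.0])
# S21-S22: infer the target visible image and hand it to the
# posture estimation execution stage.
target_visible = infer_visible(target_thermal)
# S23: estimate the posture of the target subject.
posture = estimate_posture(target_visible)
print(target_visible, posture)   # prints the inferred feature and posture
```

The key point of the embodiment survives the simplification: the posture estimator only ever sees visible-image data, so an existing estimator trained on visible images can be reused unchanged.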
 As described above, according to the posture estimation system 100 of the first embodiment, a thermal image output from a thermal image sensor or the like is converted into a visible image, and the posture of the subject in the thermal image can be estimated by using the posture estimation execution device 150, which is a trained posture estimator for visible images. Therefore, posture estimation becomes possible using an existing trained posture estimator for visible images.
 In addition, when a posture estimator for thermal images is used, the relationship between thermal images and postures must be learned, which requires annotating thermal images with postures. Manual annotation of thermal images cannot be performed with sufficient accuracy because of the insufficient resolution of thermal images. In the first embodiment, since there is no need to use a posture estimator for thermal images, these problems can be avoided.
Although the first embodiment has been described for the case where supervised learning is applied as the learning algorithm used by the model generation unit 113, the first embodiment is not limited to this example. For instance, reinforcement learning, unsupervised learning, or semi-supervised learning may be used as the learning algorithm instead of supervised learning.
The model generation unit 113 may also learn the visible images corresponding to thermal images according to training data created for a plurality of posture estimation devices including the posture estimation device 130. The model generation unit 113 may acquire training data from a plurality of posture estimation devices used in the same area, or it may learn the visible images corresponding to thermal images using training data collected from a plurality of posture estimation devices operating independently in different areas.
The model generation unit 113 can also, partway through, add a posture estimation device from which training data is collected to the targets, or remove one from the targets.
Furthermore, the model generation unit 113 may apply a trained model that has learned the visible images corresponding to thermal images for one posture estimation device to a different posture estimation device, relearn the visible images corresponding to thermal images for that other posture estimation device, and thereby update the trained model.
Deep learning, which learns to extract the features themselves, can also be used as the learning algorithm of the model generation unit 113. The model generation unit 113 may also perform machine learning according to other known methods, such as genetic programming, functional logic programming, or support vector machines.
The learning device 110 and the inference device 140 are used to learn the visible images corresponding to the thermal images of the posture estimation system 100, but they may, for example, be connected to the posture estimation execution device 150 via a network.
The learning device 110, the inference device 140, or the posture estimation execution device 150 may also reside on a cloud server.
In the posture estimation system 100 of the first embodiment described above, the learning device 110 and the posture estimation device 130 are separate devices, but the learning device 110 may, for example, be provided inside the posture estimation device 130. In that case, the learning-side communication unit 115 and the inference-side communication unit 141 become unnecessary, and the learning-side trained model storage unit 114 and the inference-side trained model storage unit 142 can be integrated into a single trained model storage unit.
In the posture estimation system 100 of the first embodiment, the posture estimation device 130 infers the visible image corresponding to a thermal image using the trained model generated by the learning device 110, but the embodiment is not limited to this example. For instance, the posture estimation device 130 may acquire a trained model from outside, such as from another system, and infer the visible image corresponding to a thermal image based on that trained model.
Embodiment 2.
As shown in FIG. 1, the posture estimation system 200 according to the second embodiment includes a learning device 210 and a posture estimation device 230.
As shown in FIG. 2, the learning device 210 in the second embodiment includes a learning-side input unit 111, a learning-side data acquisition unit 212, a model generation unit 213, a learning-side trained model storage unit 114, and a learning-side communication unit 115.
The learning-side input unit 111, learning-side trained model storage unit 114, and learning-side communication unit 115 of the learning device 210 in the second embodiment are the same as the learning-side input unit 111, learning-side trained model storage unit 114, and learning-side communication unit 115 of the learning device 110 in the first embodiment.
The learning-side data acquisition unit 212 acquires training data via the learning-side input unit 111. The training data acquired in the second embodiment includes thermal image data representing a thermal image, visible image data representing the visible image that is the correct answer corresponding to that thermal image, and posture information indicating the posture of the subject, which is the correct answer corresponding to that visible image. The acquired training data is supplied to the model generation unit 213.
The model generation unit 213 learns the visible image corresponding to a thermal image and the posture corresponding to that visible image, based on the training data supplied from the learning-side data acquisition unit 212. In other words, the model generation unit 213 learns the combinations of thermal image and visible image, and of visible image and posture, indicated by the training data, thereby generating a trained model for inferring the optimal posture corresponding to a thermal image. Specifically, the model generation unit 213 uses the training data to learn the inference from thermal image to visible image and the inference from visible image to posture, thereby generating a trained model for inferring the posture from a thermal image.
The model generation unit 213 then stores the generated trained model in the learning-side trained model storage unit 114 as the learning-side trained model.
FIG. 9 is a block diagram schematically showing the configuration of the posture estimation device 230 according to the second embodiment.
The posture estimation device 230 includes an inference-side communication unit 141, an inference-side trained model storage unit 142, an inference-side input unit 143, an inference-side data acquisition unit 144, and an inference unit 245.
The inference-side communication unit 141, inference-side trained model storage unit 142, inference-side input unit 143, and inference-side data acquisition unit 144 of the posture estimation device 230 in the second embodiment are the same as the inference-side communication unit 141, inference-side trained model storage unit 142, inference-side input unit 143, and inference-side data acquisition unit 144 of the posture estimation device 130 in the first embodiment.
The inference unit 245 uses the inference-side trained model stored in the inference-side trained model storage unit 142 to infer a visible image from the thermal image represented by the target thermal image data, and then to infer the posture from that visible image. In other words, the inference unit 245 inputs the thermal image represented by the target thermal image data into the inference-side trained model, and thereby estimates the posture, inferred from that thermal image, of the subject present in it.
The posture estimation device 230 described above can also be realized by a computer 160 such as the one shown in FIG. 5.
For example, the inference-side communication unit 141 and the inference-side input unit 143 can be realized by the communication device 161.
The inference-side trained model storage unit 142 can be realized by the auxiliary storage device 162.
The inference-side data acquisition unit 144 and the inference unit 245 can be realized by the processor 164 executing a program read into the memory 163. Such a program may be provided through a network, or may be recorded on a recording medium and provided. That is, such a program may be provided, for example, as a program product.
As described above, according to the posture estimation system 200 of the second embodiment, the posture of a subject can be estimated from a thermal image output from a thermal image sensor or the like. By inputting visible images and postures as teacher data during learning, the work of annotating thermal images with posture information can be avoided.
Furthermore, unlike the first embodiment, no visible image is generated or output in the utilization phase, which keeps the network small and reduces the amount of computation.
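The contrast with embodiment 1 can be sketched as follows. Again this is only an illustrative stub, with a hypothetical function name: in embodiment 2 the trained model maps the thermal image directly to a posture, so the intermediate visible image never exists in the utilization phase.

```python
# Illustrative contrast with embodiment 1: in embodiment 2 a single trained
# model maps the thermal image directly to a posture. The stub below only
# shows the data flow -- one forward pass, no visible image generated or
# output -- and is not the patent's actual learned network.

def thermal_to_posture(thermal_image):
    """Stub for the embodiment-2 inference-side trained model."""
    # Directly locate the warmest pixel as a single "keypoint"; no visible
    # image is materialized along the way.
    best = max((v, (y, x))
               for y, row in enumerate(thermal_image)
               for x, v in enumerate(row))
    return {"keypoints": [best[1]]}

thermal_image = [[0.1, 0.2, 0.1],
                 [0.2, 0.9, 0.2],
                 [0.1, 0.2, 0.1]]

posture = thermal_to_posture(thermal_image)  # one pass, no visible image
```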
Embodiment 3.
As shown in FIG. 1, the posture estimation system 300 according to the third embodiment includes a learning device 310 and a posture estimation device 330.
FIG. 10 is a block diagram schematically showing the configuration of the learning device 310.
The learning device 310 includes a learning-side input unit 111, a learning-side data acquisition unit 312, a model generation unit 313, a learning-side trained model storage unit 114, a learning-side communication unit 115, and a learning-side contour extraction unit 316.
The learning-side input unit 111, learning-side trained model storage unit 114, and learning-side communication unit 115 of the learning device 310 in the third embodiment are the same as the learning-side input unit 111, learning-side trained model storage unit 114, and learning-side communication unit 115 of the learning device 110 in the first embodiment.
The learning-side data acquisition unit 312 acquires training data via the learning-side input unit 111. The acquired training data is supplied to the model generation unit 313.
The learning-side data acquisition unit 312 also supplies the thermal image data representing the thermal image included in the acquired training data to the learning-side contour extraction unit 316 as learning-side thermal image data.
The learning-side contour extraction unit 316 is a contour extraction unit that extracts, from the thermal image represented by the learning-side thermal image data, a contour image showing the contour of the subject. Possible extraction methods include edge detection processing such as the Canny method or the Sobel method, and a combination of binarization processing and edge detection. Edge detection processing detects the edges of the subject. In the combination of binarization and edge detection, the edge detection processing is performed after binarization processing has been applied to the thermal image. The learning-side contour extraction unit 316 then supplies contour image data representing the extracted contour image to the model generation unit 313 as learning-side contour image data.
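The two extraction options named above can be sketched in a few lines of pure Python. The 3x3 Sobel kernels are the standard ones; the gradient threshold, the binarization threshold, and the tiny 5x5 "thermal image" are illustrative assumptions, not values from this publication, and a real implementation would typically use a library routine (e.g., an OpenCV Canny or Sobel call) instead.

```python
# Hedged sketch of the two contour-extraction options described above:
# (a) Sobel edge detection applied directly to the thermal image, and
# (b) binarization of the thermal image followed by edge detection.

def sobel_edges(img, thresh=2.0):
    """Return a binary edge map from the Sobel gradient magnitude."""
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # vertical gradient kernel
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            edges[y][x] = 1 if (gx * gx + gy * gy) ** 0.5 >= thresh else 0
    return edges

def binarize(img, thresh):
    """Binarization step: separate warm subject pixels from the background."""
    return [[1.0 if v >= thresh else 0.0 for v in row] for row in img]

# A 5x5 toy thermal image: a warm 3x3 subject on a cool background.
thermal = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]

contour_direct = sobel_edges(thermal)                    # (a) edges only
contour_binarized = sobel_edges(binarize(thermal, 0.5))  # (b) binarize first
```

Either way, the result is a binary image that responds at the subject's boundary and stays zero in its uniformly warm interior, which is exactly the contour information the model generation unit 313 consumes.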
The model generation unit 313 learns the visible image corresponding to a thermal image, based on the training data supplied from the learning-side data acquisition unit 312 and the learning-side contour image data supplied from the learning-side contour extraction unit 316. In other words, the model generation unit 313 learns the combination of the thermal image indicated by the training data and the contour image indicated by the learning-side contour image data with the visible image indicated by the training data, thereby generating a trained model for inferring the optimal visible image corresponding to the thermal image and its contour image. Specifically, the model generation unit 313 learns the inference from the combination of thermal image and contour image to the visible image, thereby generating a trained model for inferring a visible image from the combination of a thermal image and a contour image.
The model generation unit 313 then stores the generated trained model in the learning-side trained model storage unit 114 as the learning-side trained model.
FIG. 11 is a schematic diagram showing an example of the structure of a trained model for the image conversion process that converts a thermal image and a contour image into a visible image in the third embodiment.
The trained model shown in FIG. 11 has a U-Net structure in which the layers of the decoder portion and the layers of the encoder portion form a symmetrical structure and are connected by skip connections. The decoder portion has two parallel paths: one for decoding the thermal image and one for decoding the contour image.
Thus, in the trained model shown in FIG. 11, the decoder portion consists of two parallel paths, one decoding the thermal image and the other decoding the contour image. The two vectors of information decoded at the central layer of the model are concatenated, and the concatenated information is input to the encoder portion.
With this structure, in the third embodiment, the visible image converted from the thermal image contains more edge components, and the accuracy of posture estimation can be improved.
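The two-path data flow described above can be sketched as follows. This is a shape-level sketch only: the "layers" are placeholder downsampling and upsampling functions invented here for illustration, not the learned convolutions of the FIG. 11 model, and the terminology (two paths feeding a concatenation at the central layer) follows the description above.

```python
# Minimal data-flow sketch of the two-path structure: the thermal image and
# the contour image are processed by two parallel paths, their feature
# vectors are concatenated at the central layer, and the concatenated vector
# is carried through to produce an image-sized output.

def downsample(vec):
    """Placeholder path layer: halve the resolution by pairwise averaging."""
    return [(vec[i] + vec[i + 1]) / 2 for i in range(0, len(vec) - 1, 2)]

def upsample(vec, length):
    """Placeholder output layer: stretch the vector back to `length`."""
    return [vec[int(i * len(vec) / length)] for i in range(length)]

def two_path_forward(thermal_vec, contour_vec):
    # Each path independently reduces its input to a feature vector.
    t_feat = downsample(downsample(thermal_vec))
    c_feat = downsample(downsample(contour_vec))
    # Central layer: concatenate the two decoded feature vectors.
    merged = t_feat + c_feat
    # The remaining layers reconstruct an output at the input resolution.
    return upsample(merged, len(thermal_vec))

thermal_vec = [0.1, 0.2, 0.9, 0.9, 0.8, 0.2, 0.1, 0.1]
contour_vec = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
out = two_path_forward(thermal_vec, contour_vec)
```

The design point the sketch preserves is that the contour information enters as a separate input and is only merged at the central concatenation, so edge cues are not diluted by the blurry thermal signal before that point.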
The learning device 310 described above can also be realized by a computer 160 such as the one shown in FIG. 5.
For example, the learning-side data acquisition unit 312, the model generation unit 313, and the learning-side contour extraction unit 316 can also be realized by the processor 164 executing a program read into the memory 163. Such a program may be provided through a network, or may be recorded on a recording medium and provided. That is, such a program may be provided, for example, as a program product.
FIG. 12 is a block diagram schematically showing the configuration of the posture estimation device 330.
The posture estimation device 330 includes an inference device 340 and a posture estimation execution device 150.
The posture estimation execution device 150 of the posture estimation device 330 in the third embodiment is the same as the posture estimation execution device 150 in the first embodiment.
The inference device 340 includes an inference-side communication unit 141, an inference-side trained model storage unit 142, an inference-side input unit 143, an inference-side data acquisition unit 344, an inference unit 345, and an inference-side contour extraction unit 346.
The inference-side communication unit 141, inference-side trained model storage unit 142, and inference-side input unit 143 of the inference device 340 in the third embodiment are the same as the inference-side communication unit 141, inference-side trained model storage unit 142, and inference-side input unit 143 of the inference device 140 in the first embodiment.
The inference-side data acquisition unit 344 acquires the target thermal image data via the inference-side input unit 143. The inference-side data acquisition unit 344 then supplies the acquired target thermal image data to the inference unit 345 and the inference-side contour extraction unit 346.
The inference-side contour extraction unit 346 is a contour extraction unit that extracts a contour image from the thermal image represented by the target thermal image data. The extraction method is the same as that of the learning-side contour extraction unit 316. The inference-side contour extraction unit 346 then supplies contour image data representing the extracted contour image to the inference unit 345 as inference-side contour image data. The contour image extracted here is also called the target contour image, and the inference-side contour image data is also called the target contour image data.
The inference unit 345 uses the inference-side trained model stored in the inference-side trained model storage unit 142 to infer a visible image from the combination of the thermal image represented by the target thermal image data and the contour image represented by the inference-side contour image data. In other words, by inputting the thermal image represented by the target thermal image data and the contour image represented by the inference-side contour image data into the inference-side trained model, the inference unit 345 can obtain the visible image corresponding to that thermal image, as inferred from it. The inference unit 345 then generates visible image data representing the inferred visible image and supplies that visible image data to the posture estimation execution device 150. The visible image data generated here is also called the target visible image data, and the visible image represented by the target visible image data is also called the target visible image.
The posture estimation device 330 described above can also be realized by a computer 160 such as the one shown in FIG. 5.
For example, the inference-side data acquisition unit 344, the inference unit 345, and the inference-side contour extraction unit 346 can be realized by the processor 164 executing a program read into the memory 163. Such a program may be provided through a network, or may be recorded on a recording medium and provided. That is, such a program may be provided, for example, as a program product.
In general, a thermal image carries ambiguous contour information, so a visible image generated with a trained model also has ambiguous contours. Since contour information is important for posture estimation, the accuracy of posture estimation drops for images with ambiguous contours.
In contrast, according to the posture estimation system 300 of the third embodiment, inputting the thermal image and the contour image into the trained model simultaneously makes it possible to generate a visible image whose contours are not ambiguous. This improves the accuracy of posture estimation from the generated visible image, compared with inputting the thermal image alone into the trained model.
100, 200, 300 posture estimation system; 110, 210, 310 learning device; 111 learning-side input unit; 112, 212, 312 learning-side data acquisition unit; 113, 213, 313 model generation unit; 114 learning-side trained model storage unit; 115 learning-side communication unit; 316 learning-side contour extraction unit; 130, 230, 330 posture estimation device; 140, 340 inference device; 141 inference-side communication unit; 142 inference-side trained model storage unit; 143 inference-side input unit; 144, 344 inference-side data acquisition unit; 145, 245, 345 inference unit; 346 inference-side contour extraction unit; 150 posture estimation execution device.

Claims (26)

1.  A learning device comprising:
    a data acquisition unit configured to acquire training data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, and a visible image obtained by imaging the subject using visible light reflected from the subject; and
    a model generation unit configured to generate a trained model for inferring the visible image from the thermal image, by learning the inference from the thermal image to the visible image using the training data.
2.  The learning device according to claim 1, wherein the trained model has a U-Net structure in which the layers of a decoder portion and the layers of an encoder portion form a symmetrical structure and are connected by skip connections.
3.  A learning device comprising:
    a data acquisition unit configured to acquire training data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and posture information indicating the posture of the subject; and
    a model generation unit configured to generate a trained model for inferring the posture from the thermal image, by learning the inference from the thermal image to the visible image and the inference from the visible image to the posture using the training data.
4.  A learning device comprising:
    a data acquisition unit configured to acquire training data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, and a visible image obtained by imaging the subject using visible light reflected from the subject;
    a contour extraction unit configured to extract, from the thermal image, a contour image showing the contour of the subject; and
    a model generation unit configured to generate a trained model for inferring the visible image from the combination of the thermal image and the contour image, by learning the inference from the combination of the thermal image and the contour image to the visible image.
5.  The learning device according to claim 4, wherein
    the trained model has a U-Net structure in which the layers of a decoder portion and the layers of an encoder portion form a symmetrical structure and are connected by skip connections,
    the decoder portion has two parallel paths, and
    the two paths are a path for decoding the thermal image and a path for decoding the contour image.
6.  The learning device according to claim 4 or 5, wherein the contour extraction unit extracts the contour image from the thermal image by edge detection processing that detects edges of the subject.
7.  The learning device according to claim 4 or 5, wherein the contour extraction unit extracts the contour image from the thermal image by performing binarization processing on the thermal image and then performing edge detection processing.
8.  A utilization device comprising:
    a storage unit that stores a trained model for inferring a visible image from a thermal image, the trained model having been generated by learning the inference from the thermal image to the visible image using training data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, and a visible image obtained by imaging the subject using visible light reflected from the subject;
    a data acquisition unit configured to acquire target thermal image data representing a target thermal image, which is a thermal image of a target subject;
    an inference unit configured to infer, from the target thermal image, a target visible image, which is a visible image of the target subject, using the trained model; and
    a posture estimation unit configured to estimate the posture of the target subject from the target visible image.
9.  The utilization device according to claim 8, wherein the trained model has a U-Net structure in which the layers of a decoder portion and the layers of an encoder portion form a symmetrical structure and are connected by skip connections.
10.  A utilization device comprising:
    a storage unit that stores a trained model for inferring a posture from a thermal image, the trained model having been generated by learning the inference from the thermal image to the visible image and the inference from the visible image to the posture, using training data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject, a visible image obtained by imaging the subject using visible light reflected from the subject, and posture information indicating the posture of the subject;
    a data acquisition unit configured to acquire target thermal image data representing a target thermal image, which is a thermal image of a target subject; and
    an inference unit configured to infer the posture of the target subject from the target thermal image using the trained model.
11.  A utilization device comprising:
    a storage unit that stores a trained model for inferring a visible image from the combination of a thermal image and a contour image, the trained model having been generated by learning the inference from the combination of the thermal image and the contour image to the visible image, using training data including a thermal image obtained by imaging the temperature distribution of a subject using infrared rays emitted from the subject and a visible image obtained by imaging the subject using visible light reflected from the subject, together with contour image data representing a contour image, extracted from the thermal image, showing the contour of the subject;
    a data acquisition unit configured to acquire target thermal image data representing a target thermal image, which is a thermal image of a target subject;
    a contour extraction unit configured to extract, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject;
    an inference unit configured to infer a target visible image, which is a visible image of the target subject, from the combination of the target thermal image and the target contour image, using the trained model; and
    a posture estimation unit configured to estimate the posture of the target subject from the target visible image.
  12.  The utilization device according to claim 11, wherein the trained model has a U-Net structure in which the layers of the decoder portion and the layers of the encoder portion form a symmetrical structure and are connected by skip connections,
      the decoder portion includes two parallel paths, and
      the two paths are a path for decoding the thermal image and a path for decoding the contour image.
  13.  The utilization device according to claim 11 or 12, wherein the contour extraction unit extracts the contour image from the thermal image by edge detection processing.
  14.  The utilization device according to claim 11 or 12, wherein the contour extraction unit extracts the contour image from the thermal image by performing binarization processing on the thermal image and then performing edge detection processing.
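Claims 13 and 14 describe contour extraction only at the level of "binarization, then edge detection". A minimal NumPy sketch of that two-step pipeline follows; the threshold value and the finite-difference edge detector are illustrative assumptions, since the patent does not fix a particular algorithm.

```python
import numpy as np

def extract_contour(thermal, threshold=30.0):
    """Binarize a thermal image, then detect edges on the binary mask
    (the two-step contour extraction of claim 14)."""
    binary = (thermal > threshold).astype(float)   # 1 where warmer than threshold
    # Edge detection via finite differences: a pixel is an edge wherever the
    # mask changes value relative to its upper or left neighbour.
    dy = np.abs(np.diff(binary, axis=0, prepend=binary[:1]))
    dx = np.abs(np.diff(binary, axis=1, prepend=binary[:, :1]))
    return ((dx + dy) > 0).astype(np.uint8)

# A warm 2x2 block on a cool background yields a contour around the block.
img = np.zeros((6, 6))
img[2:4, 2:4] = 36.5
contour = extract_contour(img)
print(int(contour.sum()))
```

Binarizing first, as claim 14 specifies, makes the subsequent edge detection insensitive to gradual temperature gradients inside the warm region: only the boundary between "above threshold" and "below threshold" survives.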
  15.  A program that causes a computer to function as:
      a data acquisition unit that acquires learning data including a thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, and a visible image, obtained by imaging the subject with visible light reflected from the subject; and
      a model generation unit that generates a trained model for inferring the visible image from the thermal image by learning inference from the thermal image to the visible image using the learning data.
  16.  A program that causes a computer to function as:
      a data acquisition unit that acquires learning data including a thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, a visible image, obtained by imaging the subject with visible light reflected from the subject, and posture information indicating the posture of the subject; and
      a model generation unit that generates a trained model for inferring the posture from the thermal image by learning, using the learning data, inference from the thermal image to the visible image and inference from the visible image to the posture.
  17.  A program that causes a computer to function as:
      a data acquisition unit that acquires learning data including a thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, and a visible image, obtained by imaging the subject with visible light reflected from the subject;
      a contour extraction unit that extracts, from the thermal image, a contour image showing the contour of the subject; and
      a model generation unit that generates a trained model for inferring the visible image from the combination of the thermal image and the contour image by learning inference from the combination of the thermal image and the contour image to the visible image.
  18.  A program that causes a computer to function as:
      a storage unit that stores a trained model for inferring a visible image from a thermal image, the trained model having been generated by learning inference from the thermal image to the visible image using learning data including the thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, and the visible image, obtained by imaging the subject with visible light reflected from the subject;
      a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject;
      an inference unit that infers, from the target thermal image, a target visible image, which is a visible image of the target subject, by using the trained model; and
      a posture estimation unit that estimates the posture of the target subject from the target visible image.
  19.  A program that causes a computer to function as:
      a storage unit that stores a trained model for inferring a posture from a thermal image, the trained model having been generated by learning inference from the thermal image to a visible image and inference from the visible image to the posture, using learning data that includes the thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, the visible image, obtained by imaging the subject with visible light reflected from the subject, and posture information indicating the posture of the subject;
      a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject; and
      an inference unit that infers the posture of the target subject from the target thermal image by using the trained model.
  20.  A program that causes a computer to function as:
      a storage unit that stores a trained model for inferring a visible image from a combination of a thermal image and a contour image, the trained model having been generated by learning inference from the combination of the thermal image and the contour image to the visible image, using learning data that includes the thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, and the visible image, obtained by imaging the subject with visible light reflected from the subject, together with contour image data indicating the contour image, which shows the contour of the subject and is extracted from the thermal image;
      a data acquisition unit that acquires target thermal image data indicating a target thermal image, which is a thermal image of a target subject;
      a contour extraction unit that extracts, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject;
      an inference unit that infers, from the combination of the target thermal image and the target contour image, a target visible image, which is a visible image of the target subject, by using the trained model; and
      a posture estimation unit that estimates the posture of the target subject from the target visible image.
  21.  A learning method comprising:
      acquiring learning data including a thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, and a visible image, obtained by imaging the subject with visible light reflected from the subject; and
      generating a trained model for inferring the visible image from the thermal image by learning inference from the thermal image to the visible image using the learning data.
  22.  A learning method comprising:
      acquiring learning data including a thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, a visible image, obtained by imaging the subject with visible light reflected from the subject, and posture information indicating the posture of the subject; and
      generating a trained model for inferring the posture from the thermal image by learning, using the learning data, inference from the thermal image to the visible image and inference from the visible image to the posture.
  23.  A learning method comprising:
      acquiring learning data including a thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, and a visible image, obtained by imaging the subject with visible light reflected from the subject;
      extracting, from the thermal image, a contour image showing the contour of the subject; and
      generating a trained model for inferring the visible image from the combination of the thermal image and the contour image by learning inference from the combination of the thermal image and the contour image to the visible image.
  24.  A utilization method comprising:
      acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject;
      inferring, from the target thermal image, a target visible image, which is a visible image of the target subject, by using a trained model for inferring a visible image from a thermal image, the trained model having been generated by learning inference from the thermal image to the visible image using learning data including the thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, and the visible image, obtained by imaging the subject with visible light reflected from the subject; and
      estimating the posture of the target subject from the target visible image.
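The three steps of this utilization method can be sketched as a plain function pipeline. Every name here is a hypothetical stand-in: `infer_visible` for the trained thermal-to-visible model and `estimate_pose` for a visible-light posture estimator; neither identifier comes from the patent.

```python
# Hypothetical stand-ins for the patent's trained model and pose estimator.
def infer_visible(thermal_image, model):
    """Step 2: infer the target visible image from the target thermal image."""
    return model(thermal_image)

def estimate_pose(visible_image):
    """Step 3: estimate posture from the inferred visible image.
    A real system would run a visible-light pose estimator here;
    this placeholder returns fixed joint coordinates."""
    return {"head": (0, 0), "torso": (0, 1)}

def utilization_method(thermal_image, model):
    """Claim 24's sequence: thermal image in, posture out, with the
    visible image as the intermediate representation."""
    visible = infer_visible(thermal_image, model)
    return estimate_pose(visible)

identity_model = lambda img: img          # dummy model for illustration only
pose = utilization_method([[0.0]], identity_model)
print(sorted(pose))
```

The point of the intermediate visible image is that posture estimation can reuse estimators trained on ordinary camera data, instead of requiring a pose model trained directly on thermal imagery.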
  25.  A utilization method comprising:
      acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject; and
      inferring the posture of the target subject from the target thermal image by using a trained model for inferring a posture from a thermal image, the trained model having been generated by learning inference from the thermal image to a visible image and inference from the visible image to the posture, using learning data that includes the thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, the visible image, obtained by imaging the subject with visible light reflected from the subject, and posture information indicating the posture of the subject.
  26.  A utilization method comprising:
      acquiring target thermal image data indicating a target thermal image, which is a thermal image of a target subject;
      extracting, from the target thermal image, a target contour image, which is a contour image showing the contour of the target subject;
      inferring, from the combination of the target thermal image and the target contour image, a target visible image, which is a visible image of the target subject, by using a trained model for inferring a visible image from a combination of a thermal image and a contour image, the trained model having been generated by learning inference from the combination of the thermal image and the contour image to the visible image, using learning data that includes the thermal image, obtained by imaging the temperature distribution of a subject with infrared rays radiated from the subject, and the visible image, obtained by imaging the subject with visible light reflected from the subject, together with contour image data indicating the contour image, which shows the contour of the subject and is extracted from the thermal image; and
      estimating the posture of the target subject from the target visible image.
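The "combination of the target thermal image and the target contour image" that this claim feeds to the trained model is commonly realized as channel-wise stacking of the two images. A NumPy sketch under that assumption follows; the patent itself does not prescribe how the two images are combined.

```python
import numpy as np

def combine_inputs(thermal, contour):
    """Stack the thermal image and its contour image along a new channel
    axis, forming a single two-channel model input."""
    return np.stack([thermal, contour], axis=-1)

thermal = np.random.rand(4, 4)
contour = (thermal > 0.5).astype(float)   # stand-in for the extracted contour
x = combine_inputs(thermal, contour)
print(x.shape)  # (4, 4, 2): two input channels, one per modality
```

Stacking keeps the two modalities spatially aligned pixel for pixel, so the first convolution of the model sees temperature and contour evidence for the same location together.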
PCT/JP2020/027027 2020-07-10 2020-07-10 Learning device, utilization device, program, learning method, and utilization method WO2022009419A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2020/027027 WO2022009419A1 (en) 2020-07-10 2020-07-10 Learning device, utilization device, program, learning method, and utilization method
JP2020552066A JP6797344B1 (en) 2020-07-10 2020-07-10 Learning device, utilization device, program, learning method and utilization method


Publications (1)

Publication Number Publication Date
WO2022009419A1 (en)

Family

ID=73646788

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/027027 WO2022009419A1 (en) 2020-07-10 2020-07-10 Learning device, utilization device, program, learning method, and utilization method

Country Status (2)

Country Link
JP (1) JP6797344B1 (en)
WO (1) WO2022009419A1 (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008286725A (en) * 2007-05-21 2008-11-27 Mitsubishi Electric Corp Person detector and detection method
JP2011091523A (en) * 2009-10-21 2011-05-06 Victor Co Of Japan Ltd Shape recognition method and shape recognition device
JP2017220779A (en) * 2016-06-07 2017-12-14 オムロン株式会社 Display control device, display control system, display control method, display control program, and recording medium
JP2019003554A (en) * 2017-06-19 2019-01-10 コニカミノルタ株式会社 Image recognition device, image recognition method, and image recognition device-purpose program
JP2019530116A (en) * 2016-09-05 2019-10-17 ケイロン メディカル テクノロジーズ リミテッド Multimodal medical image processing
JP2020030458A (en) * 2018-08-20 2020-02-27 株式会社デンソーアイティーラボラトリ Inference device, learning method, program and learned model


Non-Patent Citations (1)

Title
YAMASHITA, TAKAYOSHI, "Passage; Illustrated Guide to Deep Learning", in "An Illustrated Guide to Deep Learning, revised second edition", Kodansha Ltd., JP, 19 November 2018, ISBN 978-4-06-513331-6, p. 108, XP009534337 *

Cited By (2)

Publication number Priority date Publication date Assignee Title
US20230377192A1 (en) * 2022-05-23 2023-11-23 Dell Products, L.P. System and method for detecting postures of a user of an information handling system (ihs) during extreme lighting conditions
US11836825B1 (en) * 2022-05-23 2023-12-05 Dell Products L.P. System and method for detecting postures of a user of an information handling system (IHS) during extreme lighting conditions

Also Published As

Publication number Publication date
JPWO2022009419A1 (en) 2022-01-13
JP6797344B1 (en) 2020-12-09

Similar Documents

Publication Publication Date Title
WO2020088588A1 (en) Deep learning-based static three-dimensional method for detecting whether face belongs to living body
JP6946831B2 (en) Information processing device and estimation method for estimating the line-of-sight direction of a person, and learning device and learning method
US8406470B2 (en) Object detection in depth images
JP5493108B2 (en) Human body identification method and human body identification device using range image camera
WO2018163555A1 (en) Image processing device, image processing method, and image processing program
KR20180057096A (en) Device and method to perform recognizing and training face expression
JP5886616B2 (en) Object detection apparatus, method for controlling object detection apparatus, and program
JP6603548B2 (en) Improved data comparison method
CN110909561A (en) Eye state detection system and operation method thereof
JP6797344B1 (en) Learning device, utilization device, program, learning method and utilization method
JP6773825B2 (en) Learning device, learning method, learning program, and object recognition device
JP2007304721A (en) Image processing device and image processing method
JP5300795B2 (en) Facial expression amplification device, facial expression recognition device, facial expression amplification method, facial expression recognition method, and program
JP4011426B2 (en) Face detection device, face detection method, and face detection program
JP2021149687A (en) Device, method and program for object recognition
WO2022244536A1 (en) Work recognition device and work recognition method
JP2009009206A (en) Extraction method of outline inside image and image processor therefor
JP7349290B2 (en) Object recognition device, object recognition method, and object recognition program
Khare et al. Machine vision theory and applications for cyber-physical systems
WO2023119968A1 (en) Method for calculating three-dimensional coordinates and device for calculating three-dimensional coordinates
JP5773935B2 (en) How to classify objects in a scene
JP2009003644A (en) Eye opening degree decision device
JP7124746B2 (en) Partial Object Position Estimation Program, Neural Network Structure for Partial Object Position Estimation, Partial Object Position Estimation Method, and Partial Object Position Estimation Apparatus
KR20210079137A (en) System and method for automatic recognition of user motion
KR20210091033A (en) Electronic device for estimating object information and generating virtual object and method for operating the same

Legal Events

Date Code Title Description
ENP Entry into the national phase (Ref document number: 2020552066; Country of ref document: JP; Kind code of ref document: A)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20943920; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20943920; Country of ref document: EP; Kind code of ref document: A1)