
WO2022153984A1 - Learning data generation method, model generation method, and learning data generation device - Google Patents

Learning data generation method, model generation method, and learning data generation device

Info

Publication number
WO2022153984A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
data
learning
learning data
data generation
Prior art date
Application number
PCT/JP2022/000607
Other languages
French (fr)
Japanese (ja)
Inventor
拓也 柴山
和良 長谷川
政人 峯岸
亮 坂本
Original Assignee
Preferred Networks, Inc.
Mitsui & Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Preferred Networks, Inc. and Mitsui & Co., Ltd.
Publication of WO2022153984A1 publication Critical patent/WO2022153984A1/en

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01V GEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
    • G01V1/00 Seismology; Seismic or acoustic prospecting or detecting
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01V GEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
    • G01V1/00 Seismology; Seismic or acoustic prospecting or detecting
    • G01V1/28 Processing seismic data, e.g. for interpretation or for event detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • the embodiment of the present invention relates to a learning data generation method, a model generation method, and a learning data generation device.
  • DNN: Deep Neural Network
  • The problem that the invention attempts to solve is to generate learning data concerning a structure.
  • The learning data generation method is a learning data generation method executed using at least one processor: a structural model is generated based on a plurality of structural features, simulated data representing observed values related to the structure is generated by a wave propagation simulation for the structural model, and learning data is generated by associating the generated structural model with the simulated data.
  • FIG. 1 is a block diagram showing an example of a hardware configuration of a learning system having a learning data generation device according to an embodiment.
  • FIG. 2 is a diagram showing an example of a functional block in a processor according to an embodiment.
  • FIG. 3 is a diagram showing an example of a generation area before generation of the underground structure model according to the embodiment.
  • FIG. 4 is a diagram showing an example of a generation area in which a plurality of strata are deposited according to the embodiment.
  • FIG. 5 is a diagram showing an example of a generation region in which a plurality of strata are folded according to the embodiment.
  • FIG. 6 is a diagram showing an example of a generation region in which faults are generated for a plurality of strata according to the embodiment.
  • FIG. 7 is a diagram showing an example of the generation region in which the strata shallower than the unconformity surface have been stripped away, according to the embodiment.
  • FIG. 8 is a diagram showing an example of a generation area in which a plurality of strata are re-deposited in the stripped region, according to the embodiment.
  • FIG. 9 is a diagram showing an example of a generation region in which a plurality of strata are folded and then a plurality of faults are generated according to the embodiment.
  • FIG. 10 is a diagram showing an example of a generation region in which folds and faults are generated for a plurality of strata according to the embodiment.
  • FIG. 11 is a diagram showing an example of a generation region intruded by rock salt according to the embodiment.
  • FIG. 12 is a diagram showing an example of an image (shot image) of common shot gather data according to the embodiment.
  • FIG. 13 is a flowchart showing an example of the procedure of the learning data generation processing according to the embodiment.
  • FIG. 14 is a diagram showing an example of a P-wave velocity (Vp) model generated by a parameter generation system according to an embodiment.
  • FIG. 15 is a diagram showing an example of a functional block in a processor mounted on a learning device according to an embodiment.
  • FIG. 16 is a diagram showing an example of estimation of P wave velocity (Vp) in the Marmousi2 geological structure model according to the embodiment.
  • FIG. 17 is a diagram showing an example of estimation of P wave velocity (Vp) in the 1994 Amoco static correction test data set according to the embodiment.
  • FIG. 18 is a diagram showing an outline of the generation of training data, the generation of an underground structure estimation model using the training data, and the underground structure estimation process using the generated underground structure estimation model, according to an application example of the embodiment.
  • FIG. 19 is a flowchart showing an example of a procedure of generating learning data, generating an underground structure estimation model using the learning data, and performing model generation estimation processing according to an application example of the embodiment.
  • the training data generation method is executed using, for example, at least one processor.
  • FIG. 1 is a block diagram showing an example of a hardware configuration of a learning system 1 having a learning data generation device 3 according to the present embodiment.
  • The learning system 1 includes a learning data generation device 3, a learning device 7 connected to the learning data generation device 3 via a communication network 5, an external device 9A connected to the learning data generation device 3 via the communication network 5, and an external device 9B connected via a device interface 39.
  • the learning system 1 generates a plurality of learning data by the learning data generation device 3.
  • the learning system 1 trains a deep neural network to be trained using a plurality of generated training data to generate a trained model.
  • The trained model is, for example, a model that estimates the structure of an observation target based on reflected waves that propagate inside the observation target after sound waves, electromagnetic waves, radiation, or the like are emitted toward it.
  • the structure is, for example, the internal structure of the observation target.
  • Observation targets include the subsurface, artificial structures such as pillars and bridges, clouds, and living organisms.
  • the trained model can be applied, for example, to non-destructive inspection, ultrasonic diagnosis of structures, echo inspection, submarine sonar, remote sensing, and the like.
  • the observation target will be described as being an underground structure.
  • the trained model is a model that outputs the underground structure to be observed by inputting seismic waves (elastic waves), electromagnetic waves, or radiation.
  • When the input is a seismic wave, the trained model is used for seismic exploration.
  • When the input is an electromagnetic wave (electromagnetic field), the trained model is used for electromagnetic exploration.
  • the trained model will be described as being used for seismic exploration.
  • the learning data generation device 3 has a computer 30 and an external device 9B connected to the computer 30 via the device interface 39. Further, the learning device 7 may be connected to the computer 30 via the device interface 39.
  • the computer 30 includes a processor 31, a main storage device (memory) 33, an auxiliary storage device (memory) 35, a network interface 37, and a device interface 39.
  • the learning data generation device 3 may be realized as a computer 30 in which the processor 31, the main storage device 33, the auxiliary storage device 35, the network interface 37, and the device interface 39 are connected via the bus 41.
  • the computer 30 may be mounted on the learning device 7.
  • The computer 30 shown in FIG. 1 includes one of each component, but may include a plurality of the same components. Further, although one computer 30 is shown in FIG. 1, the software may be installed on a plurality of computers, and each of the plurality of computers may execute the same or a different part of the software. In this case, distributed computing may be used, in which each computer communicates via the network interface 37 or the like to execute processing. That is, the learning data generation device 3 in the present embodiment may be configured as a system that realizes the various functions described later by one or more computers executing instructions stored in one or more storage devices.
  • The system may also be configured such that information transmitted from a terminal is processed by one or more computers provided on a cloud, and the processing result is transmitted to a terminal such as a display device (display unit) corresponding to the external device 9B.
  • the display device is realized by, for example, various displays.
  • Various operations of the learning data generation device 3 in the present embodiment may be executed in parallel using one or more processors, or using a plurality of computers via a network. Further, the various operations may be distributed to a plurality of arithmetic cores in the processor and executed in parallel. In addition, some or all of the processes, means, and the like of the present disclosure may be executed by at least one of a processor and a storage device provided on a cloud capable of communicating with the computer 30 via the network. As described above, the various operations described later in this embodiment may take the form of parallel computing by one or more computers.
  • The processor 31 is an electronic circuit including a control device and an arithmetic device of the computer 30 (a processing circuit, processing circuitry, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field-Programmable Gate Array), an ASIC (Application-Specific Integrated Circuit), etc.). Further, the processor 31 may be a semiconductor device or the like including a dedicated processing circuit. The processor 31 is not limited to an electronic circuit using electronic logic elements, and may be realized by an optical circuit using optical logic elements. Further, the processor 31 may include an arithmetic function based on quantum computing.
  • the processor 31 can perform arithmetic processing based on data and software (programs) input from each device or the like of the internal configuration of the computer 30 and output the arithmetic result or control signal to each device or the like.
  • the processor 31 may control each component constituting the computer 30 by executing an OS (Operating System) of the computer 30, an application, or the like.
  • the learning data generation device 3 in this embodiment may be realized by one or a plurality of processors 31.
  • The processor 31 may refer to one or more electronic circuits arranged on one chip, or to one or more electronic circuits arranged on two or more chips or two or more devices. When a plurality of electronic circuits are used, each electronic circuit may communicate by wire or wirelessly.
  • the main storage device 33 is a storage device that stores instructions executed by the processor 31, various data, and the like, and the information stored in the main storage device 33 is read out by the processor 31.
  • the auxiliary storage device 35 is a storage device other than the main storage device 33. Note that these storage devices mean any electronic component capable of storing electronic information, and may be a semiconductor memory.
  • the semiconductor memory may be either a volatile memory or a non-volatile memory.
  • The storage device that stores the various data used in the learning data generation device 3 in the present embodiment may be realized by the main storage device 33 or the auxiliary storage device 35, or by a built-in memory within the processor 31.
  • the storage unit in this embodiment may be realized by the main storage device 33 or the auxiliary storage device 35.
  • a plurality of processors may be connected (combined) to one storage device (memory), or a single processor 31 may be connected.
  • a plurality of storage devices (memory) may be connected (combined) to one processor.
  • The learning data generation device 3 in the present embodiment may be composed of at least one storage device (memory) and a plurality of processors connected (coupled) to the at least one storage device, and may include a configuration in which at least one of the plurality of processors is connected (coupled) to at least one storage device (memory). Further, this configuration may be realized by storage devices (memory) and processors included in a plurality of computers. Further, a configuration in which the storage device (memory) is integrated with the processor 31 (for example, a cache memory including an L1 cache and an L2 cache) may be included.
  • the network interface 37 is an interface for connecting to the communication network 5 wirelessly or by wire. As the network interface 37, an appropriate interface such as one conforming to an existing communication standard may be used. The network interface 37 may exchange information with the learning device 7 and the external device 9A connected via the communication network 5.
  • The communication network 5 may be any of a WAN (Wide Area Network), a LAN (Local Area Network), a PAN (Personal Area Network), or a combination thereof, as long as information can be exchanged between the computer 30 and the external device 9A.
  • An example of a WAN is the Internet.
  • Examples of a LAN include IEEE 802.11 and Ethernet (registered trademark).
  • Examples of a PAN include Bluetooth (registered trademark) and NFC (Near Field Communication).
  • The device interface 39 is an interface, such as a USB (Universal Serial Bus) interface, that directly connects to the external device 9B, for example an output device such as a display device, or an input device (input unit).
  • the output device may have a speaker or the like that outputs voice or the like.
  • the external device 9A is a device connected to the computer 30 via a network.
  • the external device 9B is a device that is directly connected to the computer 30.
  • the external device 9A or the external device 9B may be an input device as an example.
  • the input device is, for example, a device such as a camera, a microphone, a motion capture, various sensors, a keyboard, a mouse, or a touch panel, and gives the acquired information to the computer 30.
  • the external device 9A or the external device 9B may be a personal computer, a tablet terminal, a device having an input unit such as a smartphone, a memory, and a processor.
  • the external device 9A or the external device 9B may be an output device (output unit) as an example.
  • The output device may be, for example, a display device (display unit) such as an LCD (Liquid Crystal Display), a CRT (Cathode Ray Tube), a PDP (Plasma Display Panel), or an organic EL (Electro Luminescence) panel, or it may be a speaker or the like that outputs voice.
  • the external device 9A or the external device 9B may be a personal computer, a tablet terminal, a device having an output unit such as a smartphone, a memory, and a processor.
  • the external device 9A or the external device 9B may be a storage device (memory).
  • the external device 9A may be a network storage or the like, and the external device 9B may be a storage such as an HDD.
  • The external device 9A or the external device 9B may be a device that provides a part of the functions of the components of the learning data generation device 3 in the present embodiment. That is, the computer 30 may transmit a part or all of its processing to, or receive a part or all of the processing results from, the external device 9A or the external device 9B.
  • FIG. 2 is a diagram showing an example of a functional block in the processor 31.
  • the processor 31 has a setting unit 311, a determination unit 313, a model generation unit 315, a simulated data generation unit 317, and a learning data generation unit 319 as functions realized by the processor 31.
  • The functions realized by the setting unit 311, the determination unit 313, the model generation unit 315, the simulated data generation unit 317, and the learning data generation unit 319 are stored as programs in, for example, the main storage device 33 or the auxiliary storage device 35.
  • The processor 31 reads and executes a program stored in the main storage device 33, the auxiliary storage device 35, or the like, thereby realizing the functions of the setting unit 311, the determination unit 313, the model generation unit 315, the simulated data generation unit 317, and the learning data generation unit 319.
  • The setting unit 311 sets a range of feature values (hereinafter referred to as a feature range) for each of a plurality of features related to the underground structure.
  • features are geological parameters.
  • Geological structural elements described using geological parameters include deposition, folds, faults, erosion (stripping), redeposition, re-folding, re-faulting, rock salt intrusion, lateral bending of formations, and unconformity surfaces.
  • Geological parameters include, for example, the thickness of each deposited formation, the amplitude and wavelength of the folding of a formation (or the number of oscillations of the formation per unit length), the angle of a fault with respect to the horizontal, the length of a fault, the bending of a fault (the degree of lateral (horizontal) bending of the fault), the depth-dependent amplification of fault displacement, the P-wave velocity, the ratio of P-wave velocity to S-wave velocity, the distribution position of a rock salt dome, the P-wave velocity of a rock salt dome, the amount of stratum removed by erosion, the depth of an unconformity surface, and the depth dependence of rock velocity.
  • The setting unit 311 can also make settings for features that do not occur simultaneously in geology. For example, for a feature related to a fault, the setting unit 311 sets either a normal fault or a reverse fault.
  • the setting unit 311 sets a plurality of feature ranges for a plurality of feature quantities by default settings stored in the main storage device 33 or the auxiliary storage device 35.
  • the default setting is, for example, a combination of multiple feature ranges covering all geological patterns.
  • the default setting is not limited to one, and a plurality of default settings may be set according to a plurality of regions.
  • the default setting is set in advance according to the geological information according to the area including the area to be investigated of the underground structure (hereinafter referred to as the investigation area).
  • Geological information is various data acquired in the past in that area, for example, logging data from wells excavated in the area and observation data on the underground structure of the area.
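The region-dependent default settings described above can be sketched as a mapping from region names to feature ranges, with a generic fallback that covers all geological patterns. This is a hypothetical illustration: the region names, feature names, and numeric ranges below are assumptions, not values from the embodiment.

```python
# Hypothetical sketch of default settings: each region name maps to feature
# ranges (lower, upper); names and numbers are illustrative assumptions.
DEFAULT_SETTINGS = {
    "generic": {                       # broad ranges covering all geological patterns
        "fold_amplitude_m": (0.0, 500.0),
        "fault_dip_deg": (20.0, 80.0),
        "salt_vp_mps": (4200.0, 4700.0),
    },
    "passive_margin": {                # narrower ranges tuned from geological information
        "fold_amplitude_m": (0.0, 150.0),
        "fault_dip_deg": (45.0, 70.0),
        "salt_vp_mps": (4400.0, 4600.0),
    },
}

def ranges_for(region: str) -> dict:
    """Return the feature ranges for a region, falling back to the generic
    default that covers all patterns."""
    return DEFAULT_SETTINGS.get(region, DEFAULT_SETTINGS["generic"])
```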
  • the setting unit 311 sets an area (hereinafter, referred to as a generation area) in which a model of the underground structure (hereinafter, referred to as an underground structure model) is generated by the model generation unit 315.
  • The generation region is, for example, a region schematically represented by a horizontal length and a depth.
  • The setting unit 311 displays the feature ranges set by the default setting on the display device. Specifically, the display device displays the ranges corresponding to the plurality of features and, for each range, two indicators for changing its upper and lower limit values. The setting unit 311 may change a feature range as appropriate according to an instruction from the user via the input device. In addition, the setting unit 311 may display a radio button that allows the user to choose between a normal fault and a reverse fault; the setting unit 311 then sets the normal fault or the reverse fault according to the user's selection via the radio button.
  • The setting unit 311 may display, on the display device, the feature ranges set by the default setting together with, for example, a model of the underground structure generated by the model generation unit 315 using representative values of the feature ranges (hereinafter referred to as a confirmation model).
  • the display device displays a plurality of the ranges corresponding to the plurality of feature quantities, two indicators in each of the plurality of ranges, and a confirmation model.
  • The setting unit 311 displays the confirmation model on the display device with a predetermined hue that changes from blue to yellow along the depth direction according to the magnitude of the P-wave velocity of each stratum in the confirmation model.
  • When a feature range is adjusted, the setting unit 311 displays, on the display device, the confirmation model regenerated using representative values changed according to the adjustment, together with the changed feature range.
  • the determination unit 313 determines the model generation parameters, which are the information necessary for generating the underground structure model, based on the feature range and the random numbers for the plurality of feature quantities.
  • the determination unit 313 determines a value in the feature range as a model generation parameter, for example, with respect to a plurality of feature quantities, based on the feature range and a random number.
  • A model of the underground structure generated using the model generation parameters within the feature ranges is hereinafter referred to as an underground structure model.
  • the determination unit 313 uses the underground model ID as a seed for random numbers to generate random numbers.
  • the determination unit 313 determines the model generation parameters within the set feature range based on the set feature range and the generated random numbers for the plurality of feature quantities. Specifically, the determination unit 313 sets a probability distribution within the feature range, and uses random numbers for the set probability distribution to determine model generation parameters within the feature range.
  • the probability distribution is, for example, a uniform distribution, but is not limited to this, and other probability distributions such as a Poisson distribution may be used.
  • the probability distribution may be set by, for example, the setting unit 311.
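The determination step above can be sketched as follows: the underground model ID seeds the random number generator, and one model generation parameter is drawn per feature from a uniform distribution over its feature range. The feature names and numeric ranges below are illustrative assumptions, not values from the embodiment.

```python
import numpy as np

# Illustrative feature ranges (lower, upper); names and numbers are assumptions.
FEATURE_RANGES = {
    "layer_thickness_m": (50.0, 300.0),      # thickness of each deposited layer
    "fold_amplitude_m": (0.0, 400.0),        # vertical amplitude of folding
    "fold_wavelength_m": (2000.0, 10000.0),  # horizontal wavelength of folding
    "fault_dip_deg": (30.0, 70.0),           # fault angle from horizontal
    "vp_top_mps": (1500.0, 2500.0),          # P-wave velocity of the top layer
}

def sample_parameters(model_id: int, ranges=FEATURE_RANGES) -> dict:
    """Draw one model-generation parameter per feature, uniformly within its
    feature range, seeded by the underground model ID so that the same ID
    always regenerates the same parameters."""
    rng = np.random.default_rng(model_id)
    return {name: float(rng.uniform(lo, hi)) for name, (lo, hi) in ranges.items()}
```

Because the model ID is the seed, regenerating a model from its ID reproduces the identical parameter set, matching the reproducibility described for the determination unit 313.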
  • the model generation unit 315 generates a structural model based on a plurality of structural features. Specifically, the model generation unit 315 generates an underground structure model using the model generation parameters determined by the determination unit 313. In addition, the model generation unit 315 generates a model for confirming the underground structure using representative values in the range specified by the two indicators.
  • FIGS. 3 to 11 are diagrams showing an example of the process of generating the underground structure model by the model generation unit 315. Although the generation area 10 in FIGS. 3 to 11 indicates an area in which a two-dimensional underground structure model is generated, the model generation unit 315 may generate a three-dimensional underground structure model. The model generation unit 315 may also generate a plurality of structural models using different features.
  • FIG. 3 is a diagram showing an example of the generation area 10 before the generation of the underground structure model.
  • the model generation unit 315 generates a generation region 10 filled with zeros as a feature amount.
  • Legend 11 shown in FIG. 3 shows the P wave velocity.
  • the upper end of the generation area 10 shown in FIG. 3 indicates the ground surface.
  • The model generation unit 315 deposits a plurality of strata in the generation region 10 using the determined model generation parameters. For example, the model generation unit 315 places the strata in the generation region 10 using the thickness and the P-wave velocity of each of the plurality of strata.
  • FIG. 4 is a diagram showing an example of a generation area 10 in which a plurality of strata are deposited. As shown in FIG. 4, a plurality of strata having different P-wave velocities are deposited in the generation region 10.
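The deposition step can be sketched as stacking horizontal layers into a 2D grid of P-wave velocities, one layer per (thickness, velocity) pair. This is a minimal sketch under assumed units: a regular grid with vertical spacing `dz` in meters and velocities in m/s.

```python
import numpy as np

def deposit_strata(nz: int, nx: int, thicknesses, velocities, dz: float = 10.0):
    """Return a (nz, nx) P-wave velocity grid with horizontal strata stacked
    from the surface (row 0) downward; cells below the last listed stratum
    keep the deepest stratum's velocity."""
    vp = np.zeros((nz, nx))
    z_top = 0.0
    for thick, vel in zip(thicknesses, velocities):
        i0 = int(z_top / dz)
        i1 = min(nz, int((z_top + thick) / dz))
        vp[i0:i1, :] = vel
        z_top += thick
    vp[int(z_top / dz):, :] = velocities[-1]  # extend the deepest stratum to the bottom
    return vp
```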
  • the model generation unit 315 folds a plurality of strata using the determined model generation parameters in the generation region 10 in which the plurality of strata are deposited. For example, the model generation unit 315 bends a plurality of strata in the generation region 10 by using the wavelength of the fold and the amplitude of the fold.
  • FIG. 5 is a diagram showing an example of a generation region 10 in which a plurality of strata are folded. As shown in FIG. 5, the amplitude of the fold can also be varied with depth; for example, the amplitude of a fold increases the shallower it is. In addition, folding corresponds to pulling a plurality of strata vertically upward, as shown in FIG. 5. Therefore, the deepest part of the generation area 10 is filled with the stratum that was there before the fold.
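The folding step can be sketched as a vertical upward shift of each column by a sinusoidal amount, with the deepest cells refilled by the pre-fold deepest stratum. This sketch assumes a depth-independent shift per column for simplicity (the embodiment also allows depth-dependent amplitude), and assumed grid spacings `dx` and `dz` in meters.

```python
import numpy as np

def apply_fold(vp, amplitude_m, wavelength_m, dx=25.0, dz=10.0):
    """Fold a layered velocity model by pulling each column vertically upward
    by a sinusoidal amount in [0, amplitude_m]; the deepest cells are
    refilled with the pre-fold deepest stratum."""
    nz, nx = vp.shape
    out = vp.copy()
    for ix in range(nx):
        # non-negative sinusoidal lift across the horizontal axis
        lift_m = amplitude_m * (0.5 + 0.5 * np.sin(2 * np.pi * ix * dx / wavelength_m))
        shift = int(round(lift_m / dz))
        if shift > 0:
            out[: nz - shift, ix] = vp[shift:, ix]   # strata move upward
            out[nz - shift :, ix] = vp[-1, ix]       # bottom refilled with deepest stratum
    return out
```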
  • the model generation unit 315 generates a fault for a plurality of strata using the determined model generation parameters in the generation region 10 having a plurality of folded strata. For example, the model generation unit 315 forms a fault in a plurality of strata by using the position of the fault, the angle of the fault, the amount of displacement of the fault, the degree of bending of the fault, and the like.
  • FIG. 6 is a diagram showing an example of a generation region 10 in which faults are generated for a plurality of strata. Usually, there are many cases where normal faults or reverse faults are consistently present in one area. Therefore, as shown in FIG. 6, the model generation unit 315 generates one of the normal fault and the reverse fault according to the setting by the user or the default setting.
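The faulting step can be sketched as a vertical displacement of one side of a planar dipping fault. The sketch below drops the hanging wall by the fault throw (a normal-fault-style offset); fault position, dip, and throw stand in for the model generation parameters, and the grid spacings are assumptions.

```python
import numpy as np

def apply_fault(vp, x_fault_m, dip_deg, throw_m, dx=25.0, dz=10.0):
    """Displace the hanging-wall side of a planar fault downward by throw_m.
    The fault trace starts at x_fault_m at the surface and shifts laterally
    with depth according to the dip angle."""
    nz, nx = vp.shape
    out = vp.copy()
    slope = 1.0 / np.tan(np.radians(dip_deg))  # horizontal run per unit depth
    shift = int(round(throw_m / dz))
    for ix in range(nx):
        for iz in range(nz):
            if ix * dx > x_fault_m + slope * iz * dz:   # hanging-wall side
                src = iz - shift                        # material comes from shallower
                out[iz, ix] = vp[src, ix] if src >= 0 else vp[0, ix]
    return out
```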
  • The model generation unit 315 executes a stripping (erosion) process using the determined model generation parameters in the generation area 10 having a plurality of faulted strata. For example, the model generation unit 315 removes all strata shallower than the depth of the erosion base from the generation region 10. Specifically, the model generation unit 315 fills the region of all strata shallower than the depth of the erosion base (hereinafter referred to as the stripped region) with 0.
  • FIG. 7 is a diagram showing an example of a generation region 10 in which all strata shallower than the depth of the erosion base have been stripped away. At this time, as shown in FIG. 7, the model generation unit 315 may form an unconformity surface in the region 13 directly above the erosion base.
  • The model generation unit 315 arranges a plurality of strata in the stripped region 15 of the generation region 10 using the thickness of each of the plurality of strata, the P-wave velocity of each of the plurality of strata, and the like.
  • FIG. 8 is a diagram showing an example of a generation region 10 in which a plurality of strata are re-deposited in the stripped region 15. As shown in FIG. 8, a plurality of strata having different P-wave velocities are deposited in the stripped region 15.
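The stripping and redeposition steps can be sketched together: cells shallower than the erosion base are first zeroed out (the stripped region), then new horizontal strata are deposited into that region, creating an unconformity at the erosion base. The sketch assumes a flat erosion base, whereas in the embodiment the base may cut across folded strata.

```python
import numpy as np

def erode_and_redeposit(vp, erosion_base_m, new_thicknesses, new_velocities, dz=10.0):
    """Zero all cells shallower than the erosion base, then re-deposit new
    horizontal strata into the stripped region; any stripped cells left
    unfilled stay 0."""
    out = vp.copy()
    i_base = int(erosion_base_m / dz)
    out[:i_base, :] = 0.0                      # stripped region
    z_top = 0.0
    for thick, vel in zip(new_thicknesses, new_velocities):
        i0 = int(z_top / dz)
        i1 = min(i_base, int((z_top + thick) / dz))
        out[i0:i1, :] = vel                    # new stratum above the unconformity
        z_top += thick
        if i1 >= i_base:
            break
    return out
```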
  • FIG. 9 is a diagram showing an example of a generation region 10 in which a plurality of strata are folded. The fold corresponds to pulling up a plurality of formations vertically upwards, as shown in FIG. Therefore, as shown in FIG. 9, the deepest part in the generation region 10 is filled with the stratum before the fold as in FIG.
  • the model generation unit 315 forms a fault in the generation region 10 having a plurality of folded strata, using the position of the fault, the angle of the fault, the displacement amount of the fault, the degree of bending of the fault, and the like.
  • FIG. 10 is a diagram showing an example of a generation region 10 in which faults are generated for a plurality of strata.
  • The model generation unit 315 intrudes rock salt into the generation region 10 using the determined model generation parameters.
  • The model generation unit 315 places the rock salt so that it intrudes into the generation region 10, using the distribution position of the salt dome and the P-wave velocity of the salt dome.
  • FIG. 11 is a diagram showing an example of a generation region 10 intruded by rock salt 17.
  • The geological events (deposition, folding, faulting, erosion, redeposition, fault reactivation (re-folding, re-faulting), and rock salt intrusion) performed by the model generation unit 315 are carried out along the time series of their occurrence. The order of the geological events can therefore be changed as appropriate. Geological events are not limited to deposition, folds, faults, erosion, redeposition, fault reactivation (re-folding, re-faulting), and rock salt intrusion; other events may also be performed.
  • the model generation unit 315 cuts out a predetermined range from the generation area 10 in which the rock salt 17 has penetrated.
  • the predetermined range is, for example, in the generation region 10 shown in FIG. 11, the depth is 0 to 5 km and the length in the horizontal direction is 0 to 25 km.
  • the predetermined range may be appropriately set / changed by the setting unit 311 under the instruction of the user via the input device.
  • the model generation unit 315 generates an underground structure model by cutting out from the generation area 10.
  • the model generation unit 315 generates a plurality of underground structure models according to the generation of random numbers. Since the random number seed corresponds to the underground model ID, the model generation unit 315 can regenerate the underground structure model according to the selection of the underground model ID.
  • the simulated data generation unit 317 generates simulated data that simulates the observed values related to the structure by wave propagation simulation for the model of the structure.
  • the wave propagation simulation is, for example, a simulation related to seismic waves (hereinafter referred to as seismic wave simulation).
  • the simulated data generation unit 317 executes a seismic wave propagation simulation on the underground structure model generated by the model generation unit 315.
  • for the seismic wave propagation simulation, for example, a known technique generally used as a wave propagation simulation for elastic waves or acoustic waves can be appropriately used.
  • the seismic wave propagation simulation is realized, for example, by applying the initial conditions and the boundary conditions to the partial differential equations constituting the equation of motion (wave equation) of the elastic body and solving them sequentially.
  • the simulated data generation unit 317 inputs, for example, an underground structure model to a seismic simulator and executes a seismic wave propagation simulation. As a result, the simulated data generation unit 317 generates simulated data that simulates the observed values related to the underground structure of the underground structure model.
  • the simulated data includes, for example, shot data (shot data of the underground structure) in which seismic waves generated by an artificial earthquake (shot) are virtually propagated to an underground structure model and received by a seismograph.
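As an illustrative sketch of the idea above, the following minimal 1D acoustic finite-difference scheme propagates a wave through a two-layer velocity model and records a single surface trace, imitating one trace of a shot record. The function name, grid parameters, and Ricker-style source term are assumptions for illustration; an actual simulator solves the full 2D/3D elastic equations with absorbing boundaries, as described above.

```python
import numpy as np

def simulate_shot_1d(velocity, dx=10.0, dt=0.001, nt=600, src=0):
    """Minimal 1D acoustic finite-difference time stepping: u_tt = c^2 u_xx.

    Records the wavefield at the surface (index 0), imitating a
    single-trace shot record. Illustrative only: boundaries are
    reflecting, not absorbing.
    """
    nx = len(velocity)
    u_prev = np.zeros(nx)
    u_curr = np.zeros(nx)
    trace = np.zeros(nt)
    c2 = (np.asarray(velocity) * dt / dx) ** 2  # squared CFL factor (stable if <= 1)
    for it in range(nt):
        lap = np.zeros(nx)
        lap[1:-1] = u_curr[2:] - 2 * u_curr[1:-1] + u_curr[:-2]
        u_next = 2 * u_curr - u_prev + c2 * lap
        # Ricker-like source wavelet injected at the surface (the "shot")
        t = it * dt
        arg = (np.pi * 25.0 * (t - 0.04)) ** 2
        u_next[src] += (1.0 - 2.0 * arg) * np.exp(-arg)
        u_prev, u_curr = u_curr, u_next
        trace[it] = u_curr[0]
    return trace

# Two-layer velocity model: the velocity contrast produces a reflection.
model = np.concatenate([np.full(100, 1500.0), np.full(100, 3000.0)])
shot = simulate_shot_1d(model)
```

The recorded trace plays the role of one column of the shot data described above.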
  • the simulated data generation unit 317 may generate simulated data by executing a wave propagation simulation on a plurality of structural models.
  • FIG. 12 is a diagram showing an example of an image (shot image) 19 of common shot gather data.
  • the vertical axis of the shot image 19 shown in FIG. 12 corresponds to the time elapsed from the execution of the shot, and the horizontal axis of the shot image 19 indicates the horizontal position in the underground structure model used to generate the shot image 19.
  • the learning data generation unit 319 generates learning data by associating the generated model of the structure with the simulated data. Specifically, the learning data generation unit 319 associates the plurality of underground structure models generated according to the random numbers with the plurality of simulated data generated from those models, according to the input/output relationship of the seismic wave propagation simulation. The learning data generation unit 319 generates a plurality of learning data from the associated plurality of underground structure models and plurality of simulated data, and stores the plurality of learning data in the main storage device (memory) 33 or the auxiliary storage device (memory) 35. The learning data generation unit 319 may store the plurality of learning data in the external device A serving as network storage. The learning data generation unit 319 may also generate learning data by associating a plurality of structural models with the simulated data corresponding to them.
  • the components of the learning data generator 3 have been described above.
  • the procedure of the process of generating the learning data by the learning data generation device 3 (hereinafter, referred to as the learning data generation process) will be described.
  • the procedure of the training data generation process corresponds to the training data generation method.
  • the training data generation method generates a structural model based on a plurality of features related to the structure, generates simulated data that simulates the observed values related to the structure by wave propagation simulation for the structural model,
  • and generates training data by associating the generated structural model with the simulated data.
  • the structure is an underground structure
  • the plurality of features include geological information.
  • Geological information includes, for example, any one of: how the strata bend laterally, how unconformity surfaces are inserted, the size (width, depth) of the structure for modeling the underground structure, the insertion of low-velocity layers near the surface, or the layer thickness distribution.
  • the training data generation method determines a plurality of feature quantities based on a range of feature quantity values and a random number.
  • the structure is an underground structure, and the range of feature value values is set based on geological information in the target area.
  • the geological information has at least one of logging data in the target area, observation data on the underground structure in the target area, and spatial distribution of feature quantities inferred in the target area.
  • the structure is an underground structure
  • as shown in FIGS. 3 to 11, the learning data generation method generates a structural model by stepwise execution of events related to at least any one of deposition, folding, faulting, erosion, redeposition, refolding, refaulting, or rock salt intrusion, based on the plurality of feature quantities.
  • the training data generation method generates a model of the structure by executing events in the order of, for example, deposition, folding, and faulting.
  • Alternatively, the learning data generation method may generate a structural model by executing events in the order of deposition, folding, faulting, erosion, redeposition, refolding, refaulting, and rock salt intrusion.
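The stepwise execution of geological events can be sketched as a pipeline of transformations applied to a gridded velocity model. The `deposit`, `fold`, and `fault` functions below are hypothetical toy stand-ins for the event steps performed by the model generation unit 315; erosion, redeposition, and rock salt intrusion would be further transformations in the same chain.

```python
import numpy as np

def deposit(nz, nx, v_top=1500.0, dv=250.0, layer_nz=10):
    """Stack flat layers, velocity increasing stepwise with depth."""
    model = np.zeros((nz, nx))
    for iz in range(nz):
        model[iz, :] = v_top + dv * (iz // layer_nz)
    return model

def fold(model, amp=5, wavelength=60):
    """Bend the strata by shifting each column vertically along a sine."""
    out = np.empty_like(model)
    for ix in range(model.shape[1]):
        shift = int(amp * np.sin(2 * np.pi * ix / wavelength))
        out[:, ix] = np.roll(model[:, ix], shift)
    return out

def fault(model, pos, throw=6):
    """Drop the block to the right of column `pos` down by `throw` cells."""
    out = model.copy()
    out[:, pos:] = np.roll(model[:, pos:], throw, axis=0)
    return out

# Stepwise execution in the order deposition -> folding -> faulting.
model = deposit(nz=100, nx=200)
model = fold(model)
model = fault(model, pos=120)
```

Because each event is a function of the model produced by the previous event, reordering or adding events (as the text allows) amounts to reordering the pipeline.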
  • FIG. 13 is a flowchart showing an example of the procedure of the learning data generation process.
  • the setting unit 311 sets a feature range for each of a plurality of feature quantities related to the underground structure.
  • the upper and lower limits of the feature range are set by default settings.
  • the feature range may be set based on, for example, geological information, according to the user's instruction via the input device.
  • the function executed by the setting unit 311 may instead be implemented in another device, such as the external device 9A.
  • the determination unit 313 determines the underground model ID.
  • the determination unit 313 uses the number of the underground model ID as a seed for the random number to generate a random number.
  • the determination unit 313 determines the model generation parameters in the feature range based on the feature range and the random number for the plurality of feature quantities.
  • the determined model generation parameters are stored in the main storage device (memory) 33 or the auxiliary storage device (memory) 35.
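A minimal sketch of this determination step, assuming hypothetical feature names and ranges: the underground model ID seeds the random number generator, so the same ID always reproduces the same model generation parameters, as described above.

```python
import random

# Hypothetical feature ranges (min, max); the actual ranges would be
# set by the setting unit 311 from geological information.
FEATURE_RANGES = {
    "layer_thickness_m": (50.0, 400.0),
    "fold_amplitude_m": (0.0, 300.0),
    "fault_throw_m": (0.0, 150.0),
}

def determine_parameters(underground_model_id):
    """Use the underground model ID as the random seed so that the same
    ID always reproduces the same model generation parameters."""
    rng = random.Random(underground_model_id)
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in FEATURE_RANGES.items()}

params_a = determine_parameters(42)
params_b = determine_parameters(42)   # same ID -> identical parameters
params_c = determine_parameters(43)   # different ID -> different parameters
```

This mirrors the uniform-distribution sampling within set ranges described later for the determination unit 313.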
  • the model generation unit 315 generates an underground structure model using the model generation parameters. Specifically, the model generation unit 315 generates a plurality of underground structure models in response to a plurality of random numbers. The model generation unit 315 stores a plurality of underground structure models in the main storage device (memory) 33 or the auxiliary storage device (memory) 35.
  • the simulated data generation unit 317 executes a seismic wave propagation simulation on the underground structure model and generates simulated data corresponding to the underground structure model.
  • the simulated data generation unit 317 stores the simulated data generated according to the underground structure model in the main storage device (memory) 33 or the auxiliary storage device (memory) 35.
  • the learning data generation unit 319 generates a plurality of learning data by associating a plurality of underground structure models generated according to random numbers with a plurality of simulated data generated according to the plurality of underground structure models.
  • the learning data generation unit 319 stores a plurality of learning data in the main storage device (memory) 33, the auxiliary storage device (memory) 35, or the external device A as network storage.
  • one learning data generation device 3 generates, for example, 500,000 or more learning data for one underground model ID, and stores the generated learning data.
  • FIG. 14 is a diagram showing an example of a velocity model as an example of an underground structure model. That is, FIG. 14 shows an example of a P-wave velocity (Vp) model generated by the parameter generation system. Seismic wave propagation in the generated underground structure model is simulated to create shot gathers (simulated data) corresponding to the velocity model.
  • Vp: P-wave velocity
  • This system incorporates multiple probability distributions to draw a large number of samples of the geological parameters and generates large-scale training data covering a wide variety of underground structures.
  • the determination unit 313 uses a uniform distribution having upper and lower limits of geological parameters.
  • the range of geological parameters is set by the setting unit 311 based on geological insights such as, for example, a rough estimate of the velocity structure.
  • geological insight into the target subsurface can be obtained using geological surveys such as analysis results from conventional inverse-problem methods and nearby logging data.
  • High quality training data is generated by making reasonable assumptions about these geological parameters.
  • the learning data generation process by the learning data generation device 3 has been described above.
  • the learning device 7 will be described. Since the hardware configuration of the learning device 7 is the same as that in the frame of the dotted line 3 in FIG. 1, the description thereof will be omitted.
  • the learning device 7 learns a deep neural network using a plurality of learning data.
  • the deep neural network is an example of a model (hereinafter referred to as an estimation model) that estimates information about a structure.
  • the learning device 7 uses the learning data generated by the above-mentioned learning data generation method to generate an estimation model that estimates information about the structure. That is, the model generation method for generating the estimation model using the training data generated by the training data generation method described above is executed by using at least one processor in the learning device 7.
  • FIG. 15 is a diagram showing an example of a functional block in the processor 81 mounted on the learning device 7.
  • the processor 81 has a preprocessing unit 811, a model setting unit 813, and a learning unit 815 as functions realized by the processor 81.
  • the functions realized by the preprocessing unit 811, the model setting unit 813, and the learning unit 815 are stored as programs in, for example, a main storage device or an auxiliary storage device mounted on the learning device 7.
  • the processor 81 reads and executes the programs stored in the main storage device or the auxiliary storage device mounted on the learning device 7, thereby realizing the functions of the preprocessing unit 811, the model setting unit 813, and the learning unit 815.
  • the preprocessing unit 811 increases the number of simulated data corresponding to one underground structure model according to noise addition, various settings at the time of data acquisition in seismic survey, and the like.
  • the noise added to the simulated data set includes, for example, noise caused by a sensor that cannot receive vibration among a plurality of vibration receiving sensors in seismic survey, noise caused by a vehicle passing through a road in the survey area, noise caused by a test drilling well installed in the survey area, and the like.
  • the various settings include, for example, the positional relationship of the receiving sensors with respect to the seismic source vehicle, the method of generating shots by the seismic source vehicle, and the like.
  • the preprocessing unit 811 augments the number of simulated data for one underground structure model in the plurality of training data (augmentation). That is, as a result of the augmentation of the simulated data by the preprocessing unit 811, a plurality of simulated data correspond to one underground structure model (correct-answer data).
  • the preprocessing unit 811 stores the plurality of learning data increased by the augmentation in the main storage device or the auxiliary storage device in the learning device 7.
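A minimal sketch of this augmentation, assuming simple additive Gaussian noise; the actual noise models described above (dead receivers, road traffic, test wells) would be more structured.

```python
import numpy as np

def augment(shot, n_copies=4, noise_std=0.05, seed=0):
    """Produce several noisy variants of one simulated shot record, so
    that multiple inputs share one underground structure model (the
    correct-answer label)."""
    rng = np.random.default_rng(seed)
    scale = noise_std * np.abs(shot).max()   # noise relative to signal amplitude
    return [shot + rng.normal(0.0, scale, size=shot.shape)
            for _ in range(n_copies)]

clean = np.sin(np.linspace(0.0, 20.0, 500))  # stand-in for one shot trace
noisy_set = augment(clean)
```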
  • the model setting unit 813 sets a pre-learning model to be trained using a plurality of training data sets among a plurality of training data.
  • the pre-learning model is, for example, a deep neural network.
  • the model setting unit 813 divides the plurality of training data into a plurality of training data sets and a plurality of verification data sets used for verification of the trained model.
  • the model setting unit 813 extracts, from the plurality of training data sets, a plurality of model setting data sets used for setting the pre-training model and a plurality of model verification data sets used for verifying the set model.
  • the plurality of data sets after extracting the model setting data set and the model verification data set from the plurality of training data sets will be referred to as the extracted data sets.
  • the model setting unit 813 sets the pre-learning model using the plurality of model setting data sets and the plurality of model verification data sets. For example, the model setting unit 813 applies a neural architecture search (NAS) using a hyperparameter automatic optimization framework to the plurality of model setting data sets and the plurality of model verification data sets. As a result, the model setting unit 813 sets the structure of the pre-learning model and its hyperparameters as the pre-learning model.
  • NAS for example, Optuna (registered trademark) is used.
  • the NAS is not limited to Optuna (registered trademark), and other architectures may be used.
  • the model setting unit 813 may set the model of the deep neural network suitable for exploration and the hyperparameters of the deep neural network according to the instruction of the user via the input device.
  • NAS automatically designs a neural network suitable for a given task.
  • the NAS uses the plurality of model setting data sets and the plurality of model verification data sets to repeatedly sample, train, and evaluate candidate architectures, thereby finding the optimal architecture in the search space.
  • This architecture is defined by model hyperparameters, including the number of layers and channels.
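The sample-train-evaluate loop can be sketched as follows. This is a plain random-search stand-in, not the Optuna API used by the embodiment, and `validation_error` is a placeholder for actually training a candidate network on the model setting dataset and scoring it on the model verification dataset.

```python
import random

def validation_error(num_layers, num_channels):
    """Placeholder objective: a real NAS would train the candidate
    network and return its error on the model verification dataset.
    Here an arbitrary smooth stand-in is used for illustration."""
    return abs(num_layers - 100) / 100 + abs(num_channels - 64) / 64

def random_search(n_trials=50, seed=0):
    """Sample candidate architectures from the search space, evaluate
    each, and keep the best one."""
    rng = random.Random(seed)
    best_err, best_arch = None, None
    for _ in range(n_trials):
        cand = {"num_layers": rng.randint(10, 150),
                "num_channels": rng.choice([16, 32, 64, 128])}
        err = validation_error(**cand)
        if best_err is None or err < best_err:
            best_err, best_arch = err, cand
    return best_arch

best_arch = random_search()
```

A framework such as Optuna replaces the uniform sampler with a smarter sampler and parallelizes the trials, but the loop structure is the same.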
  • the model setting unit 813 adopts Optuna (registered trademark), which is a hyperparameter automatic optimization framework.
  • Optuna® is a hyperparameter automatic optimization framework.
  • the user-friendly interface of Optuna® enables simple and automatic hyperparameter optimization via parallel processing, reducing calculation time.
  • the model setting unit 813 defines the search space of the neural architecture as an encoder-decoder model based on ResNet.
  • ResNet is a deep learning model used in both seismic inverse problems and computer vision.
  • an encoder-decoder model associates inputs and outputs through a common latent feature space.
  • a convolutional neural network typically captures spatially local features.
  • the encoder and the decoder are connected by a common latent feature space. Therefore, the encoder-decoder model can learn spatially global features that are not tied to the local structure of the input simulated data (shot image). Learning spatially global features is useful for the seismic inverse problem.
  • the learning unit 815 learns the pre-learning model set by the model setting unit 813 using the extracted data set. For example, the learning unit 815 inputs each of the plurality of simulated data in the extracted data set into the pre-learning model.
  • the learning unit 815 adjusts the weights of the pre-learning model using, for example, stochastic gradient descent with backpropagation, so as to reduce the difference between the output from the pre-learning model and the underground structure model corresponding to the simulated data input to the pre-learning model.
  • the learning unit 815 verifies the weight-adjusted pre-learning model with the verification data set. As a result, the learning unit 815 generates a trained model (estimated model).
  • the learning unit 815 stores the generated learned model in the main storage device or the auxiliary storage device in the learning device 7.
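The weight-adjustment principle above can be sketched on a toy linear model (the actual network is a deep encoder-decoder): mini-batches are sampled, the prediction error gives a gradient, and the weights are moved against the gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))      # stand-in for simulated-data features
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                     # stand-in for velocity-model targets

w = np.zeros(3)                    # weights to be adjusted
lr = 0.1                           # learning rate
for _ in range(200):
    batch = rng.integers(0, len(X), size=32)            # stochastic mini-batch
    pred = X[batch] @ w
    grad = 2 * X[batch].T @ (pred - y[batch]) / len(batch)  # gradient of MSE
    w -= lr * grad                                      # descent step

final_loss = float(np.mean((X @ w - y) ** 2))
```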
  • the example of generating a trained model by the learning device 7 has been described above.
  • an example of the process of estimating the underground structure with the trained model (hereinafter referred to as the underground structure estimation process) will be described.
  • shot data related to the underground structure to be estimated has been acquired in advance.
  • data cleansing such as noise reduction and outlier removal may be appropriately performed on the raw data or shot data.
  • the estimation device stores the trained model in the main storage device or the auxiliary storage device in the estimation device.
  • the estimation device may store a plurality of trained models generated according to the region. At this time, the estimation device selects the trained model according to the instruction of the user via the input device.
  • the estimation device estimates the underground structure to be estimated by inputting shot data into the trained model. As a result, the estimation device estimates the underground structure corresponding to the input shot data.
  • the estimated underground structure data may be post-processed as appropriate.
  • the estimation device stores the data of the estimated underground structure in the main storage device or the auxiliary storage device in the estimation device. At this time, the estimation device may display the data of the estimated underground structure on the display unit (display) provided in the estimation device.
  • the various processes in the embodiment apply to the inverse problem of estimating the velocity structure of each layer in the underground cross section from a two-dimensional (2D) shot gather image.
  • the experiment comprises four steps: training data generation, training data division, NAS, and evaluation of the optimal neural architecture.
  • a training dataset of 300,000 pairs was created, each pair having a velocity model corresponding to an underground structure model and shot gathers (simulated data) corresponding to the velocity model.
  • the generation parameters (geological parameters) that control the geological properties of the velocity model were determined based on geological insights from the Marmousi2 geological structure model (Martin, G. S., Wiley, R., and Marfurt, K. J. [2006] Marmousi2: An elastic upgrade for Marmousi. The Leading Edge, 25(2): 156-166.).
  • the summary information about the geological properties was used to set the feature ranges, but the dataset itself (the benchmark dataset used to benchmark the trained model) was not used. Seismic wave propagation was simulated on a supercomputer.
  • Training data division was applied to prevent data leakage during NAS processing.
  • the 300,000 training data were divided into a large training data set of 240,000 samples and a large validation data set of 60,000 samples. These large datasets were used to train optimal neural architectures.
  • 10,000 data were sampled from the large training dataset, and this subset was divided into a small model setting dataset of 8,000 samples and a small model verification dataset of 2,000 samples. These small datasets were used to find the optimal neural architecture. By partitioning the dataset in this way, unintended access (data leakage) to the large validation dataset during the NAS step is avoided.
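A scaled-down sketch of this leakage-free partitioning (counts reduced from 300,000/240,000/60,000 and 10,000/8,000/2,000 for illustration): the NAS subset is drawn from the training split only, so the large validation set is never touched during NAS.

```python
import random

def split_dataset(n_total, n_train, n_val, n_small, n_small_train, seed=0):
    """Split sample indices into a large train/validation split, then
    carve a small NAS subset out of the training part only, so NAS
    never sees the large validation set (no data leakage)."""
    rng = random.Random(seed)
    idx = list(range(n_total))
    rng.shuffle(idx)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    small = rng.sample(train, n_small)        # drawn from training data only
    return train, val, small[:n_small_train], small[n_small_train:]

# Scaled-down counterpart of the splits described in the text.
train, val, nas_train, nas_val = split_dataset(3000, 2400, 600, 100, 80)
```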
  • the model setting unit 813 optimizes the encoder-decoder model based on ResNet by tuning hyperparameters such as the number of layers and the number of channels. Finally, an optimal neural architecture (pre-learning model) with more than 100 hidden layers was obtained, much deeper than those used in previous studies.
  • FIG. 16 is a diagram showing an example of estimation of P-wave velocity (Vp) in the Marmousi2 geological structure model, including a comparison with a model based on the conventional ResNet50 (He, K., Zhang, X., Ren, S., and Sun, J. [2016] Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)).
  • (A) in FIG. 16 shows the ground truth.
  • (B) in FIG. 16 shows the estimation by the trained model generated by the embodiment.
  • (C) in FIG. 16 shows the estimation using the encoder-decoder model based on ResNet50.
  • (D) in FIG. 16 shows a one-dimensional (1D) profile at the 2 km position corresponding to the red line in (a) to (c).
  • (E) in FIG. 16 shows a 1D profile at the 11 km position corresponding to the blue line.
  • the result of the inverse problem corresponds to the inverse problem of the Marmousi2 geological structure model. Owing to the quality and quantity of the training dataset in this embodiment, the result (c) of the inverse problem with the conventional baseline ResNet50-based model roughly reproduced the velocity model (a). On the other hand, the result (b) of the inverse problem by the trained model generated by the present embodiment is clearer than the conventional result (c). For example, the trained model generated by this embodiment predicted the velocity in the rock salt layer (a depth of 4 km in (d) of FIG. 16) more accurately than the conventional result (c), and obtained a more detailed structure in the complicated area around the fault ((e) in FIG. 16).
  • FIG. 17 shows the results of modeling for the 1994 Amoco static correction test dataset.
  • (A) in FIG. 17 shows the ground truth.
  • FIG. 17 (b) shows the estimation by the trained model generated by the embodiment.
  • the trained model generated by this embodiment estimated high resolution and good output even though the training data did not contain information related to the 1994 Amoco static correction test dataset.
  • As described above, in the learning data generation device 3 according to the embodiment, a range of feature values is set for each of the plurality of feature quantities related to the underground structure; based on the ranges and random numbers, the feature values within the ranges are determined; models of the underground structure are generated using the determined values; seismic wave propagation simulations are performed on the models of the underground structure to generate simulated data that simulates the observed values related to the underground structure; and a plurality of learning data are generated by associating the plurality of models of the underground structure generated according to the random numbers with the plurality of simulated data generated according to those models.
  • the range is set based on the geological information in the area including the area to be investigated of the underground structure.
  • the geological information in the learning data generation device 3 includes at least one of logging data in the area, observation data on the underground structure in the area, and spatial distribution of the feature quantity inferred in the area.
  • the feature quantities include the depth of the erosion surface in the model of the underground structure, the degree of bending of the fault in the model of the underground structure, the unconformity of the strata, and the unconformity surface formed by the flow of the salt dome layer.
  • a normal fault or a reverse fault may be set as a feature quantity.
  • geological parameters are determined using random numbers within the ranges of geological parameters set based on geological knowledge and insight, and a large number of diverse underground structure models can be generated by using the determined geological parameters while geologically following the formation process of the underground structure. That is, according to the present learning data generation device 3, a large number of realistic underground structure models can be generated using random numbers with reference to the natural history of the underground structure, and a large amount of learning data can be generated. Since the large amount of training data generated by the training data generation device 3 contains, as teacher data, a large number of qualitatively good, that is, realistic, underground structure models, the generalization performance of the trained model can be improved.
  • the learning data generation device 3, as an example, displays the plurality of ranges corresponding to the plurality of feature quantities and two indicators for changing the upper limit value and the lower limit value of each of the plurality of ranges; generates a model for confirming the underground structure using representative values of the ranges specified by the two indicators; and displays the plurality of ranges corresponding to the plurality of feature quantities, the two indicators for each of the plurality of ranges, and the confirmation model.
  • in the learning data generation device 3, when the user inputs ranges to reflect geological knowledge and the like in the ranges of the geological parameters, the user can easily grasp the changes in the underground structure model caused by changing and adjusting the ranges. As a result, the user's operability regarding the generation of the underground structure model can be improved, and the quality of the training data can be further improved. Consequently, the generalization performance of the trained model can be improved.
  • In this application example, a generative adversarial network (GAN) is used together with observation data acquired by observing a structure, simulated data, and a loss function weighted according to the magnitude of the data values in the simulated data.
  • The improver is trained so that, based on the simulated data, it produces improvement data that is closer to the reality of the observation data; the simulated data is then input to the trained improver to generate improvement data, and training data is generated by associating the structural model with the improvement data.
  • In the following, the estimation model related to the structure (hereinafter referred to as the underground structure estimation model) is trained using the training data that includes the improvement data, and the underground structure estimation process using the trained underground structure estimation model is also explained.
  • the learning data generation unit 319 is provided in the processor 81 mounted on the learning device 7.
  • the processor in the estimation device includes the learning data generation unit 319, the learning unit 815, and an estimation unit that estimates the structure related to the observation data by inputting the observation data into the trained model.
  • FIG. 18 is a diagram showing an outline of the generation of training data, the training of the underground structure estimation model TUSEM using the training data, and the underground structure estimation processing using the trained underground structure estimation model TUSEM.
  • the Sim part shown in FIG. 18 corresponds to the learning data generation process in the embodiment. That is, Sim in FIG. 18 outlines the process of generating a plurality of underground structure models USM and generating, by wave propagation simulation WPS, a plurality of simulated data SD corresponding to the underground structure models USM. Since the processing contents in Sim shown in FIG. 18 are the same as those in the embodiment, their description is omitted. Further, Real in FIG. 18 shows the collected observation data (for example, shot data) OD used for carrying out the underground structure estimation process. The description of the acquisition of the observation data OD is omitted because it follows existing methods.
  • OD: collected observation data (for example, shot data)
  • S2R in FIG. 18 shows the process of training the improver RF with the generative adversarial network based on the simulated data SD and the observation data OD, and the process of converting the simulated data SD into the improvement data RD using the learned improver TRF.
  • the generative adversarial network in FIG. 18 has an improver (refiner) RF and a discriminator DCN to be trained. Further, in the training of the improver RF by the generative adversarial network, the output from the improver RF during training corresponds to the noise-added data NAD, in which realistic noise or the like has been added to the simulated data SD.
  • INV in FIG. 18 shows the process of training the model LOM to be learned using the improvement data RD and the underground structure model USM, and the underground structure estimation process of inputting the observation data OD into the learned underground structure estimation model TUSEM and outputting the estimated underground structure UGS.
  • the learning data generation unit 319 executes the wave propagation simulation WPS on the underground structure model USM.
  • the learning data generation unit 319 obtains the simulation result of the wave propagation simulation WPS and generates the simulated data SD corresponding to the underground structure model USM.
  • the learning data generation unit 319 generates a plurality of simulated data SDs corresponding to the plurality of underground structure model USMs by executing the above processing on the plurality of underground structure model USMs.
  • the learning data generation unit 319 adds randomly generated noise to each of the plurality of simulated data SDs.
  • the learning data generation unit 319 generates a plurality of simulated data to which random noise is added (hereinafter, referred to as noise-added simulated data).
  • the addition of noise is indicated, for example, by the arrow NA shown in FIG. 18.
  • the addition of random noise may be appropriately omitted in order to shorten the processing time in this application example.
  • the learning data generation unit 319 associates the plurality of underground structure model USMs with the plurality of noise-added simulated data by input / output to the wave propagation simulation WPS, and stores them in the memory. Further, the learning data generation unit 319 reads the observation data OD acquired by the existing acquisition device and the network before learning from the memory.
  • the total number of observation data OD may be smaller than, for example, the total number of noise-added simulated data, but it is desirable that the total number be at least a predetermined number in order to improve the generalization performance of the training of the improver RF.
  • the learning data generation unit 319 applies the observation data and the noise-added simulated data (or the simulated data when noise addition is not executed) to the network, and alternately trains the improver RF and the discriminator DCN.
  • the loss function used for learning the improver RF depends on the strength of the signal value in the simulated data SD, for example, in proportion to the magnitude of the signal value in the simulated data SD. Weighted to maintain that signal value.
  • the loss function peculiar to this application example will be described. Since existing techniques can be applied to other configurations and processes in the network, the description thereof will be omitted.
  • the improver loss function Loss L1 is, for example, the following equation (1).
  • Loss L1 = mean(abs(I_sim) * abs(I_sim - I_refine)) ... (1)
  • The right side of equation (1) indicates that the absolute value abs of the difference between the image I_sim and the image I_refine is multiplied, pixel by pixel, by the absolute value of the image I_sim, and the mean value mean is then calculated over the pixels of all the images.
  • Multiplying abs(I_sim - I_refine) by abs(I_sim) means that the absolute-difference image abs(I_sim - I_refine), which appears before the calculation of the mean value in the ordinary L1 loss mean(abs(I_sim - I_refine)), is weighted by abs(I_sim), that is, by the strength of the signal values in the simulated data SD.
  • The learning data generation unit 319 repeatedly trains the improver RF so that the improver loss function Loss L1 shown in equation (1) becomes smaller.
  • Thereby, the improver RF is trained so as to maintain the signal values in proportion to their magnitude in the simulated data SD.
  • The weight in the improver loss function Loss L1 is not limited to the above abs(I sim).
  • The weight in the improver loss function Loss L1 may be any function of abs(I sim) that is monotonically increasing in the broad sense (a broadly monotonically increasing function).
  • For example, the weight in the improver loss function Loss L1 may be a non-linear weight such as the square (abs(I sim))^2, or a weight provided with an upper limit a, such as min(a, abs(I sim)).
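The weighted loss of equation (1) and the alternative weights above can be sketched as follows. This is a minimal NumPy illustration; the function names, array shapes, and the cap value 0.5 are assumptions for the example, not part of the embodiment.

```python
import numpy as np

def improver_loss_l1(i_sim, i_out, weight_fn=lambda a: a):
    """Weighted L1 loss of equation (1): each pixel's absolute difference
    between the simulated image i_sim and the improver output i_out is
    multiplied by a weight derived from abs(i_sim), then averaged."""
    w = weight_fn(np.abs(i_sim))        # default weight: abs(I sim) itself
    return float(np.mean(w * np.abs(i_sim - i_out)))

# Alternative weights mentioned above (broadly monotonically increasing):
square_weight = lambda a: a ** 2                 # non-linear weight (abs(I sim))^2
capped_weight = lambda a: np.minimum(0.5, a)     # weight with upper limit a = 0.5
```

With the default weight, pixels carrying strong signal values in i_sim dominate the loss, which is what drives the improver to preserve strong reflections.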
  • The learning data generation unit 319 stores in the memory the improver RF trained by learning the network using the plurality of observation data and the plurality of noise-added simulated data (or the plurality of simulated data, when noise addition is not executed); this is hereinafter called the trained improver TRF.
  • Since the trained improver TRF reflects the observation data, it reflects the area where the observation data were acquired and the collection conditions under which the observation data were collected (for example, the observation equipment used to collect the observation data, characteristics introduced by the operator at the time of collection, characteristics of the survey company regarding the collection of the observation data, and so on). That is, the trained improver TRF is specialized for the observation data; in other words, it is automatically customized to and trained on the observation data.
  • The learning data generation unit 319 generates noise-added simulated data by adding randomly generated noise to the simulated data.
  • The learning data generation unit 319 reads the trained improver TRF from the memory, inputs the noise-added simulated data or the simulated data to it, and generates improvement data.
  • The noise-added simulated data input to the trained improver TRF may be the noise-added simulated data used for training the improver RF.
  • the learning data generation unit 319 associates the structural model with the improvement data and generates learning data. Specifically, the learning data generation unit 319 associates the underground structure model USM with the improvement data RD to generate learning data.
  • the learning data generation unit 319 generates a plurality of learning data by repeating the processes from the generation of the noise-added simulated data to the generation of the improved data RD for the plurality of simulated data, for example.
  • the learning data generation unit 319 stores the generated plurality of learning data in the memory.
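The flow just described (add random noise to each simulated data SD, pass it through the trained improver TRF, and pair the result with its underground structure model USM) can be sketched as follows. This is a hedged illustration: `trained_improver` is a hypothetical identity stand-in for the actual learned network, and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def trained_improver(x):
    # Hypothetical stand-in for the trained improver TRF; the real TRF is a
    # learned network that adds observation-like realism to its input.
    return x

def make_learning_data(structure_models, simulated_data, noise_std=0.1):
    """Pair each underground structure model USM (correct answer) with the
    improvement data RD derived from its simulated data SD."""
    learning_data = []
    for usm, sd in zip(structure_models, simulated_data):
        noisy = sd + rng.normal(0.0, noise_std, size=sd.shape)  # noise-added simulated data
        rd = trained_improver(noisy)                            # improvement data RD
        learning_data.append((usm, rd))
    return learning_data
```

Each element of the returned list is one learning datum: the structure model as the correct answer and the improvement data as the input.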
  • The learning unit 815 generates the underground structure estimation model TUSEM by training the model LOM to be learned using each of the plurality of learning data. Since a known method can be appropriately used for training the model LOM using a plurality of learning data, its description is omitted.
  • For the training of the underground structure estimation model TUSEM, noise-added simulated data in which the simulated data SD, which are the simulation results, are given a sense of reality by the observation data OD are used. Therefore, as the underground structure estimation model TUSEM, a network (trained model) capable of inversion for underground structure estimation with respect to the observation data OD, which are actual data, is obtained.
  • the estimation unit estimates the underground structure UGS by inputting the observation data OD into the underground structure estimation model TUSEM.
  • the estimated underground structure is stored in memory.
  • the estimated underground structure may be displayed on the display.
  • The observation data are assumed to be shot data related to the underground structure, acquired in the area related to the estimation of the underground structure.
  • FIG. 19 is a flowchart showing an example of the procedure of a series of processes: generation of training data, generation of an underground structure estimation model using the training data, and underground structure estimation processing using the generated underground structure estimation model (hereinafter, model generation estimation processing).
  • The learning data generation unit 319 acquires a plurality of shot data OD. For example, the learning data generation unit 319 acquires the shot data OD from the acquisition device, from a server device in which the shot data OD are stored, or from a storage medium in which the shot data OD are stored. The learning data generation unit 319 stores the acquired shot data OD in the memory.
  • Step S192 The learning data generation unit 319 reads a plurality of simulated data SDs from the memory.
  • the learning data generation unit 319 adds random noise to each of the plurality of simulated data SDs, and generates noise-added simulated data.
  • the learning data generation unit 319 stores the generated noise-added simulated data in the memory. This step may be omitted in order to shorten the processing time in the model generation estimation process.
  • Step S193 The learning data generation unit 319 trains the improver RF together with the discriminator DCN using the noise-added simulated data and the plurality of shot data, and generates the trained improver TRF. Specifically, the learning data generation unit 319 inputs the noise-added simulated data to the improver RF and obtains the noise-added data NAD output from the improver RF. The learning data generation unit 319 then calculates the improver loss function Loss L1 based on the image corresponding to the noise-added simulated data for the simulated data SD and the image corresponding to the noise-added data NAD.
  • The learning data generation unit 319 trains the improver RF by, for example, an error backpropagation method so as to reduce the improver loss function Loss L1.
  • The learning data generation unit 319 also trains the discriminator DCN.
  • The learning data generation unit 319 trains the improver RF and the discriminator DCN by repeating these processes over each of the plurality of noise-added simulated data and each of the plurality of shot data.
  • The learning data generation unit 319 thereby generates the trained improver TRF.
  • The training of the improver RF and the discriminator DCN may be realized by the learning unit 815 in the learning device.
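The alternating scheme of step S193 can be illustrated with a deliberately simplified toy: the "improver" below is just a single learned offset and the "discriminator" merely tracks a statistic of the real shot data. This is a structural sketch of alternating adversarial updates only, not the networks of the embodiment; every name and the learning-rate values are assumptions.

```python
import numpy as np

def train_improver_adversarially(observed, simulated, steps=200, lr=0.05):
    """Alternate two updates: the discriminator side re-estimates what 'real'
    data look like, and the improver side shifts its output toward that."""
    offset = 0.0                       # the toy improver's single parameter
    target = 0.0                       # the toy discriminator's notion of 'real'
    for _ in range(steps):
        # Discriminator step: track the statistic of the observed (real) data.
        target = 0.9 * target + 0.1 * float(observed.mean())
        # Improver step: gradient of the squared gap between the improved
        # output's statistic and the discriminator's current target.
        fake_mean = float(simulated.mean()) + offset
        offset -= lr * 2.0 * (fake_mean - target)
    return offset
```

The point of the sketch is the loop structure: each iteration updates the discriminator side first, then the improver side, exactly as the two learnings alternate in the text.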
  • the learning data generation unit 319 inputs each of the plurality of noise-added simulated data to the trained improver TRF, and generates a plurality of improvement data. Specifically, the learning data generation unit 319 generates a plurality of noise-added simulated data by adding randomly generated noise to each of the plurality of simulated data.
  • the learning data generation unit 319 associates a plurality of underground structure models with a plurality of improvement data to generate a plurality of learning data.
  • The learning data generation unit 319 stores the plurality of learning data in the memory.
  • Step S196 The learning unit 815 generates an underground structure estimation model TUSEM using a plurality of learning data. That is, the learning unit 815 generates the underground structure estimation model TUSEM by learning the model LOM to be learned over a plurality of learning data. The learning unit 815 stores the underground structure estimation model TUSEM in the memory.
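Step S196 uses a known supervised method. As a minimal, hedged illustration of "training a model over (improvement data, structure model) pairs", one can fit a linear least-squares stand-in; the actual model LOM is a deep network, and every name below is illustrative only.

```python
import numpy as np

def train_estimation_model(learning_data):
    """Fit a linear map from improvement data RD to structure model USM by
    least squares, as a stand-in for training the model LOM."""
    X = np.stack([rd.ravel() for usm, rd in learning_data])   # inputs
    Y = np.stack([usm.ravel() for usm, rd in learning_data])  # correct answers
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def estimate_structure(W, shot_data):
    """Apply the fitted model to (shot) data, as in step S197."""
    return shot_data.ravel() @ W
```

The same two-step shape (fit on pairs, then apply to new observation data) is what steps S196 and S197 perform with the deep model.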
  • Step S197 The estimation unit inputs each of a plurality of shot data into the underground structure estimation model TUSEM and estimates the underground structure UGS.
  • the estimation unit stores the estimated underground structure in the memory.
  • the estimation unit may display the estimated underground structure UGS on the display.
  • As described above, the learning data generation device 3 adversarially trains the improver RF using the observation data OD acquired by observing the structure, the simulated data SD, and the loss function Loss L1 weighted according to the magnitude of the data values in the simulated data SD, so that improvement data are generated based on the simulated data SD; the improver RF thereby becomes the trained improver TRF.
  • That is, the improver RF is trained using the improver loss function Loss L1 weighted in proportion to the magnitude of the signal values in the simulated data SD.
  • Thereby, the trained improver TRF can be generated so as to maintain the signal values in the simulated data SD.
  • Further, according to the learning data generation device 3, since the improver RF is trained using the observation data OD, fluctuation factors such as noise due to the collection conditions of the observation data OD can also be learned in the training process of the improver RF. Therefore, according to the present learning data generation device 3, it is possible to generate improvement data RD in which the simulated data SD are brought close to the reality of the observation data OD. In other words, the learning data generation device 3 can generate more realistic learning data by generating improvement data in which reality is added to the simulated data SD, which are the simulation results for the underground structure model USM.
  • Further, the learning data generation device 3 may train the improver RF using the noise-added simulated data based on the simulated data SD, the observation data OD, and the improver loss function Loss L1.
  • Thereby, the improver RF and the discriminator DCN can be trained effectively with respect to random noise in the observation data OD.
  • That is, the improver RF can be trained so that more realistic improvement data can be generated.
  • According to the present learning data generation device 3, since more realistic improvement data can be generated, the quality of the learning data can be further improved.
  • This learning device trains, by the learning unit 815, the model LOM to be learned using the learning data having the improvement data generated with the trained improver TRF.
  • Thereby, the underground structure estimation model TUSEM is generated by training the model LOM to be learned using the highly realistic improvement data, with the underground structure model USM as the correct answer data. From the above, according to this learning device, it is possible to obtain an underground structure estimation model TUSEM capable of outputting a more reliable underground structure UGS with respect to the observation data OD.
  • In this estimation device, the underground structure UGS is estimated by inputting the observation data OD used for training the improver RF into the underground structure estimation model TUSEM. From the above, according to this estimation device, a highly realistic underground structure UGS can be estimated by using the underground structure estimation model TUSEM generated using highly realistic learning data.
  • According to the learning data generation device 3 of the present embodiment as described above, it is possible to generate learning data related to the structure.
  • The training data generation method is a training data generation method executed using at least one processor, and includes: generating a model of a structure based on a plurality of feature quantities related to the structure; generating, by a wave propagation simulation for the model of the structure, simulated data that simulate observed values related to the structure; and generating training data by associating the generated model of the structure with the simulated data. Since the processing procedure corresponding to the training data generation method corresponds to the procedure of the training data generation processing, its description is omitted. Further, since the effects of the training data generation method are the same as those of the embodiment, their description is omitted.
  • When the technical features of the present embodiment are realized as a model generation method, the model generation method generates an estimation model that estimates information about the structure by using the training data generated by the above training data generation method. Since the processing procedure related to the model generation method corresponds to the processing procedure in the learning unit 815 and the like, its description is omitted. Further, since the effects of the model generation method are the same as those of the embodiment, their description is omitted.
  • each device in the above-described embodiment may be configured by hardware, or may be configured by information processing of software (program) executed by a CPU, GPU, or the like.
  • Software that realizes at least a part of the functions of each device in the above-described embodiment may be stored in a non-temporary storage medium (non-transitory computer-readable medium) such as a flexible disk, a CD-ROM (Compact Disc-Read Only Memory), or a USB memory, and the information processing of the software may be executed by causing the computer 30 to read it. Further, the software may be downloaded via the communication network 5. Furthermore, the information processing may be executed by hardware by implementing the software in a circuit such as an ASIC or an FPGA.
  • the type of storage medium that stores the software is not limited.
  • the storage medium is not limited to a removable one such as a magnetic disk or an optical disk, and may be a fixed storage medium such as a hard disk or a memory. Further, the storage medium may be provided inside the computer or may be provided outside the computer.
  • When the expression "at least one of a, b and c" or "at least one of a, b or c" (including similar expressions) is used, it includes any of a, b, c, a-b, a-c, b-c, and a-b-c. It may also include a plurality of instances of any of the elements, such as a-a, a-b-b, and a-a-b-b-c-c. Furthermore, it also includes adding an element other than the listed elements (a, b and c), such as having d, as in a-b-c-d.
  • When the terms "connected" and "coupled" are used, they are intended as non-limiting terms that include any of direct connection/coupling, indirect connection/coupling, electrical connection/coupling, communicative connection/coupling, functional connection/coupling, physical connection/coupling, and the like.
  • The terms should be interpreted as appropriate according to the context in which they are used, but any connection/coupling form that is not intentionally or naturally excluded should not be interpreted as excluded from the terms in a limiting manner.
  • The expression that an element A is configured to execute an operation B may include that the physical structure of the element A has a configuration capable of executing the operation B, and that a permanent or temporary setting (setting/configuration) of the element A is configured/set so as to actually execute the operation B.
  • For example, when the element A is a general-purpose processor, it suffices that the processor has a hardware configuration capable of executing the operation B and is configured to actually execute the operation B by a permanent or temporary program (instructions).
  • Further, when the element A is a dedicated processor, a dedicated arithmetic circuit, or the like, it suffices that the circuit structure of the processor is constructed so as to actually execute the operation B, regardless of whether control instructions and data are actually attached.
  • A term concerning optimization includes finding the global optimal value, finding an approximate value of the global optimal value, finding a local optimal value, and finding an approximate value of a local optimal value, and should be interpreted as appropriate according to the context in which the term is used. It also includes probabilistically or heuristically finding approximate values of these optimal values.
  • When a plurality of pieces of hardware perform a predetermined process, the respective pieces of hardware may cooperate to perform the predetermined process, or some of the hardware may perform all of the predetermined process. Further, some of the hardware may perform a part of the predetermined process, and other hardware may perform the rest of the predetermined process.
  • When an expression such as "one or more pieces of hardware perform a first process and the one or more pieces of hardware perform a second process" is used, the hardware that performs the first process and the hardware that performs the second process may be the same or different. That is, the hardware that performs the first process and the hardware that performs the second process may each be included in the one or more pieces of hardware.
  • the hardware may include an electronic circuit or a device including the electronic circuit.
  • Each storage device (memory) among the plurality of storage devices (memories) may store only a part of the data, or may store the entirety of the data.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Theoretical Computer Science (AREA)
  • Geophysics (AREA)
  • General Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Geology (AREA)
  • Environmental & Geological Engineering (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

A learning data generation method according to one embodiment of the present invention is implemented by using at least one processor, and involves: generating a model of a structure on the basis of multiple feature quantities pertaining to the structure; generating, through a wave propagation simulation of the model of the structure, simulated data in which measured values pertaining to the structure are expressed in a simulated manner; and generating learning data by associating the generated model of the structure and the simulated data.

Description

Learning data generation method, model generation method, and learning data generation device
The embodiment of the present invention relates to a learning data generation method, a model generation method, and a learning data generation device.
Inferring the spatial distribution of physical property values in the subsurface from observation data such as seismic waveforms corresponds to inversion of seismic waveforms (the seismic inverse problem: seismic inversion). When solving the seismic inverse problem, seismic wave propagation simulations, which require specialized knowledge, are needed many times during the inversion process. In recent years, researchers have begun to implement deep learning methods that estimate physical properties such as subsurface velocity directly from seismic record data. For example, by inputting observation data acquired by a seismic survey into a trained deep neural network (DNN), the spatial distribution of physical property values in the subsurface is inferred from the observation data. This approach reduces the time required for the seismic inverse problem.
However, when a model such as a DNN is used for subsurface exploration based on seismic waveforms, for example, there is a problem in that it is difficult to prepare a large amount of data for model training, because the actual underground structure cannot be grasped.
The problem to be solved by the invention is to generate learning data related to a structure.
The learning data generation method according to the embodiment is a learning data generation method executed using at least one processor, and includes: generating a model of a structure based on a plurality of feature quantities related to the structure; generating, by a wave propagation simulation for the model of the structure, simulated data that simulate observed values related to the structure; and generating learning data by associating the generated model of the structure with the simulated data.
FIG. 1 is a block diagram showing an example of the hardware configuration of a learning system having a learning data generation device according to the embodiment.
FIG. 2 is a diagram showing an example of functional blocks in a processor according to the embodiment.
FIG. 3 is a diagram showing an example of a generation area before generation of an underground structure model according to the embodiment.
FIG. 4 is a diagram showing an example of a generation area in which a plurality of strata are deposited according to the embodiment.
FIG. 5 is a diagram showing an example of a generation area in which a plurality of strata are folded according to the embodiment.
FIG. 6 is a diagram showing an example of a generation area in which faults are generated in a plurality of strata according to the embodiment.
FIG. 7 is a diagram showing an example of a generation area in which strata shallower than an unconformity surface have been eroded according to the embodiment.
FIG. 8 is a diagram showing an example of a generation area in which a plurality of strata are re-deposited in the eroded area according to the embodiment.
FIG. 9 is a diagram showing an example of a generation area in which a plurality of strata are folded and a plurality of faults are then generated according to the embodiment.
FIG. 10 is a diagram showing an example of a generation area in which folds and faults are generated in a plurality of strata according to the embodiment.
FIG. 11 is a diagram showing an example of a generation area into which rock salt has intruded according to the embodiment.
FIG. 12 is a diagram showing an example of an image (shot image) of common shot gather data according to the embodiment.
FIG. 13 is a flowchart showing an example of the procedure of learning data generation processing according to the embodiment.
FIG. 14 is a diagram showing an example of a P-wave velocity (Vp) model generated by a parameter generation system according to the embodiment.
FIG. 15 is a diagram showing an example of functional blocks in a processor mounted on a learning device according to the embodiment.
FIG. 16 is a diagram showing an example of estimation of P-wave velocity (Vp) in the Marmousi2 geological structure model according to the embodiment.
FIG. 17 is a diagram showing an example of estimation of P-wave velocity (Vp) in the 1994 Amoco statics test dataset according to the embodiment.
FIG. 18 is a diagram showing an outline of generation of training data, generation of an underground structure estimation model using the training data, and underground structure estimation processing using the generated underground structure estimation model, according to an application example of the embodiment.
FIG. 19 is a flowchart showing an example of a procedure of generating training data, generating an underground structure estimation model using the training data, and model generation estimation processing, according to an application example of the embodiment.
Hereinafter, embodiments relating to the learning data generation method, the model generation method, and the learning data generation device will be described in detail with reference to the drawings. The learning data generation method is executed using, for example, at least one processor.
(Embodiment)
FIG. 1 is a block diagram showing an example of the hardware configuration of a learning system 1 having a learning data generation device 3 according to the present embodiment. As shown in FIG. 1, the learning system 1 includes the learning data generation device 3, a learning device 7 connected to the learning data generation device 3 via a communication network 5, an external device 9A connected to the learning data generation device 3 via the communication network 5, and an external device 9B connected via a device interface 39. The learning system 1 generates a plurality of learning data by the learning data generation device 3. The learning system 1 trains a deep neural network to be trained using the plurality of generated learning data, thereby generating a trained model.
The trained model is, for example, a model that outputs sound waves, electromagnetic waves, radiation, or the like toward an observation target and estimates the structure of the observation target based on reflected waves that have propagated inside the observation target. The structure is, for example, the internal structure of the observation target. Observation targets include the underground, artificial structures such as pillars and bridges, clouds, and living bodies. The trained model is applicable, for example, to non-destructive inspection, ultrasonic diagnosis of structures, echo examination, submarine sonar, remote sensing, and the like. Hereinafter, to make the description concrete, the observation target is assumed to be an underground structure. In this case, the trained model is a model that receives seismic waves (elastic waves), electromagnetics, or radiation as input and outputs the underground structure of the observation target. For example, when seismic waves are used as input, the trained model is used for seismic exploration. Further, for example, when electromagnetic waves (electromagnetic fields) are used as input, the trained model is used for electromagnetic exploration. To be more specific, the trained model will be described as being used for seismic exploration. In this case, the plurality of learning data correspond to reflected waves that have propagated inside the observation target due to seismic waves radiated toward the underground observation target.
The learning data generation device 3 has a computer 30 and the external device 9B connected to the computer 30 via the device interface 39. The learning device 7 may also be connected to the computer 30 via the device interface 39. As an example, the computer 30 includes a processor 31, a main storage device (memory) 33, an auxiliary storage device (memory) 35, a network interface 37, and the device interface 39. The learning data generation device 3 may be realized as the computer 30 in which the processor 31, the main storage device 33, the auxiliary storage device 35, the network interface 37, and the device interface 39 are connected via a bus 41. The computer 30 may be mounted on the learning device 7.
The computer 30 shown in FIG. 1 includes one of each component, but may include a plurality of the same component. Further, although one computer 30 is shown in FIG. 1, the software may be installed on a plurality of computers, and each of the plurality of computers may execute the same or a different part of the processing of the software. In this case, a form of distributed computing may be used in which each computer communicates via the network interface 37 or the like to execute the processing. That is, the learning data generation device 3 in the present embodiment may be configured as a system in which one or a plurality of computers execute instructions stored in one or a plurality of storage devices to realize the various functions described later. Further, information transmitted from a terminal may be processed by one or a plurality of computers provided on a cloud, and the processing result may be transmitted to a terminal such as a display device (display unit) corresponding to the external device 9B. The display device is realized by, for example, various displays.
The various operations of the learning data generation device 3 in the present embodiment may be executed in parallel by using one or a plurality of processors, or by using a plurality of computers via a network. Further, the various operations may be distributed to a plurality of arithmetic cores in a processor and executed in parallel. In addition, a part or all of the processes, means, and the like of the present disclosure may be executed by at least one of a processor and a storage device provided on a cloud capable of communicating with the computer 30 via a network. As described above, the various operations described later in the present embodiment may take the form of parallel computing by one or a plurality of computers.
The processor 31 may be an electronic circuit including a control device and an arithmetic device of the computer 30 (a processing circuit, processing circuitry, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field Programmable Gate Array), an ASIC (Application Specific Integrated Circuit), or the like). The processor 31 may also be a semiconductor device or the like including a dedicated processing circuit. The processor 31 is not limited to an electronic circuit using electronic logic elements, and may be realized by an optical circuit using optical logic elements. The processor 31 may also include an arithmetic function based on quantum computing.
The processor 31 can perform arithmetic processing based on data and software (programs) input from the devices of the internal configuration of the computer 30, and can output arithmetic results and control signals to those devices. The processor 31 may control the components constituting the computer 30 by executing the OS (Operating System) of the computer 30, applications, and the like.
The learning data generation device 3 in the present embodiment may be realized by one or more processors 31. Here, the processor 31 may refer to one or more electronic circuits arranged on a single chip, or to one or more electronic circuits arranged on two or more chips or two or more devices. When a plurality of electronic circuits are used, the electronic circuits may communicate with each other by wire or wirelessly.
The main storage device 33 is a storage device that stores instructions executed by the processor 31, various data, and the like; the information stored in the main storage device 33 is read out by the processor 31. The auxiliary storage device 35 is a storage device other than the main storage device 33. These storage devices mean any electronic component capable of storing electronic information, and may be semiconductor memories. A semiconductor memory may be either a volatile memory or a nonvolatile memory. The storage device for storing the various data used in the learning data generation device 3 of the present embodiment may be realized by the main storage device 33 or the auxiliary storage device 35, or by a memory built into the processor 31. For example, the storage unit in the present embodiment may be realized by the main storage device 33 or the auxiliary storage device 35.
A plurality of processors, or a single processor 31, may be connected (coupled) to one storage device (memory). A plurality of storage devices (memories) may be connected (coupled) to one processor. When the learning data generation device 3 of the present embodiment is composed of at least one storage device (memory) and a plurality of processors connected (coupled) to it, the configuration may include at least one of the plurality of processors being connected (coupled) to at least one storage device (memory). This configuration may also be realized by storage devices (memories) and processors 31 included in a plurality of computers. Furthermore, a configuration in which the storage device (memory) is integrated with the processor 31 (for example, a cache memory including an L1 cache and an L2 cache) may be included.
The network interface 37 is an interface for connecting to the communication network 5 wirelessly or by wire. As the network interface 37, an appropriate interface, such as one conforming to an existing communication standard, may be used. Through the network interface 37, information may be exchanged with the learning device 7 and the external device 9A connected via the communication network 5. The communication network 5 may be any of a WAN (Wide Area Network), a LAN (Local Area Network), and a PAN (Personal Area Network), or a combination thereof, as long as information can be exchanged between the computer 30 and the external device 9A. Examples of a WAN include the Internet, examples of a LAN include IEEE 802.11 and Ethernet (registered trademark), and examples of a PAN include Bluetooth (registered trademark) and NFC (Near Field Communication).
The device interface 39 is an interface, such as a USB (Universal Serial Bus), that directly connects to an output device such as a display device, an input device (input unit), and the external device 9B. The output device may include a speaker or the like that outputs sound.
The external device 9A is a device connected to the computer 30 via a network. The external device 9B is a device directly connected to the computer 30.
The external device 9A or 9B may be, for example, an input device. The input device is, for example, a camera, a microphone, a motion-capture device, various sensors, a keyboard, a mouse, or a touch panel, and provides acquired information to the computer 30. The external device 9A or 9B may also be a device including an input unit, a memory, and a processor, such as a personal computer, a tablet terminal, or a smartphone.
The external device 9A or 9B may also be, for example, an output device (output unit). The output device may be a display device (display unit) such as an LCD (Liquid Crystal Display), a CRT (Cathode Ray Tube), a PDP (Plasma Display Panel), or an organic EL (Electro Luminescence) panel, or may be a speaker or the like that outputs sound. The external device 9A or 9B may also be a device including an output unit, a memory, and a processor, such as a personal computer, a tablet terminal, or a smartphone.
The external device 9A or 9B may also be a storage device (memory). For example, the external device 9A may be a network storage or the like, and the external device 9B may be a storage such as an HDD.
The external device 9A or 9B may also be a device having some of the functions of the components of the learning data generation device 3 in the present embodiment. That is, the computer 30 may transmit or receive part or all of the processing results of the external device 9A or 9B.
FIG. 2 is a diagram showing an example of the functional blocks of the processor 31. As functions realized by the processor 31, the processor 31 has a setting unit 311, a determination unit 313, a model generation unit 315, a simulated data generation unit 317, and a learning data generation unit 319. The functions realized by the setting unit 311, the determination unit 313, the model generation unit 315, the simulated data generation unit 317, and the learning data generation unit 319 are each stored as programs in, for example, the main storage device 33 or the auxiliary storage device 35. The processor 31 reads out and executes the programs stored in the main storage device 33, the auxiliary storage device 35, or the like, thereby realizing the functions of the setting unit 311, the determination unit 313, the model generation unit 315, the simulated data generation unit 317, and the learning data generation unit 319.
The setting unit 311 sets, for each of a plurality of feature quantities related to the underground structure, a range of feature values (hereinafter referred to as a feature range). The feature quantities are, for example, geological parameters. Geological structural elements described using geological parameters include deposition, folding, faulting, erosion, redeposition, refolding, refaulting, rock salt intrusion, lateral bending of strata, placement of unconformity surfaces, the structure size (width, depth) for modeling the underground structure, insertion of a low-velocity layer into the surface layer, and the distribution of layer thicknesses. Geological parameters include, for example, the thickness of a deposited stratum, the amplitude and wavelength of the folding of the stratum (or the number of oscillations of the stratum per unit length), the angle of a fault with respect to the horizontal, the length of a fault, the degree of bending of a fault (the degree of lateral (horizontal) bending of the fault), depth-dependent amplification of fault displacement, P-wave velocity, the ratio of P-wave velocity to S-wave velocity, the distribution position of a salt dome, the P-wave velocity of a salt dome, the amount of erosion of strata, the depth of an unconformity surface, and the depth dependence of rock velocity. The setting unit 311 can also make settings for features that do not co-occur geologically. For example, for the fault-related feature quantities, the setting unit 311 sets either a normal fault or a reverse fault.
Specifically, the setting unit 311 sets a plurality of feature ranges for the plurality of feature quantities according to default settings stored in the main storage device 33 or the auxiliary storage device 35. The default settings are, for example, a combination of feature ranges covering all stratum patterns. The default settings are not limited to one; a plurality of default settings may be prepared for a plurality of regions. In that case, each default setting is set in advance according to geological information on the region including the area to be surveyed for underground structure (hereinafter referred to as the survey area). Geological information is various data acquired in the past in that region, for example, at least one of logging data from wells drilled in the region, observation data on the underground structure in the region, and the spatial distribution of feature quantities inferred for the region. As a default setting, the setting unit 311 also sets a region (hereinafter referred to as a generation region) in which a model of the underground structure (hereinafter referred to as an underground structure model) is to be generated by the model generation unit 315. The generation region is, for example, a region schematically defined by a horizontal length and a depth.
The setting unit 311 causes the display device to display the feature ranges set by the default settings. Specifically, the display device displays the plurality of ranges corresponding to the plurality of feature quantities and, for each range, two indicators for changing its upper and lower limits. The setting unit 311 may change a feature range as appropriate in accordance with a user instruction via the input device. In addition, the setting unit 311 may display a radio button that lets the user choose between a normal fault and a reverse fault. In this case, the setting unit 311 sets the normal fault or the reverse fault in accordance with the user's instruction via the radio button.
The setting unit 311 may also cause the display device to display the feature ranges set by the default settings together with a model of the underground structure generated by the model generation unit 315 using, for example, representative values of those feature ranges (hereinafter referred to as a confirmation model). Specifically, the display device displays the plurality of ranges corresponding to the plurality of feature quantities, the two indicators for each range, and the confirmation model. The setting unit 311 displays the confirmation model on the display device using, for example, a predetermined hue that changes from blue to yellow along the depth direction according to the magnitude of the P-wave velocity of each stratum in the confirmation model. When a feature range is adjusted by the user moving the two indicators, the setting unit 311 causes the display device to display, together with the changed feature range, a confirmation model regenerated using the representative values changed by the adjustment.
The determination unit 313 determines, for the plurality of feature quantities, model generation parameters, which are the information necessary for generating the underground structure model, based on the feature ranges and random numbers. For example, the determination unit 313 determines, as a model generation parameter for each feature quantity, a value within the feature range based on the feature range and a random number. Specifically, when the setting of the feature ranges is confirmed by the setting unit 311, the determination unit 313 determines an identifier (hereinafter, underground model ID) for the model of the underground structure generated using the model generation parameters within the feature ranges (hereinafter referred to as the underground structure model). The determination unit 313 generates random numbers using the underground model ID as the random seed. For each of the plurality of feature quantities, the determination unit 313 determines a model generation parameter within the set feature range based on that range and the generated random numbers. Specifically, the determination unit 313 sets a probability distribution over the feature range and applies the random numbers to the set probability distribution to determine the model generation parameter within the feature range. The probability distribution is, for example, a uniform distribution, but is not limited to this; another probability distribution such as a Poisson distribution may be used. The probability distribution may be set by, for example, the setting unit 311.
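As a minimal sketch of the seeded sampling described above (not taken from the publication; the feature names, ranges, and the choice of NumPy are illustrative assumptions), the model generation parameters could be drawn as follows:

```python
import numpy as np

def decide_model_parameters(model_id: int, feature_ranges: dict) -> dict:
    """Draw one model generation parameter per feature from its feature range.

    The underground model ID is used as the random seed, so the same ID
    always reproduces the same parameter set.
    """
    rng = np.random.default_rng(model_id)  # model ID doubles as the seed
    params = {}
    for name, (low, high) in feature_ranges.items():
        # Uniform distribution over the feature range; another distribution
        # (e.g. Poisson) could be substituted here.
        params[name] = rng.uniform(low, high)
    return params

# assumed example feature ranges
ranges = {"layer_thickness_m": (50.0, 300.0),
          "fold_amplitude_m": (0.0, 200.0),
          "fault_dip_deg": (30.0, 70.0)}
p1 = decide_model_parameters(42, ranges)
p2 = decide_model_parameters(42, ranges)
```

Because the seed is the model ID, `p1` and `p2` are identical, which is what allows an underground structure model to be regenerated later from its ID alone.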
The model generation unit 315 generates a model of the structure based on the plurality of feature quantities related to the structure. Specifically, the model generation unit 315 generates an underground structure model using the model generation parameters determined by the determination unit 313. The model generation unit 315 also generates the confirmation model of the underground structure using the representative values of the ranges specified by the two indicators. FIGS. 3 to 11 are diagrams showing an example of the process by which the model generation unit 315 generates an underground structure model. The generation region 10 in FIGS. 3 to 11 shows a region in which a two-dimensional underground structure model is generated, but the model generation unit 315 may generate a three-dimensional underground structure model. The model generation unit 315 may also generate a plurality of structure models using different feature quantities.
FIG. 3 is a diagram showing an example of the generation region 10 before generation of the underground structure model. As shown in FIG. 3, the model generation unit 315 generates a generation region 10 whose feature values are filled with zeros. The legend 11 shown in FIG. 3 indicates P-wave velocity. The upper edge of the generation region 10 shown in FIG. 3 indicates the ground surface.
(Deposition)
The model generation unit 315 deposits a plurality of strata in the generation region 10 using the determined model generation parameters. For example, the model generation unit 315 arranges a plurality of strata in the generation region 10 so as to deposit them there, using the thickness of each stratum, the P-wave velocity of each stratum, and the like. FIG. 4 is a diagram showing an example of the generation region 10 in which a plurality of strata have been deposited. As shown in FIG. 4, a plurality of strata with different P-wave velocities are deposited in the generation region 10.
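The deposition step can be sketched as filling the zero-initialized generation region from the top down with constant-velocity layers. This is an illustrative simplification (not the publication's code); the array orientation (depth × horizontal position) and the units are assumptions:

```python
import numpy as np

def deposit_layers(shape, thicknesses, velocities):
    """Fill a zero-initialized generation region from the top down with
    horizontal strata of the given thicknesses (in cells) and P-wave
    velocities (in m/s)."""
    region = np.zeros(shape)          # depth x horizontal, filled with zeros
    z = 0
    for t, v in zip(thicknesses, velocities):
        region[z:z + t, :] = v        # one constant-velocity stratum
        z += t
    return region

# three strata with increasing P-wave velocity, as in FIG. 4
region = deposit_layers((100, 250), [30, 40, 30], [1500.0, 2500.0, 3500.0])
```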
(Folding)
In the generation region 10 in which the plurality of strata have been deposited, the model generation unit 315 folds the strata using the determined model generation parameters. For example, the model generation unit 315 folds the strata in the generation region 10 using the fold wavelength, the fold amplitude, and the like. FIG. 5 is a diagram showing an example of the generation region 10 in which the strata have been folded. As shown in FIG. 5, the fold amplitude can also be varied with depth; for example, the fold amplitude increases in proportion to shallowness. As shown in FIG. 5, folding corresponds to lifting the strata vertically upward. For this reason, the deepest part of the generation region 10 is filled with the pre-folding stratum.
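One way to realize this folding, sketched here under assumed conventions (a sinusoidal vertical shift whose amplitude decays linearly with depth, with the deepest rows refilled from the pre-fold bottom stratum via clipping), is:

```python
import numpy as np

def fold(region, wavelength_cells, amplitude_cells):
    """Fold horizontal strata by shifting each column vertically by a
    sinusoid. The amplitude is largest at the surface and zero at the
    bottom; cells pulled up from below the model are refilled with the
    deepest pre-fold value (via index clipping)."""
    nz, nx = region.shape
    folded = np.empty_like(region)
    x = np.arange(nx)
    for iz in range(nz):
        # amplitude proportional to shallowness: max at surface, 0 at bottom
        amp = amplitude_cells * (1.0 - iz / (nz - 1))
        shift = amp * np.sin(2.0 * np.pi * x / wavelength_cells)
        src = np.clip(np.round(iz + shift).astype(int), 0, nz - 1)
        folded[iz, x] = region[src, x]   # pull values up from deeper rows
    return folded

region = np.zeros((60, 100))
region[:20] = 1500.0
region[20:40] = 2500.0
region[40:] = 3500.0
folded = fold(region, wavelength_cells=50.0, amplitude_cells=6.0)
```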
(Faulting)
In the generation region 10 having the folded strata, the model generation unit 315 generates a fault across the strata using the determined model generation parameters. For example, the model generation unit 315 forms a fault in the strata using the fault position, the fault angle, the fault displacement, the degree of fault bending, and the like. FIG. 6 is a diagram showing an example of the generation region 10 in which a fault has been generated across the strata. In practice, either normal faults or reverse faults are often consistently present within a single region. Therefore, according to the user setting or the default setting, the model generation unit 315 generates either a normal fault or a reverse fault, as shown in FIG. 6.
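A minimal faulting step can be sketched as displacing one side of a planar fault by a fixed throw; the sign conventions (which side moves, which direction the plane dips) are assumptions for illustration and are not taken from the publication:

```python
import numpy as np

def apply_fault(region, x0, dip_deg, throw_cells, fill=0.0):
    """Displace the block on the right of a planar fault downward by
    `throw_cells`. The fault plane crosses the surface at column `x0`
    and dips at `dip_deg` from the horizontal; rows exposed at the top
    are filled with `fill`."""
    nz, nx = region.shape
    faulted = region.copy()
    slope = 1.0 / np.tan(np.radians(dip_deg))  # horizontal run per cell of depth
    for iz in range(nz):
        xf = int(round(x0 + slope * iz))       # fault trace at this depth
        if xf >= nx:
            break
        src_z = iz - throw_cells               # values come from shallower rows
        if src_z >= 0:
            faulted[iz, xf:] = region[src_z, xf:]
        else:
            faulted[iz, xf:] = fill            # exposed at the surface
    return faulted

region = np.zeros((50, 100))
region[:25] = 2000.0
region[25:] = 3000.0
faulted = apply_fault(region, x0=40, dip_deg=60.0, throw_cells=5)
```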
(Erosion)
In the generation region 10 having the faulted strata, the model generation unit 315 executes erosion processing using the determined model generation parameters. For example, the model generation unit 315 removes from the generation region 10 all strata shallower than the depth of the erosion base. Specifically, the model generation unit 315 fills the region occupied by all strata shallower than the erosion base (hereinafter referred to as the eroded region) with zeros. FIG. 7 is a diagram showing an example of the generation region 10 in which all strata shallower than the erosion base have been eroded. At this time, the model generation unit 315 may form an unconformity surface in the region 13 immediately above the erosion base, as shown in FIG. 7.
(Redeposition)
The model generation unit 315 arranges a plurality of strata in the eroded region 15 of the generation region 10, using the thickness of each stratum, the P-wave velocity of each stratum, and the like. FIG. 8 is a diagram showing an example of the generation region 10 in which a plurality of strata have been redeposited in the eroded region 15. As shown in FIG. 8, a plurality of strata with different P-wave velocities are deposited in the eroded region 15.
(Refolding)
In the generation region 10 in which the plurality of strata have been deposited in the eroded region 15, the model generation unit 315 folds the strata using the fold wavelength, the fold amplitude, and the like. FIG. 9 is a diagram showing an example of the generation region 10 in which the strata have been folded. As shown in FIG. 9, folding corresponds to lifting the strata vertically upward. For this reason, as shown in FIG. 9, the deepest part of the generation region 10 is filled with the pre-folding stratum, as in FIG. 5.
(Refaulting)
In the generation region 10 having the folded strata, the model generation unit 315 forms a fault using the fault position, the fault angle, the fault displacement, the degree of fault bending, and the like. FIG. 10 is a diagram showing an example of the generation region 10 in which a fault has been generated across the strata.
(Rock salt intrusion)
The model generation unit 315 intrudes rock salt into the generation region 10 using the determined model generation parameters. For example, the model generation unit 315 places intruded rock salt in the generation region 10 using the distribution position of the salt dome, the P-wave velocity of the salt dome, and the like. FIG. 11 is a diagram showing an example of the generation region 10 into which rock salt 17 has intruded.
The order of the above geological events executed by the model generation unit 315 (deposition, folding, faulting, erosion, redeposition, fault reactivation (refolding, refaulting), and rock salt intrusion) follows the time series of their occurrence. The order of these geological events can therefore be changed as appropriate. The geological events are also not limited to deposition, folding, faulting, erosion, redeposition, fault reactivation (refolding, refaulting), and rock salt intrusion; other events may additionally be executed.
The model generation unit 315 cuts out a predetermined range from the generation region 10 into which the rock salt 17 has intruded. The predetermined range is, for example, a depth of 0 to 5 km and a horizontal length of 0 to 25 km in the generation region 10 shown in FIG. 11. The predetermined range may be set or changed as appropriate by the setting unit 311 in accordance with a user instruction via the input device. The model generation unit 315 generates the underground structure model by this cut-out processing from the generation region 10. The model generation unit 315 generates a plurality of underground structure models according to the generation of random numbers. Since the random seed corresponds to the underground model ID, the model generation unit 315 can regenerate an underground structure model by selecting its underground model ID.
The simulated data generation unit 317 generates, by a wave propagation simulation on the model of the structure, simulated data that mimics observed values of the structure. The wave propagation simulation is, for example, a simulation of seismic waves (hereinafter referred to as a seismic wave simulation). Specifically, the simulated data generation unit 317 executes a seismic wave propagation simulation on the underground structure model generated by the model generation unit 315. For the seismic wave propagation simulation, known techniques commonly used for wave propagation simulations of elastic waves or acoustic waves can be used as appropriate. The seismic wave propagation simulation is realized, for example, by applying initial conditions and boundary conditions to the partial differential equations constituting the equation of motion (wave equation) of an elastic body and solving them step by step. As a numerical solution method for these partial differential equations, for example, the finite difference method (FDM: Finite Difference Method) or the finite element method (FEM: Finite Element Method) can be used. The simulated data generation unit 317 inputs the underground structure model into, for example, a seismic simulator and executes the seismic wave propagation simulation. The simulated data generation unit 317 thereby generates simulated data that mimics observed values of the underground structure of the underground structure model. The simulated data includes, for example, shot data (shot data of the underground structure), in which seismic waves generated by a virtual artificial seismic source (shot) propagate through the underground structure model and are received by seismometers. The simulated data generation unit 317 may also generate simulated data by executing the wave propagation simulation on a plurality of structure models.
 FIG. 12 is a diagram showing an example of an image (shot image) 19 of common-shot-gather data. The vertical axis of the shot image 19 shown in FIG. 12 corresponds to the time elapsed since the shot was fired, and the horizontal axis of the shot image 19 indicates the horizontal position in the underground structure model used to generate the shot image 19.
 The learning data generation unit 319 generates learning data by associating the generated model of the structure with the simulated data. Specifically, the learning data generation unit 319 associates the plurality of underground structure models generated according to the random numbers with the plurality of simulated data generated from those underground structure models, using the input/output relationship of the seismic wave propagation simulation. The learning data generation unit 319 generates a plurality of learning data from the associated underground structure models and simulated data, and stores them in the main storage device (memory) 33 or the auxiliary storage device (memory) 35. The learning data generation unit 319 may instead store the plurality of learning data in the external device A serving as network storage. Note that the learning data generation unit 319 may generate learning data by associating a plurality of structural models with the simulated data corresponding to them.
 Each component of the learning data generation device 3 has been described above. The procedure by which the learning data generation device 3 generates learning data (hereinafter referred to as the learning data generation process) is described below. The procedure of the learning data generation process corresponds to the learning data generation method. The learning data generation method generates a model of a structure based on a plurality of feature quantities related to the structure, generates simulated data that mimics observed values of the structure by a wave propagation simulation on the model of the structure, and generates learning data by associating the generated model of the structure with the simulated data. For example, the structure is an underground structure, and the plurality of feature quantities include geological information. The geological information includes, for example, any one of: how strata are bent in the lateral direction, how unconformity surfaces are inserted, the size (width, depth) of the structure used for modeling the underground structure, the insertion of a low-velocity layer into the surface layer, or the distribution of layer thicknesses. The learning data generation method determines the plurality of feature quantities based on ranges of feature-quantity values and random numbers.
Specifically, the structure is an underground structure, and the ranges of feature-quantity values are set based on geological information about the target area. The geological information includes at least one of well-logging data in the target area, observation data on the underground structure in the target area, and the spatial distribution of feature quantities inferred for the target area.
 For example, the structure is an underground structure, and as shown in FIGS. 3 to 11, the learning data generation method generates the model of the structure by executing, step by step and based on the plurality of feature quantities, events related to at least one of deposition, folding, faulting, erosion, re-deposition, re-folding, re-faulting, or rock salt intrusion. The learning data generation method generates the model of the structure by executing events in the order of, for example, deposition, folding, and faulting. The learning data generation method may also generate the model of the structure by executing events in the order of faulting, erosion, re-deposition, re-folding, and re-faulting, and may further generate the model of the structure by executing events in the order of re-faulting and rock salt intrusion. FIG. 13 is a flowchart showing an example of the procedure of the learning data generation process.
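The stepwise execution of geological events can be sketched as follows. The event functions (`deposit`, `fold`, `fault`) and all numeric values are simplified illustrations of the idea, not the embodiment's actual model generator.

```python
import numpy as np

def deposit(nz, nx, layer_thickness, v_top, dv):
    """Flat-layered deposition: velocity increases with each deeper layer."""
    depth = np.arange(nz)[:, None]
    layer = depth // layer_thickness
    return (v_top + dv * layer) * np.ones((1, nx))

def fold(model, amplitude, wavelength):
    """Fold the layers by shifting each column vertically along a sine."""
    nz, nx = model.shape
    shift = (amplitude * np.sin(2 * np.pi * np.arange(nx) / wavelength)).astype(int)
    out = np.empty_like(model)
    for ix in range(nx):
        out[:, ix] = np.roll(model[:, ix], shift[ix])
    return out

def fault(model, pos, throw):
    """Simple vertical fault: drop everything right of `pos` by `throw` cells."""
    out = model.copy()
    out[:, pos:] = np.roll(out[:, pos:], throw, axis=0)
    return out

# Events executed in order: deposition -> folding -> faulting
m = deposit(nz=128, nx=256, layer_thickness=16, v_top=1500.0, dv=250.0)
m = fold(m, amplitude=6, wavelength=128.0)
m = fault(m, pos=128, throw=10)
```

Each event takes the output of the previous one, so the final model reflects the geological history in the order the events were applied.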
 (Learning data generation process)
 (Step S101)
 The setting unit 311 sets a feature range for each of the plurality of feature quantities related to the underground structure. The upper and lower limits of each feature range are set by default settings. The feature ranges may instead be set according to a user's instruction via the input device, for example based on geological information. The above function executed by the setting unit 311 may also be performed on another input device such as the external device 9A.
 (Step S102)
 The determination unit 313 determines an underground-model ID. The determination unit 313 generates random numbers using the number of the underground-model ID as the random seed. For each of the plurality of feature quantities, the determination unit 313 determines a model generation parameter within the feature range, based on the feature range and the random numbers. The determined model generation parameters are stored in the main storage device (memory) 33 or the auxiliary storage device (memory) 35.
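Step S102 can be sketched as follows. The feature names and ranges are illustrative assumptions; the use of the model ID as the random seed follows the description above and makes every generated model reproducible.

```python
import numpy as np

# Feature ranges set in step S101 (values are illustrative)
FEATURE_RANGES = {
    "layer_thickness_m": (20.0, 200.0),
    "p_wave_velocity_ms": (1500.0, 5500.0),
    "fault_dip_deg": (30.0, 80.0),
}

def decide_parameters(model_id):
    """Step S102: use the underground-model ID as the random seed and
    draw each model generation parameter uniformly from its feature range."""
    rng = np.random.default_rng(model_id)
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in FEATURE_RANGES.items()}

params = decide_parameters(model_id=42)
# Same ID -> same parameters, so each model can be regenerated from its ID
assert params == decide_parameters(42)
```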
 (Step S103)
 The model generation unit 315 generates an underground structure model using the model generation parameters. Specifically, the model generation unit 315 generates a plurality of underground structure models according to a plurality of random numbers. The model generation unit 315 stores the plurality of underground structure models in the main storage device (memory) 33 or the auxiliary storage device (memory) 35.
 (Step S104)
 The simulated data generation unit 317 executes the seismic wave propagation simulation on each underground structure model and generates simulated data corresponding to that underground structure model. The simulated data generation unit 317 stores the simulated data generated for each underground structure model in the main storage device (memory) 33 or the auxiliary storage device (memory) 35.
 (Step S105)
 The learning data generation unit 319 generates a plurality of learning data by associating the plurality of underground structure models generated according to the random numbers with the plurality of simulated data generated from those underground structure models. The learning data generation unit 319 stores the plurality of learning data in the main storage device (memory) 33, the auxiliary storage device (memory) 35, or the external device A serving as network storage. Through the above processing, one learning data generation device 3 generates, for example, 500,000 or more learning data for one underground-model ID and stores the generated learning data.
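The overall flow of steps S102 through S105 can be sketched as a loop that pairs each ground-truth model with the simulated data produced from it. The stand-ins for the model generator and the simulator below are toys; all function names are hypothetical.

```python
def generate_training_data(model_ids, make_model, simulate):
    """Pair each structural model (ground truth) with the simulated
    data produced from it, keyed by the underground-model ID."""
    dataset = []
    for mid in model_ids:
        model = make_model(mid)   # steps S102-S103: ID -> parameters -> model
        shot = simulate(model)    # step  S104: model -> simulated data
        dataset.append({"id": mid, "model": model, "shot": shot})
    return dataset

# Toy stand-ins so the sketch runs end to end
data = generate_training_data(
    range(3),
    make_model=lambda mid: [1500.0 + 100.0 * mid],
    simulate=lambda m: [v * 2 for v in m],
)
```

The pairing itself is the key point: because each shot record is stored with the model that produced it, the simulation's input/output relationship becomes the supervision signal.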
 A parametric velocity model generation system is described below as an example of the learning data generation device 3.
 (Parametric velocity model generation system)
 To obtain a large-scale training dataset, we propose a parametric velocity model generation system. Because this system uses only geological information (knowledge) when generating underground structure models, it is designed to prevent data leakage and to allow the generation performance to be evaluated properly. First, the system generates an underground structure model, that is, a realistic, high-resolution velocity distribution. The velocity structure in the underground structure model is generated through synthesis processes corresponding to geological events including stratification, folding, faulting, intrusion, and erosion. Each process is modeled using geological parameters such as layer thickness, P-wave velocity, and fault dip angle. FIG. 14 is a diagram showing an example of a velocity model as an example of an underground structure model; that is, FIG. 14 shows an example of a P-wave velocity (Vp) model generated by the parametric generation system. To create the shot gathers (simulated data) corresponding to each velocity model, seismic wave propagation through the generated underground structure model is simulated.
 This system incorporates probability distributions to draw a large number of samples of the geological parameters, and generates large-scale learning data covering a wide variety of underground structures. The determination unit 313 uses uniform distributions bounded by the upper and lower limits of the geological parameters. The ranges of the geological parameters are set by the setting unit 311 based on geological insight, for example a rough estimate of the velocity structure. In concrete cases, insight into the target subsurface can be obtained from geological surveys such as the results of conventional inverse-problem analyses or nearby well-logging data. High-quality learning data is generated by making reasonable assumptions about these geological parameters.
 The learning data generation process performed by the learning data generation device 3 has been described above. The learning device 7 is described below. Since the hardware configuration of the learning device 7 is the same as that inside the dotted frame 3 in FIG. 1, its description is omitted. The learning device 7 trains a deep neural network using the plurality of learning data. The deep neural network is an example of a model that estimates information about a structure (hereinafter referred to as an estimation model). The learning device 7 generates an estimation model that estimates information about the structure, using the learning data generated by the learning data generation method described above. That is, the model generation method that generates the estimation model using the learning data generated by the learning data generation method described above is executed using at least one processor in the learning device 7.
 FIG. 15 is a diagram showing an example of functional blocks of the processor 81 mounted on the learning device 7. As functions realized by the processor 81, the processor 81 has a preprocessing unit 811, a model setting unit 813, and a learning unit 815. The functions realized by the preprocessing unit 811, the model setting unit 813, and the learning unit 815 are each stored as a program in, for example, the main storage device or the auxiliary storage device mounted on the learning device 7. The processor 81 realizes the functions of the preprocessing unit 811, the model setting unit 813, and the learning unit 815 by reading and executing the programs stored in the main storage device or the auxiliary storage device mounted on the learning device 7.
 The preprocessing unit 811 increases the number of simulated data corresponding to one underground structure model by adding noise, varying the settings used when acquiring data in a seismic survey, and so on. The noise added to the simulated dataset includes, for example, noise from receivers that fail to record among the plurality of receivers in a seismic survey, noise caused by vehicles passing along roads in the survey area, and noise from exploratory wells installed in the survey area. The various settings include, for example, the positional relationship of the receivers with respect to the seismic vibrator truck and the method by which the vibrator truck generates shots. In this way, the preprocessing unit 811 augments the number of simulated data per underground structure model in the plurality of learning data (augmentation). That is, through this augmentation of the simulated data by the preprocessing unit 811, a plurality of simulated data come to correspond to one underground structure model (ground-truth data). The preprocessing unit 811 stores the plurality of learning data increased by the augmentation in the main storage device or the auxiliary storage device of the learning device 7.
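A minimal sketch of the augmentation step, assuming Gaussian noise and randomly "dead" receiver traces as two of the perturbations described above; the noise model and all parameter values are illustrative, not those of the embodiment.

```python
import numpy as np

def augment_shot(shot, n_copies=4, noise_std=0.05, dead_trace_prob=0.1, seed=0):
    """Pad out one simulated shot gather into several noisy variants,
    all sharing the same underground-structure model as their label."""
    rng = np.random.default_rng(seed)
    scale = noise_std * np.abs(shot).max()
    variants = []
    for _ in range(n_copies):
        noisy = shot + rng.normal(0.0, scale, shot.shape)
        # Simulate receivers that fail to record: zero out random columns
        dead = rng.random(shot.shape[1]) < dead_trace_prob
        noisy[:, dead] = 0.0
        variants.append(noisy)
    return variants

gather = np.random.default_rng(1).normal(size=(400, 96))  # time x receivers
augmented = augment_shot(gather)
```

After augmentation, several input gathers map to one ground-truth model, which is exactly the one-to-many correspondence described above.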
 The model setting unit 813 sets a pre-training model to be trained using a plurality of training datasets selected from the plurality of learning data. The pre-training model is, for example, a deep neural network. First, the model setting unit 813 divides the plurality of learning data into a plurality of training datasets and a plurality of validation datasets used to validate the trained model. Next, the model setting unit 813 extracts, from the plurality of training datasets, a plurality of model-setting datasets used to set the pre-training model and a plurality of model-validation datasets used to validate the set model. Hereinafter, the plurality of datasets remaining after the model-setting datasets and model-validation datasets have been extracted from the training datasets are referred to as the post-extraction datasets.
 The model setting unit 813 sets the pre-training model using the plurality of model-setting datasets and the plurality of model-validation datasets. For example, the model setting unit 813 applies neural architecture search (NAS) using a hyperparameter automatic optimization framework to the plurality of model-setting datasets and the plurality of model-validation datasets. In this way, the model setting unit 813 sets, as the pre-training model, the structure of the pre-training model and the hyperparameters of the pre-training model. As the NAS framework, for example, Optuna (registered trademark) is used. The NAS framework is not limited to Optuna (registered trademark); other frameworks may be used. The model setting unit 813 may also set a deep-neural-network model suitable for exploration and the hyperparameters of that deep neural network according to a user's instruction via the input device.
 For example, NAS automatically designs a neural network suited to a given task. Once the search space of neural architectures is defined, NAS successively samples, trains, and evaluates candidate architectures using the plurality of model-setting datasets and the plurality of model-validation datasets, and finds the optimal architecture within the search space. An architecture is defined by model hyperparameters including the numbers of layers and channels. The model setting unit 813 adopts Optuna (registered trademark), a hyperparameter automatic optimization framework. The user-friendly interface of Optuna (registered trademark) enables simple and automatic hyperparameter optimization via parallel processing, reducing computation time.
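The sample-train-evaluate loop of NAS can be sketched as follows. The embodiment uses Optuna; here a plain random search with a toy proxy objective stands in so that the sketch is self-contained, and the hyperparameter ranges are illustrative assumptions.

```python
import random

def objective(params):
    """Proxy validation loss for a candidate architecture. A real NAS
    would build the candidate network, train it on the small model-setting
    set, and return its loss on the small model-validation set."""
    return (params["n_layers"] - 100) ** 2 + (params["channels"] - 64) ** 2

def random_search(n_trials, seed=0):
    """Sample candidate architectures and keep the best one found."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {
            "n_layers": rng.randint(10, 120),        # network depth
            "channels": rng.choice([16, 32, 64, 128]),  # channel width
        }
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

best_params, best_loss = random_search(n_trials=200)
```

Optuna replaces the naive sampler above with smarter strategies (e.g. tree-structured Parzen estimators and pruning), but the control flow — define a search space, evaluate candidates on the small datasets, keep the best — is the same.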
 Specifically, the model setting unit 813 defines the search space of neural architectures as encoder-decoder models based on ResNet. ResNet is a deep learning model used both for the seismic inverse problem and in computer vision. Encoder-decoder models, which relate inputs and outputs through a shared latent feature space, are of particular interest. Convolutional neural networks typically capture spatially local features. In an encoder-decoder model, the encoder and the decoder are connected through a common latent feature space. The encoder-decoder model can therefore learn spatially global, structure-free features by breaking down the structure of the input simulated data (shot images). Learning spatially global features is useful for the seismic inverse problem.
 The learning unit 815 trains the pre-training model set by the model setting unit 813 using the post-extraction datasets. For example, the learning unit 815 inputs each of the plurality of simulated data in the post-extraction datasets into the pre-training model. The learning unit 815 adjusts the weights of the pre-training model, for example by stochastic gradient descent with backpropagation, so as to reduce the difference between the output of the pre-training model and the underground structure model corresponding to the simulated data that was input. The learning unit 815 also validates the weight-adjusted pre-training model using the validation datasets. The learning unit 815 thereby generates a trained model (estimation model) and stores it in the main storage device or the auxiliary storage device of the learning device 7.
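The weight adjustment performed by the learning unit 815 can be illustrated with a minimal gradient-descent loop. A linear map stands in for the deep neural network, and all sizes, rates, and names are illustrative; the point is only the update rule that shrinks the output/ground-truth difference.

```python
import numpy as np

def sgd_train(shots, models, lr=0.1, epochs=500, seed=0):
    """Fit a linear map from flattened simulated data to the velocity
    model, reducing the output/ground-truth difference by gradient
    descent on the mean-squared error."""
    rng = np.random.default_rng(seed)
    n_in, n_out = shots.shape[1], models.shape[1]
    W = rng.normal(0.0, 0.01, (n_in, n_out))
    for _ in range(epochs):
        pred = shots @ W
        grad = shots.T @ (pred - models) / len(shots)  # dL/dW for MSE
        W -= lr * grad
    return W

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 10))   # 64 simulated shots (flattened)
W_true = rng.normal(size=(10, 5))
Y = X @ W_true                  # the corresponding velocity models
W = sgd_train(X, Y)
final_err = np.abs(X @ W - Y).mean()
```

In the embodiment the linear map is replaced by the deep encoder-decoder network and the gradient is obtained by backpropagation, but the loop — predict, compare with the ground-truth model, step the weights downhill — is the same.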
 An example of the generation of a trained model by the learning device 7 has been described above. An example of the process of estimating an underground structure using the trained model (hereinafter referred to as the underground structure estimation process) is described below.
 It is assumed that shot data on the underground structure to be estimated has been acquired in advance of the underground structure estimation process. Before the underground structure estimation process is performed, data cleansing such as noise removal and outlier removal may also be applied to the raw data or the shot data as appropriate.
 Since the hardware configuration of the estimation device is the same as that inside the dotted frame 3 in FIG. 1, its description is omitted. The estimation device stores the trained model in its main storage device or auxiliary storage device. The estimation device may store a plurality of trained models generated for different regions; in that case, the estimation device selects a trained model according to a user's instruction via the input device. The estimation device estimates the target underground structure by inputting the shot data into the trained model; it thereby estimates the underground structure corresponding to the input shot data. Post-processing may be applied to the estimated underground structure data as appropriate. The estimation device stores the estimated underground structure data in its main storage device or auxiliary storage device. The estimation device may also display the estimated underground structure data on a display unit provided in the estimation device.
 The estimation of underground structure data by the estimation device has been described above. An example of experimental results obtained with the various processes of this embodiment is described below.
 (Example of experimental results)
 The various processes of the embodiment are applied to the inverse problem of estimating the velocity structure of each stratum in an underground cross section from two-dimensional (2D) shot-gather images. The experiment comprises four steps: learning data generation, learning data division, NAS, and evaluation of the optimal neural architecture.
 The learning data generation process created a dataset of 300,000 learning-data pairs, each consisting of a velocity model corresponding to an underground structure model and the shot gather (simulated data) corresponding to that velocity model. The generation parameters (geological parameters) controlling the geological properties of the velocity models were determined based on geological insights from the Marmousi2 geological structure model (Martin, G. S., Wiley, R., and Marfurt, K. J. [2006] Marmousi2: An elastic upgrade for Marmousi. The Leading Edge, 25(2): 156-166.). Summary information about these geological properties was used to set the feature ranges, but the dataset used to benchmark the trained model (the benchmark dataset) itself was not used. The seismic wave propagation was simulated on a supercomputer.
 Division of the learning data was applied to prevent data leakage during the NAS step. The 300,000 learning data were divided into a large training dataset of 240,000 samples and a large validation dataset of 60,000 samples. These large datasets were used to train the optimal neural architecture. Furthermore, 10,000 data were sampled from the large training dataset, and this subset was divided into a small model-setting dataset of 8,000 samples and a small model-validation dataset of 2,000 samples. These small datasets were used to find the optimal neural architecture. This dataset division avoids unauthorized access (data leakage) to the large validation dataset during the NAS step.
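The dataset division described above can be sketched as follows, using the sample counts from the experiment; the shuffling and sampling details are illustrative assumptions. The key property is that the NAS subsets are drawn only from the training split, so the large validation set is never seen during architecture search.

```python
import random

def split_dataset(n_total=300_000, n_subset=10_000, seed=0):
    """Split sample indices as in the experiment: 240k/60k for the large
    train/validation sets, then an 8k/2k subset of the training set for
    NAS, so NAS never touches the large validation set."""
    rng = random.Random(seed)
    idx = list(range(n_total))
    rng.shuffle(idx)
    train, valid = idx[:240_000], idx[240_000:]
    subset = rng.sample(train, n_subset)       # drawn from train only
    nas_fit, nas_check = subset[:8_000], subset[8_000:]
    return train, valid, nas_fit, nas_check

train, valid, nas_fit, nas_check = split_dataset()
```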
 The model setting unit 813 optimizes the ResNet-based encoder-decoder model by tuning hyperparameters such as the numbers of layers and channels. In the end, we obtained an optimal neural architecture (pre-training model) with more than 100 hidden layers, much deeper than those used in previous studies.
 The learning unit 815 trained the optimal neural architecture using the large training dataset. The generated trained model was evaluated using two standard benchmark datasets: the Marmousi2 geological structure model and the 1994 Amoco static correction test dataset. FIG. 16 is a diagram showing an example of the estimation of P-wave velocity (Vp) for the Marmousi2 geological structure model. That is, FIG. 16 shows an example comparison between the result of the inverse problem obtained by applying the standard benchmark dataset to a conventional ResNet50-based model (He, K., Zhang, X., Ren, S., and Sun, J. [2016] Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778.) and the result of the inverse problem obtained by applying the standard benchmark dataset to the trained model trained by the learning device 7. In FIG. 16, (a) shows the ground truth, (b) shows the estimation by the trained model generated by the embodiment, and (c) shows the estimation using the ResNet50-based encoder-decoder model. In FIG. 16, (d) shows the one-dimensional (1D) profile at the 2 km position corresponding to the red lines in (a) to (c), and (e) shows the 1D profile at 11 km corresponding to the blue lines.
 These results correspond to the inverse problem on the Marmousi2 geological structure model. Owing to the quality and quantity of the training dataset in this embodiment, the inverse-problem result (c) of the conventional baseline ResNet50-based model roughly reproduced the velocity model (a). On the other hand, the inverse-problem result (b) of the trained model generated by this embodiment is clearer still. For example, compared with the conventional result (c), the result (b) of the trained model generated by this embodiment predicted the velocity of the salt layer (at a depth of 4 km in FIG. 16(d)) more accurately and recovered more detailed structure in the complex area around the fault (FIG. 16(e)).
 FIG. 17 shows the modeling results for the 1994 Amoco static correction test dataset. In FIG. 17, (a) shows the ground truth and (b) shows the estimation by the trained model generated by the embodiment. The trained model generated by this embodiment estimated a high-resolution, good output even though the training data contained no information related to the 1994 Amoco static correction test dataset.
 According to the learning data generation device 3 of this embodiment (hereinafter, the present learning data generation device 3), as one example, a range of feature-quantity values is set for each of a plurality of feature quantities related to an underground structure; for each feature quantity, a value within its range is determined based on the range and a random number; a model of the underground structure is generated using the determined values; simulated data that mimics observed values of the underground structure is generated by a seismic wave propagation simulation on the model of the underground structure; and a plurality of learning data are generated by associating the plurality of underground structure models generated according to the random numbers with the plurality of simulated data generated from those models, using the input/output relationship of the seismic wave propagation simulation. According to the present learning data generation device 3, the ranges are set based on geological information about an area including the region targeted for investigation of the underground structure.
Here, the geological information in the present learning data generation device 3 includes at least one of well-logging data in the area, observation data on the underground structure in the area, and the spatial distribution of feature quantities inferred for the area. According to the present learning data generation device 3, the feature quantities also include the depth of the erosion surface in the model of the underground structure, the degree of bending of faults in the model of the underground structure, and unconformity surfaces that indicate stratal unconformity and are formed by the flow of a salt layer. Further, according to the present learning data generation device 3, a normal fault or a reverse fault is set among the feature quantities.
 From the above, according to the present learning data generation device 3, as an example, geological parameters are set using random numbers within parameter ranges defined on the basis of geological knowledge and insight, and a large number of diverse underground structure models can be generated from the set parameters by geologically following the formation process of the underground structure. That is, the present learning data generation device 3 can generate a large number of realistic underground structure models using random numbers with reference to natural history, and can thereby generate a large amount of learning data. Because the large amount of learning data generated by the present learning data generation device 3 contains a large number of qualitatively high, i.e., realistic, underground structure models as teacher data, the generalization performance of the trained model can be improved.
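As a hedged illustration of this range-plus-random-number sampling, the idea can be sketched as follows (the parameter names, units, and the uniform distributions are assumptions for illustration only, not taken from the embodiment):

```python
import random

random.seed(0)  # for reproducibility of this sketch

# Hypothetical geological parameter ranges, as an expert might set them
# (names and units are illustrative only).
PARAM_RANGES = {
    "erosion_surface_depth_m": (100.0, 800.0),
    "fault_curvature": (0.0, 1.0),   # 0 = straight fault, 1 = strongly curved
    "fault_sense": (0, 1),           # 0 = normal fault, 1 = reverse fault
}

def sample_structure_params(ranges):
    """Draw one value per feature, uniformly within its expert-set range."""
    params = {}
    for name, (lo, hi) in ranges.items():
        if isinstance(lo, int) and isinstance(hi, int):
            params[name] = random.randint(lo, hi)   # discrete choice
        else:
            params[name] = random.uniform(lo, hi)   # continuous value
    return params

# Each call yields one parameter set, i.e. one candidate underground model.
model_params = [sample_structure_params(PARAM_RANGES) for _ in range(3)]
```

Repeating the sampling with fresh random numbers is what produces the diversity of underground structure models described above.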
 Further, according to the present learning data generation device 3, as an example, the plurality of ranges corresponding to the plurality of features are displayed together with two indicators for changing the upper and lower limits of each range; a confirmation model of the underground structure is generated using representative values of the ranges specified by the two indicators; and the plurality of ranges, the two indicators for each range, and the confirmation model are displayed. Thus, when entering the ranges that reflect geological knowledge in the geological parameters, the user can easily grasp how the underground structure model changes as the ranges are adjusted. This improves the user's operability in generating underground structure models, further improves the quality of the learning data, and in turn improves the generalization performance of the trained model.
 (Application example)
 In this application example, a refiner (Refiner) is trained, for example by a generative adversarial network (GAN), using observation data acquired by observing the structure, the simulated data, and a loss function weighted according to the magnitude of the data values in the simulated data, so that the refiner produces, from the simulated data, refined data that brings the simulated data closer to the realism of the observation data; the simulated data is input to the trained refiner to generate the refined data; and learning data is generated by associating the structure model with the refined data. This application example also describes training an estimation model for the structure (hereinafter, the underground structure estimation model) using learning data containing the refined data, and an underground structure estimation process using the underground structure estimation model.
 In this application example, the learning data generation unit 319 is provided in the processor 81 mounted on the learning device 7. When the technical features of this application example are realized in an estimation device, the processor of the estimation device includes the learning data generation unit 319, the learning unit 815, and an estimation unit that estimates the structure associated with observation data by inputting the observation data into the trained model.
 FIG. 18 shows an example overview of the generation of learning data, the training of the underground structure estimation model TUSEM using that learning data, and the underground structure estimation process using the trained underground structure estimation model TUSEM. Sim in FIG. 18 corresponds to the learning data generation process of the embodiment; that is, Sim outlines the process of generating a plurality of underground structure models and generating, by the wave propagation simulation WPS, a plurality of simulated data SD corresponding to the underground structure models USM. Since the processing in Sim of FIG. 18 is the same as in the embodiment, its description is omitted. Real in FIG. 18 shows the collected observation data (for example, shot data) OD used to carry out the underground structure estimation process. The acquisition of the observation data OD follows existing methods, so its description is omitted.
 S2R in FIG. 18 shows the process of training the refiner RF with a generative adversarial network based on the simulated data SD and the observation data OD, and the process of converting the simulated data SD into refined data RD using the trained refiner TRF. The generative adversarial network in FIG. 18 has the refiner (Refiner) RF to be trained and a discriminator (Discriminator) DCN. During GAN training of the refiner RF, the output of the refiner RF under training corresponds to noise-added data NAD, in which realistic noise and the like have been added to the simulated data SD.
 INV in FIG. 18 shows the process of training the model LOM to be trained using the refined data RD and the underground structure models USM, and the underground structure estimation process of inputting the observation data OD into the trained underground structure estimation model TUSEM and outputting the estimated underground structure UGS.
 The learning data generation unit 319 executes the wave propagation simulation WPS on an underground structure model USM. The learning data generation unit 319 thereby generates simulated data SD corresponding to that underground structure model USM, i.e., the simulation result of the wave propagation simulation WPS. By executing this processing on a plurality of underground structure models USM, the learning data generation unit 319 generates a plurality of simulated data SD corresponding to the plurality of underground structure models USM. Next, the learning data generation unit 319 adds randomly generated noise to each of the plurality of simulated data SD, thereby generating a plurality of simulated data with random noise added (hereinafter, noise-added simulated data). The noise is added, for example, at the arrow NA shown in FIG. 18. The addition of random noise may be omitted as appropriate to shorten the processing time in this application example. The learning data generation unit 319 associates the plurality of underground structure models USM with the plurality of noise-added simulated data according to the input/output of the wave propagation simulation WPS, and stores them in memory. The learning data generation unit 319 also reads, from memory, the observation data OD acquired by an existing acquisition device and the untrained network. The total number of observation data OD may be smaller than, for example, the total number of noise-added simulated data, but a large number, at or above a predetermined count, is desirable to improve the generalization performance of the refiner RF training.
 The learning data generation unit 319 applies the observation data and the noise-added simulated data (or the simulated data, if noise addition is not performed) to the network, and trains the refiner RF and the discriminator DCN alternately. The loss function used to train the refiner RF (hereinafter, the refiner loss function) is weighted so as to preserve signal values according to their strength in the simulated data SD, for example in proportion to the magnitude of the signal values in the simulated data SD. The loss function specific to this application example is described below. Since existing techniques are applicable to the other configurations and processes of the network, their description is omitted.
 Let Isim be the image corresponding to the noise-added simulated data input to the network for the simulated data SD, and Itaint be the image corresponding to the noise-added data NAD. The refiner loss function LossL1 is then defined, for example, by the following equation (1).

 LossL1 = mean(abs(Isim) * abs(Isim - Itaint))  ... (1)

 The right-hand side of equation (1) multiplies the absolute difference between the images, abs(Isim - Itaint), pixel by pixel by the absolute value of Isim, and computes the mean over all pixels of the image. Multiplying abs(Isim - Itaint) by abs(Isim) amounts to weighting the pre-averaging absolute-difference image of the ordinary L1 loss, mean(abs(Isim - Itaint)), by the signal strength abs(Isim) of the simulated data SD. The learning data generation unit 319 repeatedly trains the refiner RF so that the refiner loss function LossL1 of equation (1) decreases. The refiner RF is thereby trained to preserve signal values in proportion to their magnitude in the simulated data SD.
 The weight in the refiner loss function LossL1 is not limited to abs(Isim). For example, the weight may be any function that is monotonically increasing in abs(Isim) in the broad sense (i.e., monotonically non-decreasing). Specifically, the weight in the refiner loss function LossL1 may be a nonlinear weight such as (abs(Isim))^2, sqrt(abs(Isim)), or min(a, abs(Isim)) with an upper limit a.
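A minimal sketch of the weighted L1 loss of equation (1) and its weight variants (NumPy is assumed here; the embodiment does not prescribe any particular implementation):

```python
import numpy as np

def refiner_loss(i_sim, i_taint, weight="abs", a=None):
    """Eq. (1)-style loss: mean(w(|I_sim|) * |I_sim - I_taint|).

    `weight` selects a broadly monotonically increasing function of |I_sim|:
    "abs" (Eq. (1) itself), "square", "sqrt", or "capped" (min(a, |I_sim|)).
    """
    mag = np.abs(i_sim)
    if weight == "abs":
        w = mag
    elif weight == "square":
        w = mag ** 2
    elif weight == "sqrt":
        w = np.sqrt(mag)
    elif weight == "capped":
        w = np.minimum(a, mag)
    else:
        raise ValueError(weight)
    return float(np.mean(w * np.abs(i_sim - i_taint)))
```

Compared with the ordinary L1 loss mean(abs(Isim - Itaint)), the same per-pixel error costs more where the simulated signal is strong, which is what pushes the refiner toward preserving strong reflections.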
 The learning data generation unit 319 stores in memory the refiner RF trained through GAN training with the plurality of observation data and the plurality of noise-added simulated data (or the plurality of simulated data, if noise addition is not performed) (hereinafter, the trained refiner TRF). Because the trained refiner TRF reflects the observation data, it reflects the region where the observation data were acquired and the collection conditions of the observation data (for example, the observation equipment used to collect them, various operator-dependent characteristics at collection time, and characteristics of the survey company that collected them); it is thus trained in a manner specialized to, i.e., automatically customized for, the observation data.
 The learning data generation unit 319 generates noise-added simulated data by adding randomly generated noise to the simulated data. The learning data generation unit 319 reads the trained refiner TRF from memory, inputs the noise-added simulated data or the simulated data, and generates refined data. The noise-added simulated data input to the trained refiner TRF may be the noise-added simulated data that were used for training the refiner RF. The learning data generation unit 319 generates learning data by associating the structure model with the refined data; specifically, it associates the underground structure model USM with the refined data RD. The learning data generation unit 319 generates a plurality of learning data, for example, by repeating the processing from generating noise-added simulated data through generating refined data RD for a plurality of simulated data. The learning data generation unit 319 stores the generated plurality of learning data in memory.
 The learning unit 815 trains the underground structure estimation model TUSEM by training the model LOM to be trained using each of the plurality of learning data. Known methods can be used as appropriate to train the model LOM with a plurality of learning data, so the description is omitted. In training the underground structure estimation model TUSEM, data are used in which the simulated data SD, i.e., the simulation results, have been given realism derived from the observation data OD. As a result, the underground structure estimation model TUSEM is trained as a network (trained model) capable of performing inversion, i.e., underground structure estimation, on real data such as the observation data OD.
 The estimation unit estimates the underground structure UGS by inputting the observation data OD into the underground structure estimation model TUSEM. The estimated underground structure is stored in memory. The estimated underground structure may also be shown on a display.
 In the following, to make the description concrete, the observation data are assumed to be shot data on the underground structure acquired in the region where the underground structure is to be estimated. The process of estimating the underground structure based on the acquired shot data is also described. FIG. 19 is a flowchart showing an example of the procedure of a series of processes (hereinafter, the model generation and estimation process) comprising the generation of learning data, the generation of an underground structure estimation model using that learning data, and the underground structure estimation process using the generated underground structure estimation model.
 (Model generation and estimation process)
 (Step S191)
 The learning data generation unit 319 acquires a plurality of shot data OD. For example, the learning data generation unit 319 acquires the shot data OD from the acquisition device, from a server device storing the shot data OD, or from a storage medium storing the shot data OD. The learning data generation unit 319 stores the acquired shot data OD in memory.
 (Step S192)
 The learning data generation unit 319 reads a plurality of simulated data SD from memory, adds random noise to each of them, and generates noise-added simulated data. The learning data generation unit 319 stores the generated noise-added simulated data in memory. This step may be omitted to shorten the processing time of the model generation and estimation process.
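Step S192 can be sketched as follows (the Gaussian noise model and its standard deviation are assumptions for illustration; the text says only that random noise is added):

```python
import numpy as np

rng = np.random.default_rng(42)

def add_random_noise(sim_data, std=0.05):
    """Return noise-added simulated data: one noisy copy per simulated gather."""
    return [sd + rng.normal(0.0, std, size=sd.shape) for sd in sim_data]

# Example: three simulated shot gathers of shape (time samples, receivers).
sim_data = [np.zeros((64, 32)) for _ in range(3)]
noisy = add_random_noise(sim_data)
```

Each simulated gather keeps its shape and pairing with its underground structure model; only the sample values are perturbed.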
 (Step S193)
 The learning data generation unit 319 trains the refiner RF together with the discriminator DCN using the noise-added simulated data and the plurality of shot data, and generates the trained refiner TRF. Specifically, the learning data generation unit 319 inputs the noise-added simulated data to the refiner RF being trained, and obtains the noise-added data NAD output by the refiner RF. The learning data generation unit 319 computes the refiner loss function LossL1 based on the image Isim corresponding to the noise-added simulated data for the simulated data SD and the image Itaint corresponding to the noise-added data NAD. The learning data generation unit 319 trains the refiner RF, for example by backpropagation, so as to reduce the refiner loss function LossL1. In addition, the learning data generation unit 319 trains the discriminator DCN. By repeating these processes for each of the plurality of noise-added simulated data and each of the plurality of shot data, the learning data generation unit 319 trains the refiner RF and the discriminator DCN. When the training process is completed, the learning data generation unit 319 generates the trained refiner TRF. The training of the refiner RF and the discriminator DCN may instead be performed by the learning unit 815 of the learning device.
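As a heavily simplified sketch of the refiner update in this step (assumptions: the refiner is reduced to a single gain parameter g, and the adversarial term and discriminator update are omitted entirely, so only the equation (1) component is minimized):

```python
import numpy as np

def train_toy_refiner(sim_batch, steps=200, lr=0.01):
    """Gradient-descend a one-parameter refiner I_taint = g * I_sim on
    LossL1 = mean(|I_sim| * |g*I_sim - I_sim|).  The minimum is g = 1,
    i.e. the weighted loss by itself asks the refiner to keep the signal."""
    g = 0.0
    for _ in range(steps):
        # subgradient of the weighted L1 loss with respect to g
        grad = np.mean(np.abs(sim_batch) * np.sign(g * sim_batch - sim_batch) * sim_batch)
        g -= lr * grad
    return g

g = train_toy_refiner(np.array([1.0, 2.0]))
```

This is not the embodiment's training: the real refiner is a network trained adversarially against the discriminator DCN, with equation (1) acting as the signal-preserving term alongside the adversarial objective.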
 (Step S194)
 The learning data generation unit 319 inputs each of the plurality of noise-added simulated data to the trained refiner TRF, and generates a plurality of refined data. Specifically, the learning data generation unit 319 generates the plurality of noise-added simulated data by adding randomly generated noise to each of the plurality of simulated data.
 (Step S195)
 The learning data generation unit 319 associates the plurality of underground structure models with the plurality of refined data to generate a plurality of learning data. The learning data generation unit 319 stores the plurality of learning data in memory.
 (Step S196)
 The learning unit 815 generates the underground structure estimation model TUSEM using the plurality of learning data. That is, the learning unit 815 generates the underground structure estimation model TUSEM by training the model LOM to be trained over the plurality of learning data. The learning unit 815 stores the underground structure estimation model TUSEM in memory.
 (Step S197)
 The estimation unit inputs each of the plurality of shot data into the underground structure estimation model TUSEM and estimates the underground structure UGS. The estimation unit stores the estimated underground structure in memory. The estimation unit may also show the estimated underground structure UGS on a display.
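The overall flow of steps S191 through S197 can be sketched as plain function composition. Every callable below is a trivial stand-in, not the embodiment's implementation: GAN training, the estimation network, and all I/O are elided, and the "refinement" is reduced to a mean-level shift toward the observed shots.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Trivial stand-ins (assumptions for illustration only) ---
def add_noise(sd):                       # S192: random-noise addition
    return sd + rng.normal(0.0, 0.01, size=sd.shape)

def train_refiner(nad_list, shot_list):  # S193: stands in for GAN training
    # "Refine" by shifting toward the mean level of the observed shots.
    bias = np.mean([s.mean() for s in shot_list]) - np.mean([x.mean() for x in nad_list])
    return lambda x: x + bias

def train_estimator(pairs):              # S196: stands in for training LOM -> TUSEM
    models = [m for m, _ in pairs]
    return lambda od: models[0]          # trivially returns a stored model

# --- Pipeline S191-S197 ---
def model_generation_estimation(shot_data, sim_data, models):
    nad = [add_noise(sd) for sd in sim_data]        # S192
    trf = train_refiner(nad, shot_data)             # S193 (trained refiner TRF)
    rd = [trf(x) for x in nad]                      # S194 (refined data RD)
    training_data = list(zip(models, rd))           # S195 (USM <-> RD pairs)
    tusem = train_estimator(training_data)          # S196
    return [tusem(od) for od in shot_data]          # S197 (estimated UGS)

shots = [np.ones((4, 4))]
sims = [np.zeros((4, 4)), np.zeros((4, 4))]
models = ["USM-0", "USM-1"]
ugs = model_generation_estimation(shots, sims, models)
```

The point of the sketch is only the data flow: simulated data are noised, refined toward the statistics of the observed data, paired with their source models as learning data, and the resulting estimator is then applied to the observed shots.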
 The learning data generation device 3 according to the present embodiment trains the refiner RF with a generative adversarial network that uses the observation data OD acquired by observing the structure, the simulated data SD, and the loss function LossL1 weighted according to the magnitude of the data values in the simulated data SD, so that the refiner generates, based on the simulated data SD, refined data RD that brings the simulated data SD closer to the realism of the observation data OD; it inputs the simulated data SD into the trained refiner TRF to generate the refined data RD; and it generates learning data by associating the underground structure model USM with the refined data RD. For example, the present learning data generation device 3 trains the refiner RF using the refiner loss function LossL1 weighted in proportion to the magnitude of the signal values in the simulated data SD. The learning data generation device 3 can thereby generate the trained refiner TRF so as to preserve the signal values in the simulated data SD.
 In addition, since the present learning data generation device 3 trains the refiner RF using the observation data OD, variable factors such as noise arising from the collection conditions of the observation data OD can also be learned during the training of the refiner RF. The present learning data generation device 3 can therefore generate refined data RD that bring the simulated data SD closer to the realism of the observation data OD. In other words, by generating refined data that impart realism to the simulated data SD, i.e., the simulation results of the underground structure models USM, the present learning data generation device 3 can generate more realistic learning data.
 The present learning data generation device 3 may also train the refiner RF using the noise-added simulated data based on the simulated data SD, the observation data OD, and the refiner loss function LossL1. In that case, according to the present learning data generation device 3, the refiner RF and the discriminator DCN can be trained effectively with respect to random noise in the observation data OD. The present learning data generation device 3 can thereby train the refiner RF so as to generate even more realistic refined data.
 From the above, the present learning data generation device 3 can generate more realistic refined data, and can therefore further improve the quality of the learning data.
 In the present learning device, the learning unit 815 trains the model LOM to be trained using learning data containing the refined data produced by the trained refiner TRF. The present learning device thereby generates the underground structure estimation model TUSEM by training the model LOM with highly realistic refined data and with the underground structure models USM as ground-truth data. From the above, the present learning device can train an underground structure estimation model TUSEM capable of outputting a more reliable underground structure UGS for the observation data OD.
 According to the present estimation device, the underground structure UGS is estimated by inputting the observation data OD used to train the refiner RF into the underground structure estimation model TUSEM. From the above, the present estimation device can estimate a highly realistic underground structure UGS by using the underground structure estimation model TUSEM generated from highly realistic learning data.
 From the above, the learning data generation device 3 according to the present embodiment can generate learning data related to a structure.
 When the technical features of the present embodiment are realized as a learning data generation method, the learning data generation method is a method executed using at least one processor and comprises: generating a model of a structure based on a plurality of features related to the structure; generating, by a wave propagation simulation on the structure model, simulated data that mimics observed values of the structure; and generating learning data by associating the generated structure model with the simulated data. Since the processing procedure of the learning data generation method corresponds to that of the learning data generation process, and its effects are the same as in the embodiment, their descriptions are omitted. When the technical features of the present embodiment are realized as a model generation method, the model generation method generates an estimation model that estimates information about the structure using learning data generated by the above learning data generation method. Since the processing procedure of the model generation method corresponds to the processing in the learning unit 815 and the like, and its effects are the same as in the embodiment, their descriptions are omitted.
 Some or all of the devices in the above-described embodiments may be configured in hardware, or may be configured as information processing of software (programs) executed by a CPU, GPU, or the like. When configured as software information processing, the software that realizes at least some of the functions of the devices in the above-described embodiments may be stored in a non-transitory storage medium (non-transitory computer-readable medium) such as a flexible disk, a CD-ROM (Compact Disc-Read Only Memory), or a USB memory, and loaded into the computer 30 to execute the software information processing. The software may also be downloaded via the communication network 5. Further, the information processing may be executed in hardware by implementing the software in a circuit such as an ASIC or FPGA.
 The type of storage medium storing the software is not limited. The storage medium is not limited to removable media such as magnetic disks or optical discs, and may be a fixed storage medium such as a hard disk or a memory. The storage medium may be provided inside or outside the computer.
 In this specification (including the claims), the expression "at least one of a, b, and c" or "at least one of a, b, or c" (including similar expressions) covers any of a, b, c, a-b, a-c, b-c, or a-b-c. It also covers combinations including multiple instances of any element, such as a-a, a-b-b, or a-a-b-b-c-c. It further covers adding elements other than the listed elements (a, b, and c), such as a-b-c-d.
 In this specification (including the claims), when expressions such as "with data as input," "based on data," "in accordance with data," or "in response to data" (including similar expressions) are used, unless otherwise noted, they cover both the case where the data itself is used as input and the case where data subjected to some processing (e.g., data with noise added, normalized data, an intermediate representation of the data) is used as input. When it is stated that some result is obtained "based on," "in accordance with," or "in response to" data, this covers both the case where the result is obtained based on that data alone and the case where the result is also influenced by other data, factors, conditions, and/or states. When it is stated that "data is output," unless otherwise noted, this covers both the case where the data itself is used as output and the case where data subjected to some processing (e.g., data with noise added, normalized data, an intermediate representation of the data) is used as output.
 In this specification (including the claims), the terms "connected" and "coupled" are intended as non-limiting terms covering any of direct connection/coupling, indirect connection/coupling, electrical connection/coupling, communicative connection/coupling, operative connection/coupling, physical connection/coupling, and the like. The terms should be interpreted appropriately according to the context in which they are used, but any form of connection/coupling that is not intentionally or naturally excluded should be interpreted non-limitingly as included in the terms.
 In this specification (including the claims), the expression "A configured to B" covers the case where the physical structure of element A has a configuration capable of executing operation B, and a permanent or temporary setting/configuration of element A is configured/set to actually execute operation B. For example, when element A is a general-purpose processor, it suffices that the processor has a hardware configuration capable of executing operation B and is configured to actually execute operation B by a permanent or temporary setting of a program (instructions). When element A is a dedicated processor, a dedicated arithmetic circuit, or the like, it suffices that the circuit structure of the processor is implemented so as to actually execute operation B, regardless of whether control instructions and data are actually attached.
 In this specification (including the claims), terms meaning inclusion or possession (e.g., "comprising," "including," and "having") are intended as open-ended terms, covering the case of including or possessing objects other than the object indicated by the object of the term. When the object of such a term is an expression that does not specify a quantity or that suggests the singular (an expression with "a" or "an" as an article), the expression should be interpreted as not being limited to a specific number.
 In this specification (including the claims), even if an expression such as "one or more" or "at least one" is used in one place and an expression that does not specify a quantity or that suggests the singular (an expression with "a" or "an" as an article) is used in another place, the latter expression is not intended to mean "one." In general, an expression that does not specify a quantity or that suggests the singular should be interpreted as not necessarily being limited to a specific number.
 In this specification, when it is stated that a specific advantage/result is obtained by a specific configuration of an embodiment, it should be understood that, unless there is a particular reason otherwise, the advantage is also obtained by one or more other embodiments having that configuration. It should be understood, however, that the presence or absence of the advantage generally depends on various factors, conditions, and/or states, and that the advantage is not necessarily obtained by that configuration. The advantage is merely obtained by the configuration described in the embodiment when various factors, conditions, and/or states are satisfied, and the advantage is not necessarily obtained in a claimed invention that defines that configuration or a similar configuration.
 In this specification (including the claims), when a term such as "maximize" is used, it covers finding a global maximum, finding an approximation of a global maximum, finding a local maximum, and finding an approximation of a local maximum, and should be interpreted appropriately according to the context in which the term is used. It also covers finding such approximate maxima probabilistically or heuristically. Similarly, "minimize" covers finding a global minimum, an approximation of a global minimum, a local minimum, or an approximation of a local minimum, interpreted appropriately according to context, and also covers finding such approximate minima probabilistically or heuristically. Similarly, "optimize" covers finding a global optimum, an approximation of a global optimum, a local optimum, or an approximation of a local optimum, interpreted appropriately according to context, and also covers finding such approximate optima probabilistically or heuristically.
 In this specification (including the claims), when a plurality of pieces of hardware performs predetermined processing, the pieces of hardware may cooperate to perform the processing, or some of the hardware may perform all of it. Some hardware may also perform part of the processing while other hardware performs the rest. When an expression such as "one or more pieces of hardware perform a first process and the one or more pieces of hardware perform a second process" is used, the hardware performing the first process and the hardware performing the second process may be the same or different; it suffices that both are included in the one or more pieces of hardware. The hardware may include an electronic circuit or a device including an electronic circuit.
 In this specification (including the claims), when a plurality of storage devices (memories) stores data, an individual storage device (memory) among the plurality may store only part of the data or may store the whole of the data.
 Although the embodiments of the present disclosure have been described in detail above, the present disclosure is not limited to the individual embodiments described. Various additions, changes, replacements, partial deletions, and the like are possible without departing from the conceptual idea and spirit of the present invention derived from the content defined in the claims and equivalents thereof. For example, in all the embodiments described above, the numerical values and mathematical formulas used in the description are shown merely as examples and are not limiting. Likewise, the order of the operations in the embodiments is shown as an example and is not limiting.
1  Learning system
3  Learning data generation device
5  Communication network
7  Learning device
9A  External device
9B  External device
10  Generation region
13  Region immediately above the erosion lower surface
15  Erosion region
17  Rock salt
19  Shot image
30  Computer
31  Processor
33  Main storage device
35  Auxiliary storage device
37  Network interface
39  Device interface
41  Bus
81  Processor
311  Setting unit
313  Determination unit
315  Model generation unit
317  Simulated data generation unit
319  Learning data generation unit
811  Preprocessing unit
813  Model setting unit
815  Learning unit

Claims (16)

  1.  A learning data generation method executed using at least one processor, the method comprising:
     generating a model of a structure based on a plurality of feature quantities related to the structure;
     generating, by a wave propagation simulation on the model of the structure, simulated data that simulates observed values related to the structure; and
     generating learning data by associating the model of the structure with the simulated data.
  2.  The learning data generation method according to claim 1, wherein
     the structure is an underground structure, and
     the wave propagation simulation is a simulation related to a seismic wave.
  3.  The learning data generation method according to claim 2, wherein the model of the structure is generated by executing, based on the plurality of feature quantities, an event related to at least one of deposition, folding, faulting, erosion, redeposition, refolding, refaulting, or rock salt intrusion.
  4.  The learning data generation method according to claim 3, wherein the model of the structure is generated by executing events in the order of deposition, folding, and faulting.
  5.  The learning data generation method according to claim 3 or claim 4, wherein the model of the structure is generated by executing events in the order of faulting, erosion, redeposition, refolding, and refaulting.
  6.  The learning data generation method according to any one of claims 3 to 5, wherein the model of the structure is generated by executing events in the order of refaulting and rock salt intrusion.
  7.  The learning data generation method according to any one of claims 2 to 6, wherein the plurality of feature quantities includes geological information.
  8.  The learning data generation method according to claim 7, wherein the geological information includes any one of a manner of laterally bending strata, a manner of inserting an unconformity surface, a size (width and depth) of the structure for modeling the underground structure, insertion of a low-velocity layer into a surface layer, or a distribution of layer thicknesses.
  9.  The learning data generation method according to any one of claims 1 to 8, wherein the plurality of feature quantities is determined based on ranges of values of the feature quantities and random numbers.
  10.  The learning data generation method according to any one of claims 1 to 9, wherein
     the structure is an underground structure, and
     the ranges of the values of the feature quantities are set based on geological information on a target area.
  11.  The learning data generation method according to claim 10, wherein the geological information includes at least one of well log data on the target area, observation data on an underground structure in the target area, and a spatial distribution of the feature quantities inferred for the target area.
  12.  The learning data generation method according to any one of claims 1 to 11, wherein
     the structure is an underground structure, and
     the simulated data includes shot data of the underground structure.
  13.  The learning data generation method according to any one of claims 1 to 12, wherein
     improved data is generated by inputting the simulated data to an improver trained based on observation data acquired by observing the structure, the simulated data, and a loss function weighted according to magnitudes of data values in the simulated data, and
     the learning data is generated by associating the model of the structure with the improved data.
  14.  A model generation method comprising generating an estimation model that estimates information about a structure, using learning data generated by the learning data generation method according to any one of claims 1 to 13.
  15.  A learning data generation device comprising:
     a model generation unit that generates a model of a structure based on a plurality of feature quantities related to the structure;
     a simulated data generation unit that generates, by a wave propagation simulation on the model of the structure, simulated data that simulates observed values related to the structure; and
     a learning data generation unit that generates learning data by associating the model of the structure with the simulated data.
  16.  The learning data generation device according to claim 15, further comprising
     a display unit that displays two indicators for changing an upper limit value and a lower limit value of the plurality of feature quantities, wherein
     the model generation unit generates the model of the structure using a representative value of a range specified by the two indicators.
PCT/JP2022/000607 2021-01-15 2022-01-11 Learning data generation method, model generation method, and learning data generation device WO2022153984A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
JP2021-005379 2021-01-15
JP2021005379 2021-01-15
JP2021-095482 2021-06-07
JP2021095482 2021-06-07
JP2021170348 2021-10-18
JP2021-170348 2021-10-18

Publications (1)

Publication Number Publication Date
WO2022153984A1 true WO2022153984A1 (en) 2022-07-21

Family

ID=82446715

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/000607 WO2022153984A1 (en) 2021-01-15 2022-01-11 Learning data generation method, model generation method, and learning data generation device

Country Status (1)

Country Link
WO (1) WO2022153984A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7406664B1 (en) 2023-03-31 2023-12-27 住友化学株式会社 Learning model generation method, information processing device, computer program, material selection method, and simulation experiment value generation method
WO2024195108A1 (en) * 2023-03-23 2024-09-26 東京エレクトロン株式会社 Trained model generation method, information processing method, computer program, and information processing device
US12153179B2 (en) 2022-10-21 2024-11-26 Saudi Arabian Oil Company Method and system for determining attenuated seismic time

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018138880A1 (en) * 2017-01-27 2018-08-02 三菱日立パワーシステムズ株式会社 Model parameter value estimation device and estimation method, program, recording medium with program recorded thereto, and model parameter value estimation system
US20200088897A1 (en) * 2018-09-14 2020-03-19 Bp Corporation North America Inc. Machine Learning-Based Analysis of Seismic Attributes


Similar Documents

Publication Publication Date Title
WO2022153984A1 (en) Learning data generation method, model generation method, and learning data generation device
EP3894907B1 (en) Machine learning-augmented geophysical inversion
US20210264262A1 (en) Physics-constrained deep learning joint inversion
EP2975437B1 (en) Method and system to invert for fault activity and tectonic stress
EA032008B1 (en) Two stage seismic velocity model generation
CA2871243C (en) Processing data representing a physical system
CN103913774B (en) Reservoir geology mechanics parameter inversion method based on micro-seismic event
Ely et al. Assessing uncertainties in velocity models and images with a fast nonlinear uncertainty quantification method
AU2008222299A1 (en) Model-based time-preserving tomography
Wang et al. Seismic velocity inversion transformer
CN113740901B (en) Land seismic data full-waveform inversion method and device based on complex undulating surface
CN108508481B (en) A kind of method, apparatus and system of longitudinal wave converted wave seismic data time match
CN110954950B (en) Underground transverse wave velocity inversion method, device, computing equipment and storage medium
Lamert et al. Imaging disturbance zones ahead of a tunnel by elastic full-waveform inversion: Adjoint gradient based inversion vs. parameter space reduction using a level-set method
CN113156493A (en) Time-frequency domain full-waveform inversion method and device using normalized seismic source
Kamath et al. Facies-constrained FWI: Toward application to reservoir characterization
US11965998B2 (en) Training a machine learning system using hard and soft constraints
CN117706623A (en) A microseismic velocity inversion method driven by deep learning and wave equations
Feng et al. Physically realistic training data construction for data-driven full-waveform inversion and traveltime tomography
Sun et al. Enabling uncertainty quantification in a standard full-waveform inversion method using normalizing flows
US20240037819A1 (en) Predictive geological drawing system and method
Waheed The emergence and impact of scientific machine learning in geophysical exploration
CN114966838B (en) Seismic intelligent inversion method and system driven by geological modeling and seismic forward modeling
Jiang et al. Deep Learning Based Optimization Framework for Full-waveform Inversion in Tunnels
Gan et al. Deep learning-based dispersion spectrum inversion for surface wave exploration

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22739390

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22739390

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP