CN111625608B - Method and system for generating electronic map according to remote sensing image based on GAN model - Google Patents
- Publication number
- CN111625608B CN202010310125.XA
- Authority
- CN
- China
- Prior art keywords
- electronic map
- loss
- model
- generator
- discriminator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/40—Filling a planar surface by adding surface attributes, e.g. colour or texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a method and a system for generating an electronic map from remote sensing images based on a GAN model. The method and the system improve the generator architecture and the loss function of the GAN model and provide an adaptive solution for color rendering and road identification in the generated electronic map. The generator of the GAN model consists of three parts: a down-sampling layer, residual blocks, and an up-sampling layer, with 6 residual blocks and 2 long skip connections. Besides an adaptive perceptual loss and an adaptive adversarial loss, the model loss function contains loss terms that optimize color rendering and road generation for the electronic map. In addition, the invention controls color rendering using binary image channels of specific terrain elements. The results show that the quality of the electronic map generated by the disclosed method and system is superior to that of existing picture translation models under both visual inspection and classical evaluation indexes: pixel-level translation accuracy is improved by 40%, and the FID evaluation value is reduced by 38%.
Description
Technical Field
The invention belongs to the field of geographic science, and particularly relates to a method, a system and an electronic device for generating an electronic map according to a remote sensing image.
Background
Picture translation means inputting a picture in domain A and, after processing and conversion, outputting the corresponding picture in domain B. Many problems in computer vision can be cast as picture translation; for example, super-resolution can be regarded as the conversion from a low-resolution picture to a high-resolution picture, and picture colorization as the conversion from a single-channel gray-scale image to a multi-channel color image. Although convolutional neural networks perform well on artistic picture translation tasks, they suffer from low quality of the generated pictures and complex loss-function design. The appearance of the GAN (generative adversarial network) model provided a new approach to picture translation: by training a discriminator to automatically learn the loss function, GANs largely resolve the problems of convolutional neural networks and generate pictures of higher quality.
In recent years, many GAN-based picture translation models have been proposed. For example: pix2pix, an improvement on CGAN in which the generator no longer generates data from random noise but reads in a given picture, is the basis of many subsequent GAN-based picture translation models; CycleGAN consists of two generators and two discriminators and adds a cycle-consistency constraint to the original GAN loss function, solving the problem of training a picture translation model without paired data sets; the VAE+GAN model addresses the same problem by adding a shared latent space; pix2pixHD can generate a color map with resolution as high as 2048 x 1024 from a semantic segmentation map by using two generators, a global generator network and a local enhancer network; TextureGAN enables texture control of the generated picture by introducing a local texture loss and a local content loss.
At present, pix2pix and CycleGAN, as general picture translation frameworks, can be applied directly to the map generation task, but the generated electronic map cannot accurately identify and render surface feature elements such as forest land, water areas and roads, and also suffers from blurred texture and low quality.
Disclosure of Invention
Aiming at the problems of blurred texture and low quality in the prior art, the technical problem to be solved by the invention is to establish a model (mapGAN) for the specific scenario of electronic map generation and to adopt several targeted optimization measures to improve the accuracy and attractiveness of the generated map.
The technical scheme adopted by the invention for solving the technical problems is as follows: a method for generating an electronic map according to remote sensing images based on a GAN model is constructed, and comprises the following steps:
s1, constructing a GAN (generative adversarial network) model, wherein the GAN model comprises a generator and a discriminator; the discriminator is used for comparing the electronic map generated by the generator with a target map sample and discriminating between them; wherein:
the generator comprises a down-sampling layer, M residual blocks and an up-sampling layer, wherein the down-sampling layer uses multilayer convolution to reduce the width and height of the feature matrix, and the up-sampling layer uses multilayer convolution with deconvolution to increase the width and height of the feature matrix;
the discriminator uses receptive field blocks of size 70 x 70 to judge the electronic map generated by the generator and outputs a matrix representing the discrimination result for each block;
s2, acquiring a training data set comprising a plurality of remote sensing images, and inputting the training data set into the GAN model constructed in the step S1 for network training; the input channel of the GAN model comprises an RGB channel of a remote sensing image and N binary image channels representing different surface feature elements, wherein each binary image uses 0-1 coding; n is greater than 1;
in the training process, in order to guide GAN model learning, the N binary-image channels are used to construct a first loss term that measures the difference in color rendering between the generated electronic map and the target electronic map over the N different surface feature elements, where the first loss term is defined as:

L_color = λ1 · Σ_{i=1}^{N} (1/S_i) · ||B̂_i − B_i||_1

wherein i denotes the ith feature element, B̂_i and B_i respectively denote the binary maps extracted for element i from the electronic map generated by the generator and from the target electronic map, S_i is the total number of pixels occupied by element i in the target electronic map, and λ1 is the weight coefficient of the first loss term;
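The per-element loss described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not part of the patent text; the function name, mask layout, and the zero-pixel guard are assumptions:

```python
import numpy as np

def color_loss(gen_masks, tgt_masks, lam1=1.0):
    """Sketch of the first (color rendering) loss term.

    gen_masks, tgt_masks: sequences of N binary 0/1 arrays, one per
    surface feature element (e.g. forest land, water area, highway),
    extracted from the generated and target electronic maps.
    """
    total = 0.0
    for b_gen, b_tgt in zip(gen_masks, tgt_masks):
        s_i = b_tgt.sum()  # pixels occupied by element i in the target map
        if s_i == 0:
            continue       # element absent from this target tile (assumption)
        total += np.abs(b_gen - b_tgt).sum() / s_i
    return lam1 * total
```

When the generated masks match the target masks exactly, the term is zero; each mismatched pixel is penalized relative to how many pixels the element occupies in the target map.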
and S3, inputting the remote sensing image to be processed into the trained GAN model to obtain a corresponding target electronic map.
The invention discloses a system for generating an electronic map according to remote sensing images based on a GAN model, which comprises the following modules:
the generative adversarial network model building module is used for building a GAN (generative adversarial network) model, wherein the GAN model comprises a generator and a discriminator; the discriminator is used for comparing the electronic map generated by the generator with a target map sample and discriminating between them; wherein:
the generator comprises a down-sampling layer, M residual blocks and an up-sampling layer, wherein the down-sampling layer uses multilayer convolution to reduce the width and height of the feature matrix, and the up-sampling layer uses multilayer convolution with deconvolution to increase the width and height of the feature matrix;
the discriminator uses receptive field blocks to judge the electronic map generated by the generator and outputs a matrix representing the discrimination result for each block;
the model training module is used for acquiring a training data set comprising a plurality of remote sensing images, and inputting the training data set into the GAN model constructed by the confrontation network model construction module for network training; the input channel of the GAN model comprises an RGB channel of a remote sensing image and N binary image channels representing different surface feature elements, wherein each binary image uses 0-1 coding; n is greater than 1;
the model training module comprises a first loss term construction module which, in order to guide GAN model learning during training, uses the N binary-image channels to construct a first loss term, defined by the following formula, that measures the difference in color rendering between the generated electronic map and the target electronic map over the N different surface feature elements:

L_color = λ1 · Σ_{i=1}^{N} (1/S_i) · ||B̂_i − B_i||_1

in the formula, i denotes the ith feature element, B̂_i and B_i respectively denote the binary maps extracted for element i from the electronic map generated by the generator and from the target electronic map, S_i is the total number of pixels occupied by element i in the target electronic map, and λ1 is the weight coefficient of the first loss term;
and the target electronic map generation module is used for inputting the remote sensing image to be processed into the trained GAN model to obtain a corresponding target electronic map.
The method and the system for generating an electronic map from remote sensing images based on a GAN model have the following beneficial effects:
1. 6 residual blocks are added to the generator of the GAN model; appropriately increasing the network depth improves model performance without introducing gradient propagation problems;
2. besides an adaptive perceptual loss and an adaptive adversarial loss, the model loss function contains loss terms that optimize color rendering and road generation for the electronic map.
Drawings
The invention will be further described with reference to the following drawings and examples, in which:
FIG. 1 is a graph of the results of partial testing of pix2pix and mapGAN models on a Facades data set;
fig. 2 is a partial test diagram of the mapGAN model in the task of picture translation where the photo scene environment is converted from day to night.
FIG. 3 is a flowchart illustrating steps of a method for generating an electronic map from a remote sensing image based on a GAN model according to the present invention;
FIG. 4 is a GAN model generator architecture diagram;
FIG. 5 is a pair of training data for rendering green space into a square for aesthetics;
FIG. 6 is a model structure diagram of a method for generating an electronic map from a remote sensing image based on a GAN model according to the present invention.
Detailed Description
For a more clear understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
1. Description of the experimental conditions:
The map translation task of the invention trains on 1096 and tests on 1042 remote sensing-electronic map pairs, where the data set comes from satellite and electronic map tiles on Google Maps and the picture size is 600 x 600 pixels.
The experiment carried out by the invention is that 1 block of GPU with NVIDA M40, 4 blocks of CPU and RAM with Inter Xeon Platinum 8163@2.5GHz: 30 GiB.
2. Electronic map generation quality analysis
The present embodiment uses the improved GAN model (hereinafter the mapGAN model) for tile map generation and compares the quality of the generated map with the pix2pix and CycleGAN models on the same test set. In the training phase, 3 binary image channels were extracted using the OpenCV library. Table 1 shows the evaluation results of the map generation quality of mapGAN and the other models under different evaluation indexes, where the sample data set for computing the true feature distribution of electronic maps contains 2000 electronic maps and the sample data set for computing the feature distribution of model-generated electronic maps contains 1000 electronic maps. The results show that the map generation results of the mapGAN model are better than those of the pix2pix and CycleGAN models in pixel-level translation accuracy, Kernel MMD and FID evaluation.
Table 1 evaluation results of models under different evaluation indexes
3. Model expansibility analysis
Although the mapGAN model was proposed to solve the specific application problem of accurately generating an electronic map from a remote sensing image, it can serve as a general image translation model once the binary-image channel inputs and the related loss terms are removed. To further explore the scalability of the model, the mapGAN model of this embodiment was modified as follows:
(1) Canceling the binary map inputs related to expressways, water areas and forest land;
(2) Canceling the use of the L_road, L_color and L_s loss terms;
(3) Replacing the feature loss calculated with L_f_vgg by a comparison between the picture generated by the generator and the target picture. The remaining model settings are unchanged. The invention performs mapGAN scalability tests on two different picture translation tasks.
Translating semantic labels to generate photos: using the CMP Facades dataset, containing 400 training samples, the translation task converts building-facade semantic segmentation maps into building-facade photos. The experiment trained and tested both pix2pix and mapGAN on the Facades dataset, and the results are shown in FIG. 1. Intuitively, the translation result of mapGAN on this task is not much different from pix2pix. For a more precise comparison, the Kernel MMD index was further used: the evaluation of pix2pix was 0.13 and that of mapGAN was 0.11, a 15.4% reduction.
Converting the photo scene from day to night: the data set comes from the outdoor scene data set used by P-Y. Laffont; the translation task is to input a picture taken at a place in the daytime and output a picture of the same place at night. The results of the experiment are shown in FIG. 2. They show that mapGAN can effectively separate the ground features and the environment in the input picture and correctly learn the night-time appearance of various ground features. The pix2pix model was also trained and tested on the same dataset and compared against mapGAN under the Kernel MMD criterion; the result for pix2pix was 0.12 and for mapGAN 0.08, a 33.3% reduction.
Example 1:
the following will explain in detail the steps of a method for generating an electronic map from a remote sensing image based on a GAN model, specifically refer to fig. 3:
the method comprises the following 3 steps:
step S1:
constructing a GAN (generative adversarial network) model, wherein the GAN model comprises a generator and a discriminator; the discriminator is used for comparing the electronic map generated by the generator with a target map sample and discriminating between them; wherein:
the generator comprises a down-sampling layer which adopts multilayer convolution to reduce the width and height of the characteristic matrix, M residual blocks and an up-sampling layer which adopts multilayer convolution and utilizes a deconvolution technology to improve the width and height of the characteristic matrix (please refer to FIG. 4 for the structure of a generator network model);
the discriminator comprises a receptive field block for judging the electronic map generated by the generator and outputting a matrix representing a discrimination result;
In this step, the GAN model includes a generative model (the generator) and a discriminative model (the discriminator). The discriminative model determines whether a given picture is a real picture (one taken from the data set); the task of the generative model is to create pictures that look real, i.e. to generate, through the model, pictures that closely resemble the desired ones.
At the beginning, neither model is trained. The two models are trained adversarially together: the generative model generates a picture to deceive the discriminative model, and the discriminative model judges whether that picture is real or fake. In the course of this training both models become stronger and stronger and finally reach a steady state.
Under the current embodiment, the following 2 improvements are made to the generator:
1. in the down-sampling layer, up-sampling layer and residual blocks of the generator, the 7 x 7 convolution kernel is canceled and 3 convolution kernels of 3 x 3 are adopted instead; the purpose of reducing the receptive field of the convolution kernel is to enhance the sensitivity of the generator to the detailed features of each picture block and to reduce the number of training parameters.
2. The number of residual blocks is 6: in the generator, 6 residual blocks are added between the down-sampling layer and the up-sampling layer, each consisting of 2 convolution layers. Appropriately increasing the network depth in this way improves model performance without introducing gradient propagation problems.
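The benefit of residual blocks claimed above can be illustrated with a toy calculation (an illustrative sketch, not from the patent): through n stacked blocks whose learned branch scales gradients by a factor w, a plain stack multiplies the gradient by w per block, while a residual block y = x + F(x) contributes a factor of 1 + w, so the gradient cannot vanish as depth grows.

```python
def grad_scale(n_blocks, w, residual):
    """Toy gradient magnitude after n stacked blocks.

    w: per-block Jacobian scale of the learned branch F (assumption: a
    single scalar stands in for the full Jacobian).
    plain stack:    gradient ~ w ** n        (vanishes for w < 1)
    residual stack: gradient ~ (1 + w) ** n  (never below 1 for w >= 0)
    """
    g = 1.0
    for _ in range(n_blocks):
        g = g * (1.0 + w) if residual else g * w
    return g
```

With 6 blocks and w = 0.5, the plain stack scales gradients by about 0.016 while the residual stack keeps them above 1, which is the "no gradient propagation problem" the text refers to.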
In the present embodiment, the following improvements are made to the discriminator:
in the present embodiment, the discriminator uses the segments of the receptive field with the size of 70 × 70 to judge the electronic map generated by the generator. In the convolutional neural network, a Receptive Field (Receptive Field) refers to a region of an input image that can be seen by a certain point on a feature map, that is, a point on the feature map is obtained by calculating the size region of the Receptive Field in the input image.
In summary, the key point of step S1 is the structural design of the GAN model: by adding residual blocks and reducing the receptive field of the convolution kernels, the performance of the network model is further improved and the sensitivity of the model to the detailed features of each picture block is enhanced.
Step S2:
acquiring a training data set comprising a plurality of remote sensing images, and inputting the training data set into the GAN model constructed in the step S1 for network training; the input channel of the GAN model comprises an RGB channel of a remote sensing image and N binary image channels representing different surface feature elements, wherein each binary image uses 0-1 coding; n is greater than 1;
in the training process, in order to guide GAN model learning, a first loss term defined by formula (1) is constructed from the N binary-image channels to measure the difference in color rendering between the generated electronic map and the target electronic map over the N different surface feature elements:

L_color = λ1 · Σ_{i=1}^{N} (1/S_i) · ||B̂_i − B_i||_1 (1)

wherein i denotes the ith feature element, B̂_i and B_i respectively denote the binary maps extracted for element i from the electronic map generated by the generator and from the target electronic map, S_i is the total number of pixels occupied by element i in the target electronic map, and λ1 is the weight coefficient of the first loss term;
In the current step, learning to perform correct color rendering of the generated electronic map faces several difficulties: (1) In standard electronic map production, color rendering refers to attribute information of many aspects of the map area; for example, according to the geographic element information extracted from a geographic entity database, an expressway is rendered orange and a national or provincial road is rendered yellow, but such attribute information cannot be extracted from the remote sensing image by a neural network. (2) The model needs to take aesthetics into account when rendering the colors of the generated electronic map. For example, to increase map aesthetics, standard electronic mapping sometimes renders green space as a standard square even though it contains some non-green components, as shown in fig. 5.
To overcome difficulty (1), this embodiment borrows the generation control concept of the CGAN model and designs six input channels for the GAN model: the RGB channels of the remote sensing image and 3 binary images. The 3 binary images represent the forest land, water area and highway information respectively using 0-1 coding, and the pixel region with value 1 indicates the color rendering range of the corresponding element.
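Assembling the six-channel generator input can be sketched as follows. This is an illustrative NumPy stand-in for the OpenCV-based mask extraction mentioned in the experiments; the function names and the pure-blue "water" color thresholds are assumptions:

```python
import numpy as np

def color_mask(rgb, low, high):
    """0/1 mask of pixels whose RGB values all fall in [low, high]."""
    return np.all((rgb >= low) & (rgb <= high), axis=-1).astype(np.float32)

def build_input(rgb, masks):
    """Stack the 3 RGB channels and N binary masks into a (3+N, H, W) input."""
    chans = [rgb[..., c] for c in range(3)] + list(masks)
    return np.stack(chans, axis=0)

# illustrative: a 4x4 tile whose top-left quadrant is pure-blue 'water'
tile = np.zeros((4, 4, 3), dtype=np.float32)
tile[:2, :2] = (0.0, 0.0, 1.0)
water = color_mask(tile, low=np.array([0.0, 0.0, 0.9]),
                   high=np.array([0.1, 0.1, 1.0]))
x = build_input(tile, [water, np.zeros((4, 4)), np.zeros((4, 4))])
```

The resulting tensor has shape (6, H, W): three RGB channels plus one binary channel each for the water, forest and highway elements.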
Then, in order to reflect the color difference, the difference in color rendering between the generated electronic map and the target electronic map over the 3 elements of forest land, water area and expressway is measured by constructing the following first loss term:

L_color = λ1 · Σ_{i=1}^{3} (1/S_i) · ||B̂_i − B_i||_1 (a1)

wherein i denotes the ith feature element, B̂_i and B_i respectively denote the binary maps extracted for element i from the electronic map generated by the generator network and from the target electronic map, S_i is the total number of pixels occupied by element i in the target electronic map, and λ1 is the weight coefficient of the first loss term.
The above considers the magnitude of the color rendering difference; however, correctly identifying and displaying the roads in the remote sensing image is an important task of electronic map generation and is the basis for applications such as navigation. For road generation, the model must not only learn to connect roads occluded by scattered trees to ensure road continuity, but also generate straight and smooth road boundaries. To achieve both, this embodiment constructs a second loss term, defined by formula (a2), that increases the constraints on the generated electronic map in terms of road continuity and smoothness:

L_road = λ2 · (1/S_road) · ||B̂_road − B_road||_1 (a2)

in the formula, B̂_road and B_road respectively denote the binary maps extracted for the road element from the electronic map generated by the generator network and from the target electronic map, S_road is the total number of pixels occupied by roads in the target electronic map, and λ2 is the loss term weight.
Adversarial training is widely used in picture translation models: by automatically learning the loss function with a trainable discriminator during training, it can guide the generator to generate more realistic pictures.
However, the original GAN loss function has a problem: when the generator is updated, if a generated fake sample is far from the decision boundary yet still on the real-sample side, the original sigmoid cross-entropy loss causes the gradient to vanish. Therefore, this embodiment adopts a least-squares loss function to solve the problem of premature gradient vanishing, and combines it with the classical L1 loss function to further improve the stability of mapGAN training.
Based on formulas (a1)-(a2), and in order to improve the stability of mapGAN training, the final loss function of the GAN model in this embodiment, i.e. the third loss term, is defined by formula (a3):
L(D) = min L_adv(D)
L(G) = min(L_adv(G) + L_color + L_road + L_p); (a3)
in the formula, L(D) denotes the loss function of the discriminator and L(G) denotes the loss function of the generator.
Wherein the adaptive adversarial loss term L_adv is defined by formulas (a4)-(a5) as:

L_adv(D) = E_{x~p_data(x)}[(D(x, c, y) − 1)^2] + E_{y~p_data(y)}[D(G(y, c), c, y)^2] (a4)
L_adv(G) = λ6 · E_{y~p_data(y)}[(D(G(y, c), c, y) − 1)^2] + λ7 · L_1(G(y, c), x) (a5)

in the formulas, L_adv(D) is the adaptive adversarial loss function of the discriminator and L_adv(G) is the adaptive adversarial loss function of the generator; L_1 is the L1 loss function; λ6 and λ7 are the weight coefficients of the corresponding loss terms; x denotes the target electronic map, c denotes all binary-image channels in the generator input, y denotes the remote sensing image input to the GAN model, and p_data(y) and p_data(x) denote the probability distributions obeyed by y and x respectively; G(y, c) denotes the electronic map generated by the generator with y and c as input; D(G(y, c), c, y) denotes the discrimination probability output by the discriminator with G(y, c), c and y as input; D(x, c, y) denotes the discrimination probability output by the discriminator with x, c and y as input.
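The least-squares adversarial objectives can be sketched numerically as below. This is an illustrative sketch: the discriminator output is taken to be a plain array of per-patch scores, and the λ values are placeholder assumptions, not the patent's settings.

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: push scores on real patches
    toward 1 and scores on generated patches toward 0."""
    return np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake, fake, real, lam6=1.0, lam7=100.0):
    """Generator loss: fool the discriminator (push its scores on the
    generated map toward 1) plus a weighted L1 term toward the target."""
    return lam6 * np.mean((d_fake - 1.0) ** 2) + lam7 * np.mean(np.abs(fake - real))
```

Unlike the sigmoid cross-entropy loss, the squared penalty keeps a nonzero gradient for confidently misplaced samples, which is the gradient-vanishing fix described above.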
In summary, the key point of step S2 is the design of the final loss function of the GAN model, the third loss term: it accounts for the difference in color rendering between the generated and target electronic maps over the N different feature elements and for the constraints on road continuity and smoothness, and a trainable discriminator automatically learns this loss during training to guide the generator to produce more realistic pictures, thereby improving the stability of mapGAN training.
And step S3:
based on the two preceding steps, the remote sensing image to be processed can now be input into the trained GAN model to obtain the corresponding target electronic map.
Example 2:
in order to help the model reuse, during the up-sampling stage of electronic map construction, the features extracted during down-sampling, this embodiment adds two long skip connections between the down-sampling layer and the up-sampling layer, achieving cross-layer information transfer. Please refer to fig. 4.
Example 3:
the map generation discussed in the present invention can be regarded as a special case of style transfer. In the scenario of the invention, the content picture is a remote sensing image, the style picture is an electronic map, and the goal is to preserve certain content of the remote sensing image while converting it into a picture with the style of a web electronic map.

The loss function for training a style transfer model is generally designed with two terms: one measures the content similarity between the input content picture and the output composite picture, and the other measures, via the Gram matrix, the style similarity between the input style picture and the output composite picture.
The invention makes the following improvements on this basis: (1) a feature loss term composed of the features extracted by the discriminator is added; (2) the vgg-19 layers at which the feature loss is established are selected by an a priori testing method. In this embodiment, the adaptive perceptual loss function L_p is defined as:

L_p = L_f_d + L_f_vgg + L_s;

where L_f_d denotes the feature loss of the discriminator, L_f_vgg denotes the feature loss of the vgg-19 feature-extraction-and-matching structure in the generator, and L_s denotes the generator style loss term constructed from the generated electronic map and the target electronic map. These loss terms are defined by:

L_f_d = λ_3 · Σ_t (1/n_t) · ||D_t(s,c) − D_t(x,c)||;

L_f_vgg = λ_4 · Σ_t (1/n_t) · ||V_t(s) − V_t(x)||;

L_s = λ_5 · (Gram(s) − Gram(x))^2;

In the three equations above, λ_3, λ_4, λ_5 denote the weight coefficients of the respective loss terms; t denotes the network layer index and n_t the number of features in the t-th layer; D denotes the discriminator, s the generated electronic map, x the target electronic map, and c all binary-image channels in the generator input; D_t(*) denotes the features extracted by the discriminator at the t-th layer with '*' as input; V denotes the vgg-19 model employed in the generator, and V_t(*) denotes the picture content features extracted by the vgg-19 model at the t-th layer with '*' as input. The style loss term L_s expresses the style features of the generated picture using a Gram matrix, defined by:

Gram_t = Σ_{j,k} ⟨F_t^j, F_t^k⟩;

where F_t^j and F_t^k respectively denote the j-th and k-th feature matrices in the t-th network layer of the vgg-19 model, and the remaining variables have the same meaning as above. The Gram matrix is thus the sum of the inner products of any two feature matrices in a given network layer.
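The Gram-matrix style term above can be sketched as follows; the flattened inner-product formulation is the standard one for style features, while the array shapes and function names are assumptions:

```python
import numpy as np

def gram(features):
    """Pairwise inner products of a layer's feature matrices.

    features: array of shape (n, H, W) -- n feature matrices of one layer.
    Returns the n x n matrix G[j, k] = <F_j, F_k>; summing its entries
    gives the 'sum of inner products of any two feature matrices'.
    """
    n = features.shape[0]
    flat = features.reshape(n, -1)   # each feature matrix as a vector
    return flat @ flat.T             # all pairwise inner products

def style_loss(feat_generated, feat_target, lambda5=1.0):
    """lambda_5 times the squared difference of the two Gram matrices."""
    diff = gram(feat_generated) - gram(feat_target)
    return lambda5 * np.sum(diff ** 2)
```

In practice `features` would come from a chosen vgg-19 layer for both the generated and the target electronic map.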
When the method for generating an electronic map from a remote sensing image based on a GAN model disclosed in the present invention is applied to a system, the system structure is shown in fig. 6.

The invention discloses a system for generating an electronic map from remote sensing images based on a GAN model, comprising a generative adversarial network model construction module L1, a model training module L2, and a target electronic map generation module L3, wherein:

the generative adversarial network model construction module L1 stores an execution program for performing step S1 disclosed in embodiments 1 to 3, specifically for constructing the GAN model;

the model training module L2 stores an execution program for performing step S2 disclosed in embodiments 1 to 3; during training it constructs the first loss function via the first loss term construction module L21, the second loss function via the second loss term construction module L22, and the final loss function of the GAN model via the third loss term construction module L23, all within the model training module L2;

the target electronic map generation module L3 stores an execution program for performing step S3 disclosed in embodiment 1, specifically for generating the final target electronic map.
The above describes the implementation method, the execution flow when it is applied to the system, and the system configuration. The implementation method is not limited to the experimental conditions and environment disclosed in the embodiments; the data source and workstation may be adapted to achieve a better implementation effect.
In summary, in the method and system for generating an electronic map from remote sensing images based on a GAN model, 6 residual blocks are added to the generator of the GAN model, so that the network depth can be increased appropriately, improving model performance without introducing gradient-propagation problems. During training, the loss function of the model comprises an adaptive perceptual loss term and an adaptive adversarial loss term, as well as loss terms that optimize the color rendering and road generation of the generated electronic map: by distinguishing the difference in color rendering between the generated and target electronic maps over N different feature elements, and by constraining the generated electronic map's road continuity and smoothness, a trainable discriminator automatically learns the loss function during training to guide the generator to produce more realistic pictures.
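The residual blocks mentioned above follow the standard identity-shortcut pattern; a minimal sketch is given here, with the inner transform as a placeholder for the block's two 3×3 convolutions:

```python
import numpy as np

def residual_block(x, transform):
    """y = x + F(x): the identity shortcut lets gradients bypass F, so
    stacking several such blocks deepens the network without hampering
    gradient propagation."""
    return x + transform(x)

def stack(x, n_blocks=6, transform=lambda t: 0.1 * t):
    """Chain n_blocks residual blocks, as the generator does with its 6."""
    for _ in range(n_blocks):
        x = residual_block(x, transform)
    return x
```

The toy `transform` is an assumption for illustration; in the generator each block would apply two convolutions with nonlinearities.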
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (10)
1. A method for generating an electronic map according to remote sensing images based on a GAN model, characterized by comprising the following steps:

S1, constructing a GAN (generative adversarial network) model, wherein the GAN model comprises a generator and a discriminator; the discriminator is used for comparing and discriminating the electronic map generated by the generator against a target map sample; wherein:
the generator comprises a down-sampling layer, M residual blocks and an up-sampling layer, wherein the down-sampling layer uses multilayer convolution to reduce the width and height of the feature matrix, and the up-sampling layer uses multilayer convolution with deconvolution technology to increase the width and height of the feature matrix;
the discriminator comprises a receptive field block for judging the electronic map generated by the generator and outputting a matrix representing the discrimination result;
s2, acquiring a training data set comprising a plurality of remote sensing images, and inputting the training data set into the GAN model constructed in the step S1 for network training; the input channel of the GAN model comprises an RGB channel of a remote sensing image and N binary image channels representing different surface feature elements, wherein each binary image uses 0-1 coding; n is greater than 1;
in the training process, in order to promote GAN model learning, a first loss term defined by formula (1) is constructed from the N binary-image channels to judge the difference in color rendering between the generated electronic map and the target electronic map over the N different feature elements:

L_color = λ_1 · Σ_{i=1}^{N} (1/n_i) · ||B_i^s − B_i^x||;  (1)

wherein i denotes the i-th feature element; B_i^s and B_i^x respectively denote the binary maps extracted for element i from the electronic map generated by the generator and from the target electronic map; n_i denotes the total number of pixels occupied by element i in the target electronic map; λ_1 is the weight coefficient of the first loss term;
and S3, inputting the remote sensing image to be processed into the trained GAN model to obtain a corresponding target electronic map.
2. The method for generating an electronic map according to remote sensing images of claim 1, wherein in step S1, the down-sampling layer, the residual blocks and the up-sampling layer all use 3×3 convolution kernels.
3. The method for generating an electronic map according to remote sensing images of claim 2, wherein in step S1, two long skip connections are added between the down-sampling layer and the up-sampling layer of the generator, so that the features extracted by the down-sampling layer are reused during electronic map construction in the up-sampling stage, realizing cross-layer information transfer.
4. A method of generating an electronic map from remotely sensed images as recited in claim 3, wherein in step S1, the generator comprises 6 residual blocks, each residual block being composed of two convolutional layers.
5. The method for generating an electronic map according to the remote sensing image of claim 4, wherein in step S2, the input channels of the GAN model comprise 3 binary map channels respectively representing information of forest land, water area and road;
constructing a second loss term defined by formula (2) to constrain the generated electronic map in terms of road continuity and smoothness:

L_road = λ_2 · (1/n_road) · ||B_road^s − B_road^x||;  (2)

wherein B_road^s and B_road^x respectively denote the binary maps extracted for the road elements from the electronic map generated by the generator and from the target electronic map; n_road denotes the total number of pixels occupied by roads in the target electronic map; λ_2 is the weight coefficient of this loss term.
6. The method of claim 5, wherein the loss function of the GAN model further comprises an adaptive perceptual loss term L_p and an adaptive adversarial loss term L_adv; combined with formulas (1)-(2), the final loss function of the GAN model is defined by formula (3):

L(D) = min L_adv(D);

L(G) = min(L_adv(G) + L_color + L_road + L_p);  (3)

wherein L(D) denotes the loss function of the discriminator and L(G) denotes the loss function of the generator;

the adaptive adversarial loss term L_adv is defined by formulas (4)-(5):

L_adv(D) = E_{x~p_data(x)}[(D(x,c,y) - 1)^2] + E_{y~p_data(y)}[D(G(y,c),c,y)^2];  (4)

L_adv(G) = λ_6 · E_{y~p_data(y)}[(D(G(y,c),c,y) - 1)^2] + λ_7 · L_1;  (5)

wherein L_adv(D) denotes the adaptive adversarial loss function of the discriminator, L_adv(G) denotes the adaptive adversarial loss function of the generator, and L_1 is the L1 loss function; λ_6 and λ_7 are the weight coefficients of the corresponding loss terms; x denotes the target electronic map, c denotes all binary-image channels in the generator input, y denotes the remote sensing image input to the GAN model, and p_data(y) and p_data(x) denote the probability distributions obeyed by y and x, respectively; G(y,c) denotes the electronic map generated by the generator with y and c as input; D(G(y,c),c,y) denotes the discrimination probability output by the discriminator with G(y,c), c and y as input; D(x,c,y) denotes the discrimination probability output by the discriminator with x, c and y as input; E(*) denotes the average of '*'.
7. The method of claim 6, wherein the adaptive perceptual loss term L_p is defined as: L_p = L_f_d + L_f_vgg + L_s, wherein:

L_f_d denotes the feature loss of the discriminator, L_f_vgg denotes the feature loss of the vgg-19 feature-extraction-and-matching structure in the generator, and L_s denotes the generator style loss term constructed from the generated electronic map and the target electronic map; these loss terms are defined by formulas (6)-(8):

L_f_d = λ_3 · Σ_t (1/n_t) · ||D_t(s,c) − D_t(x,c)||;  (6)

L_f_vgg = λ_4 · Σ_t (1/n_t) · ||V_t(s) − V_t(x)||;  (7)

L_s = λ_5 · (Gram(s) − Gram(x))^2;  (8)

wherein λ_3, λ_4, λ_5 denote the weight coefficients of the respective loss terms; t denotes the network layer index and n_t the number of features in the t-th layer; D denotes the discriminator, s the generated electronic map, x the target electronic map, and c all binary-image channels in the generator input; D_t(*) denotes the features extracted by the discriminator at the t-th layer with '*' as input; V denotes the vgg-19 model employed in the generator, and V_t(*) denotes the picture content features extracted by the vgg-19 model with '*' as input;

the style loss term L_s expresses the style features of the generated picture using a Gram matrix, defined by formula (9):

Gram_t = Σ_{j,k} ⟨F_t^j, F_t^k⟩;  (9)

wherein F_t^j and F_t^k respectively denote the j-th and k-th feature matrices in the t-th network layer of the vgg-19 model.
8. A system for generating an electronic map according to remote sensing images based on a GAN model is characterized by comprising the following modules:
the generative adversarial network model construction module is used for constructing a GAN (generative adversarial network) model, the GAN model comprising a generator and a discriminator; the discriminator is used for comparing and discriminating the electronic map generated by the generator against a target map sample; wherein:

the generator comprises a down-sampling layer, M residual blocks and an up-sampling layer, wherein the down-sampling layer uses multilayer convolution to reduce the width and height of the feature matrix, and the up-sampling layer uses multilayer convolution with deconvolution technology to increase the width and height of the feature matrix;

the discriminator uses receptive field blocks to judge the electronic map generated by the generator and outputs a matrix representing the discrimination result of each block;
the model training module is used for acquiring a training data set comprising a plurality of remote sensing images and inputting it into the GAN model constructed by the generative adversarial network model construction module for network training; the input channels of the GAN model comprise the RGB channels of a remote sensing image and N binary-image channels representing different surface feature elements, each binary image using 0-1 coding; N is greater than 1;
the model training module comprises a first loss term construction module, which, to promote GAN model learning during training, constructs from the N binary-image channels a first loss term defined by the following formula, judging the difference in color rendering between the generated electronic map and the target electronic map over the N different feature elements:

L_color = λ_1 · Σ_{i=1}^{N} (1/n_i) · ||B_i^s − B_i^x||;

wherein i denotes the i-th feature element; B_i^s and B_i^x respectively denote the binary maps extracted for element i from the electronic map generated by the generator and from the target electronic map; n_i denotes the total number of pixels occupied by element i in the target electronic map; λ_1 is the weight coefficient of the first loss term;
and the target electronic map generation module is used for inputting the remote sensing image to be processed into the trained GAN model to obtain a corresponding target electronic map.
9. The system for generating an electronic map according to remote sensing images of claim 8, wherein the model training module further comprises a second loss term construction module;
the second loss term construction module is configured, when the input channels of the GAN model comprise 3 binary-map channels respectively representing forest land, water area and road information, to construct a second loss term that constrains the generated electronic map in terms of road continuity and smoothness, with the mathematical expression:

L_road = λ_2 · (1/n_road) · ||B_road^s − B_road^x||;

wherein B_road^s and B_road^x respectively denote the binary maps extracted for the road elements from the electronic map generated by the generator and from the target electronic map; n_road denotes the total number of pixels occupied by roads in the target electronic map; λ_2 is the weight coefficient of this loss term.
10. The system for generating an electronic map according to remote sensing images of claim 9, wherein the model training module further comprises a third loss term construction module;
the third loss term construction module is used for combining the adaptive perceptual loss term L_p, the adaptive adversarial loss term L_adv, and the first and second loss terms to construct the final loss function of the GAN model, with the mathematical expression:

L(D) = min L_adv(D);

L(G) = min(L_adv(G) + L_color + L_road + L_p);

wherein L(D) denotes the loss function of the discriminator and L(G) denotes the loss function of the generator;

wherein the adaptive adversarial loss term L_adv is defined by:

L_adv(D) = E_{x~p_data(x)}[(D(x,c,y) - 1)^2] + E_{y~p_data(y)}[D(G(y,c),c,y)^2];

L_adv(G) = λ_6 · E_{y~p_data(y)}[(D(G(y,c),c,y) - 1)^2] + λ_7 · L_1;

wherein L_adv(D) denotes the adaptive adversarial loss function of the discriminator, L_adv(G) denotes the adaptive adversarial loss function of the generator, and L_1 is the L1 loss function; λ_6 and λ_7 are the weight coefficients of the corresponding loss terms; y denotes the remote sensing image input to the GAN model, and p_data(y) and p_data(x) denote the probability distributions obeyed by y and x, respectively; G(y,c) denotes the electronic map generated by the generator with y and c as input; D(G(y,c),c,y) denotes the discrimination probability output by the discriminator with G(y,c), c and y as input; D(x,c,y) denotes the discrimination probability output by the discriminator with x, c and y as input; E(*) denotes the average of '*';

wherein the adaptive perceptual loss term L_p is defined as: L_p = L_f_d + L_f_vgg + L_s; in the formula, L_f_d denotes the feature loss of the discriminator, L_f_vgg denotes the feature loss of the vgg-19 feature-extraction-and-matching structure in the generator, and L_s denotes the generator style loss term constructed from the generated electronic map and the target electronic map; these loss terms are defined by the following formulas:

L_f_d = λ_3 · Σ_t (1/n_t) · ||D_t(s,c) − D_t(x,c)||;

L_f_vgg = λ_4 · Σ_t (1/n_t) · ||V_t(s) − V_t(x)||;

L_s = λ_5 · (Gram(s) − Gram(x))^2;

wherein λ_3, λ_4, λ_5 denote the weight coefficients of the respective loss terms; t denotes the network layer index and n_t the number of features in the t-th layer; D denotes the discriminator, s the generated electronic map, x the target electronic map, and c all binary-image channels in the generator input; D_t(*) denotes the features extracted by the discriminator at the t-th layer with '*' as input; V denotes the vgg-19 model employed in the generator, and V_t(*) denotes the picture content features extracted by the vgg-19 model with '*' as input; the style loss term L_s expresses the style features of the generated picture using a Gram matrix, defined by:

Gram_t = Σ_{j,k} ⟨F_t^j, F_t^k⟩;

wherein F_t^j and F_t^k respectively denote the j-th and k-th feature matrices in the t-th network layer of the vgg-19 model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010310125.XA CN111625608B (en) | 2020-04-20 | 2020-04-20 | Method and system for generating electronic map according to remote sensing image based on GAN model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111625608A CN111625608A (en) | 2020-09-04 |
CN111625608B true CN111625608B (en) | 2023-04-07 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108830209A (en) * | 2018-06-08 | 2018-11-16 | 西安电子科技大学 | Based on the remote sensing images method for extracting roads for generating confrontation network |
AU2020100274A4 (en) * | 2020-02-25 | 2020-03-26 | Huang, Shuying DR | A Multi-Scale Feature Fusion Network based on GANs for Haze Removal |
CN110992262A (en) * | 2019-11-26 | 2020-04-10 | 南阳理工学院 | Remote sensing image super-resolution reconstruction method based on generation countermeasure network |
Non-Patent Citations (2)

Title |
---|
Wenhao Yu et al. Automated Generalization of Facility Points-of-Interest With Service Area Delimitation. IEEE Access. 2019, vol. 7, pp. 63921-63935. * |
Gong Xi et al. High-Resolution Remote Sensing Image Scene Classification Fusing Global and Local Deep Features. Acta Optica Sinica. 2019, vol. 39, no. 3. * |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||