
CN109509156B - Image defogging method based on a generative adversarial model - Google Patents

Image defogging method based on a generative adversarial model Download PDF

Info

Publication number
CN109509156B
Authority
CN
China
Prior art keywords
image
fog
network
layer
free
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811289748.2A
Other languages
Chinese (zh)
Other versions
CN109509156A (en)
Inventor
郑军
李俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Matrixtime Robotics Shanghai Co ltd
Original Assignee
Matrixtime Robotics Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matrixtime Robotics Shanghai Co ltd filed Critical Matrixtime Robotics Shanghai Co ltd
Priority to CN201811289748.2A
Publication of CN109509156A
Application granted
Publication of CN109509156B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 - Image enhancement or restoration
    • G06T 5/73 - Deblurring; Sharpening
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20084 - Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an image defogging method based on a generative adversarial model, which directly converts a foggy image into a fog-free image through a trained generative adversarial model. The model comprises a generator network and a discriminator network. The generator network produces the restored fog-free image and comprises: a first-layer subnet for extracting a global feature map; and a second-layer subnet comprising 5 sequentially connected convolutional layers, wherein the input of the second convolutional layer is connected to both the output of the preceding convolutional layer and the output of the first-layer subnet, so as to fuse the global feature map extracted by the first-layer subnet and obtain the restored fog-free image. The discriminator network judges whether the restored fog-free image output by the generator is a real fog-free image. Compared with the prior art, the invention has the advantages of a good defogging effect and a simple process.

Description

Image defogging method based on a generative adversarial model
Technical Field
The invention relates to an image processing method, and in particular to an image defogging method based on a generative adversarial model.
Background
Fog and haze are common atmospheric phenomena. In foggy weather the atmosphere contains many fine suspended particles of a certain size that absorb light reflected from the target object or scene; the light scattered by these particles mixes with the object's reflected light before entering the camera, degrading imaging clarity to different degrees. The resulting blur and noise hamper outdoor photography in foggy weather and cause great difficulty for computer-vision algorithms such as target recognition and tracking, scene segmentation, and automatic driving.
With the advancement of image processing technology, image defogging has progressed greatly in recent years. Existing defogging algorithms fall mainly into two classes. The first class relies on manually defined image features, clustering, and statistics to estimate the transmittance and atmospheric light intensity of an atmospheric scattering model, and then recovers the defogged image by inverting the model. Its restoration quality is mediocre: with hand-crafted features and traditional image processing, the estimates of transmittance and atmospheric light are inaccurate and fragile, often producing color distortion and heavy image noise. The second class trains a deep convolutional network on a large sample set so that the network learns to estimate the transmittance and atmospheric light intensity, after which the fog-free image is solved from the model formula. The advantage of the deep-learning approach is that feature extraction need not be defined manually; the network learns which features to extract, and the restoration is more accurate and more general than with traditional image processing. However, current deep-learning defogging methods still do not realize end-to-end image defogging well.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and to provide an image defogging method based on a generative adversarial model.
The purpose of the invention can be realized by the following technical scheme:
An image defogging method based on a generative adversarial model, which directly converts a foggy image into a fog-free image through a trained generative adversarial model, wherein the model comprises a generator network and a discriminator network, and the generator network, used for producing the restored fog-free image, comprises:
a first-layer subnet for extracting a global feature map;
a second-layer subnet comprising 5 sequentially connected convolutional layers, wherein the input of the second convolutional layer is connected to both the output of the preceding convolutional layer and the output of the first-layer subnet, so as to fuse the global feature map extracted by the first-layer subnet and obtain the restored fog-free image;
the discriminator network is used for judging whether the restored fog-free image output by the generator network is a real fog-free image.
Further, the first-layer subnet comprises 4 convolution modules, 4 pooling layers and two fully connected layers; each pooling layer is connected after one convolution module, and the two fully connected layers follow the last pooling layer.
Further, each convolution module of the first-layer subnet comprises two consecutive convolutional layers.
Further, in the generative adversarial model, a ReLU nonlinear activation layer follows each convolutional layer.
Further, the discriminator network comprises 4 convolutional layers, 1 fully connected layer and 1 Sigmoid activation layer.
Further, the sample database used to train the generative adversarial model is generated as follows:
a fog-free image set is obtained, fog is synthesized onto the fog-free images to generate foggy images under different illumination intensities and fog concentrations, yielding a foggy image set; the fog-free image set and the corresponding foggy image set form the sample database.
Further, the fogging function used for fog synthesis is:
G(I) = I*T(I) + a*(1 - T(I))
where I is the original fog-free RGB image, T(I) = {t(p) | p is any pixel of image I} is the transmittance map, t(p) is the transmittance at pixel p, a is the atmospheric light intensity, and G(I) is the generated foggy image.
Further, during training of the generative adversarial model the network parameters are updated with an image recovery cost function L, expressed as:
L = E_GT + E_D
E_GT = |I' - I|
E_D = min max (log(D(I)) + log(1 - D(I')))
where E_GT is the difference cost obtained by comparing the restored fog-free image with the real fog-free image, I is the real fog-free image, I' is the restored fog-free image, E_D is the adversarial term that discriminates whether the restored image is foggy or fog-free, and D is the discriminator network transform.
Compared with the prior art, the invention has the following beneficial effects:
1) The invention uses a generative adversarial model to realize end-to-end image defogging: the discriminator judges the image restored by the generator, no intermediate parameters need to be estimated, and a good defogging effect is obtained.
2) In the generative adversarial model a ReLU nonlinear activation layer follows each convolutional layer, giving the whole network the capacity to approximate high-order nonlinear functions.
3) The adversarial model is trained with the defined recovery cost function until convergence, so the resulting generative adversarial model has high accuracy.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of the network structure of the generative adversarial model of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. The present embodiment is implemented on the premise of the technical solution of the present invention, and a detailed implementation manner and a specific operation process are given, but the scope of the present invention is not limited to the following embodiments.
The invention provides an image defogging method based on a generative adversarial model: a trained generative adversarial model converts a foggy image directly into a fog-free image, no intermediate parameters need to be estimated, and a good defogging effect is obtained.
As shown in fig. 1, the method comprises the following specific steps:
step S101, a sample database is obtained.
First, a fog-free image set is obtained. Fog of different concentrations is synthesized onto the fog-free images based on an atmospheric scattering model, with data augmentation, to generate a corresponding foggy image set; the fog-free image set and the foggy image set form the sample database.
The fogging function used for fog synthesis is:
G(I) = I*T(I) + a*(1 - T(I))
where I is the original fog-free RGB image, f(I) denotes a random augmentation function applied to the image, T(I) = {t(p) | p is any pixel of image I} is the transmittance map of the image, t(p) is the transmittance at pixel p, a is the atmospheric light intensity, and G(I) is the generated foggy image. t(p) is estimated from the pixel depth d(p) as t(p) = e^(-β·d(p)), with β and a drawn randomly from (0.8, 1.2) and (0.6, 1), respectively.
In this embodiment the NYU Depth V2 dataset is used; it contains clear fog-free RGB images together with registered depth images, which serve as the fog-free image set of the training samples. Fog synthesis is applied to the fog-free images of NYU Depth V2 to generate foggy images under different illumination intensities and fog concentrations, which serve as the foggy image set of the training samples. The foggy/fog-free image pairs are cropped and scaled to a size of 128 x 128.
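The fog-synthesis step above can be sketched as follows. This is a minimal illustration under the stated model, with t(p) = e^(-β·d(p)), β drawn from (0.8, 1.2) and a from (0.6, 1); the function name and the toy depth map are illustrative, not from the patent.

```python
import numpy as np

def synthesize_fog(image, depth, rng=None):
    """Synthesize a foggy image from a clear RGB image and its depth map.

    Follows the atmospheric scattering model described above:
        t(p) = exp(-beta * d(p))
        G(I) = I * T(I) + a * (1 - T(I))
    beta (fog density) and a (atmospheric light) are sampled from the
    ranges given in the text: beta in (0.8, 1.2), a in (0.6, 1.0).
    """
    rng = rng or np.random.default_rng()
    beta = rng.uniform(0.8, 1.2)      # fog density
    a = rng.uniform(0.6, 1.0)         # atmospheric light intensity
    t = np.exp(-beta * depth)         # per-pixel transmittance t(p)
    t = t[..., None]                  # broadcast over the RGB channels
    return image * t + a * (1.0 - t)

# Example: a 128x128 clear image and a synthetic depth ramp
clear = np.random.rand(128, 128, 3)
depth = np.linspace(0.0, 5.0, 128 * 128).reshape(128, 128)
foggy = synthesize_fog(clear, depth)
```

Because the output is a convex combination of the clear image and the atmospheric light, pixels at zero depth are unchanged while distant pixels approach the atmospheric light value a.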
Step S102, constructing the generative adversarial model.
As shown in fig. 2, the network structure of the generative adversarial model of the invention comprises two parts: a generator network and a discriminator network. The generator is responsible for producing the fog-free image corresponding to the content of the input foggy image, and the discriminator is responsible for judging whether the fog-free image output by the generator is a real fog-free image.
The generator network is divided into two subnets. The first-layer subnet comprises 4 convolution modules, 4 pooling layers and two fully connected layers; each of the 4 pooling layers is connected after one of the 4 convolution modules and performs maximum-value downsampling of the feature map. The second-layer subnet contains 5 convolutional layers. The input of the first subnet is the 128 x 128 foggy image and its output is a 32 x 32 feature map with 8 channels; this map is concatenated with the feature map output by the first convolutional layer of the second subnet and fed into the second convolutional layer. The input of the second subnet is the 128 x 128 foggy image; it keeps this resolution throughout and outputs the restored 128 x 128 fog-free image.
Each convolution module of the first-layer subnet comprises 2 convolutional layers with 3 x 3 kernels; the numbers of kernels per layer in the 4 modules are 128, 256, 256 and 512, respectively. The one-dimensional vector output by the fully connected layers is rearranged into a 32 x 32 x 8 feature map and passed to the second-layer subnet. The convolutional layers of the second-layer subnet each use 5 x 5 kernels with 64 channels.
In this embodiment, a ReLU nonlinear activation layer may also be added after each convolutional layer of the generator network, so that the whole network can approximate high-order nonlinear functions.
The discriminator network comprises 4 convolutional layers, 1 fully connected layer and 1 Sigmoid activation layer; a Leaky ReLU nonlinear activation layer follows each convolutional layer in the discriminator, which stabilizes gradient backpropagation.
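As a sanity check on the architecture just described, the following shape trace walks a 128 x 128 input through the first-layer subnet. The 'same' convolution padding and the 2 x 2 pooling stride are assumptions; the patent only states that pooling performs maximum-value downsampling.

```python
def first_subnet_shapes(h=128, w=128):
    """Trace feature-map shapes through the first-layer subnet:
    4 conv modules (two 3x3 'same' conv layers each, so spatial size is
    preserved) with 128/256/256/512 kernels, each module followed by a
    2x2 max-pooling layer, then two fully connected layers whose output
    is reshaped into the 32x32x8 global feature map."""
    shapes = []
    for channels in (128, 256, 256, 512):
        h, w = h // 2, w // 2          # 2x2 max pooling halves resolution
        shapes.append((h, w, channels))
    flat = h * w * shapes[-1][2]       # flattened input to the FC layers
    shapes.append(("fc", flat, 32 * 32 * 8))
    shapes.append((32, 32, 8))         # reshaped global feature map
    return shapes

# The second-layer subnet keeps the full 128x128 resolution through its
# five 5x5 convolutional layers; the 32x32x8 global map is concatenated
# with the output of its first convolutional layer (how the two
# resolutions are matched is not specified in the patent).
print(first_subnet_shapes())
```

After four pooling stages the 128 x 128 input is reduced to 8 x 8 x 512, so the fully connected layers map a 32768-dimensional vector to the 8192 values of the 32 x 32 x 8 global feature map.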
Step S103, training the generative adversarial model.
The learning rate and momentum parameters of the network are set, and the generative adversarial model is trained with MatConvNet until the network converges.
During training, the gradient is computed from an image recovery cost function and backpropagated to update the network parameters. The cost function L contains two terms: a difference cost E_GT = |I' - I| obtained by comparison with the real fog-free image, where I is the real fog-free image and I' is the image restored by the network; and an adversarial term E_D = min max (log(D(I)) + log(1 - D(I'))) that discriminates whether the restored image is foggy or fog-free, where D is the discriminator network transform.
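The cost computation above can be written out as follows. This is a sketch, not the patent's implementation: the min-max in E_D is the adversarial objective (maximized by the discriminator, minimized by the generator); here the expression is simply evaluated for fixed images, with a toy discriminator standing in for D and a small epsilon guarding the logarithms.

```python
import numpy as np

def recovery_cost(I, I_prime, D):
    """Image recovery cost L = E_GT + E_D from the text.

    E_GT : mean absolute difference between the restored image I'
           and the real fog-free image I.
    E_D  : the GAN term log(D(I)) + log(1 - D(I')).
    D    : discriminator, mapping an image to a probability in (0, 1).
    """
    e_gt = np.mean(np.abs(I_prime - I))
    eps = 1e-12  # numerical safety for the logarithms
    e_d = np.log(D(I) + eps) + np.log(1.0 - D(I_prime) + eps)
    return e_gt + e_d

# Toy discriminator: mean intensity as a stand-in "realness" score
D = lambda img: float(np.clip(img.mean(), 1e-6, 1 - 1e-6))
I = np.full((4, 4), 0.8)        # "real" fog-free image
I_prime = np.full((4, 4), 0.5)  # restored image
L = recovery_cost(I, I_prime, D)
```

In an actual training loop the discriminator parameters would be updated to increase E_D and the generator parameters to decrease both terms; the scalar evaluation above only illustrates how the two costs combine.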
Step S104, a foggy image is input, and the fog-free image is obtained directly with the trained generative adversarial model.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (5)

1. An image defogging method based on a generative adversarial model, characterized in that the method directly converts a foggy image into a fog-free image through a trained generative adversarial model, wherein the model comprises a generator network and a discriminator network, and the generator network, used for producing the restored fog-free image, comprises:
a first-layer subnet for extracting a global feature map;
a second-layer subnet comprising 5 sequentially connected convolutional layers, wherein the input of the second convolutional layer is connected to both the output of the preceding convolutional layer and the output of the first-layer subnet, so as to fuse the global feature map extracted by the first-layer subnet and obtain the restored fog-free image;
the discriminator network is used for judging whether the restored fog-free image output by the generator network is a real fog-free image;
the first-layer subnet comprises 4 convolution modules, 4 pooling layers and two fully connected layers; each pooling layer is connected after one convolution module, and the two fully connected layers follow the last pooling layer;
the discriminator network comprises 4 convolutional layers, 1 fully connected layer and 1 Sigmoid activation layer;
when the generative adversarial model is trained, the network parameters are updated with an image recovery cost function L, expressed as:
L = E_GT + E_D
E_GT = |I' - I|
E_D = min max (log(D(I)) + log(1 - D(I')))
wherein E_GT is the difference cost obtained by comparing the restored fog-free image with the real fog-free image, I is the real fog-free image, I' is the restored fog-free image, E_D is the adversarial term that discriminates whether the restored image is foggy or fog-free, and D is the discriminator network transform.
2. The image defogging method based on a generative adversarial model according to claim 1, characterized in that each convolution module of the first-layer subnet comprises two consecutive convolutional layers.
3. The image defogging method according to claim 1, characterized in that in the generative adversarial model a ReLU nonlinear activation layer follows each convolutional layer.
4. The image defogging method according to claim 1, characterized in that the sample database used to train the generative adversarial model is generated as follows:
a fog-free image set is obtained, fog is synthesized onto the fog-free images to generate foggy images under different illumination intensities and fog concentrations, yielding a foggy image set; the fog-free image set and the corresponding foggy image set form the sample database.
5. The image defogging method according to claim 4, characterized in that the fogging function used for fog synthesis is:
G(I) = I*T(I) + a*(1 - T(I))
where I is the original fog-free RGB image, T(I) = {t(p) | p is any pixel of image I} is the transmittance map, t(p) is the transmittance at pixel p, a is the atmospheric light intensity, and G(I) is the generated foggy image.
CN201811289748.2A 2018-10-31 2018-10-31 Image defogging method based on a generative adversarial model Active CN109509156B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811289748.2A CN109509156B (en) 2018-10-31 2018-10-31 Image defogging method based on a generative adversarial model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811289748.2A CN109509156B (en) 2018-10-31 2018-10-31 Image defogging method based on a generative adversarial model

Publications (2)

Publication Number Publication Date
CN109509156A CN109509156A (en) 2019-03-22
CN109509156B (en) 2021-02-05

Family

ID=65747290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811289748.2A Active CN109509156B (en) 2018-10-31 2018-10-31 Image defogging method based on a generative adversarial model

Country Status (1)

Country Link
CN (1) CN109509156B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110136075B (en) * 2019-04-18 2021-01-05 中国地质大学(武汉) Remote sensing image defogging method based on an edge-sharpening cycle generative adversarial network
CN110378848B (en) * 2019-07-08 2021-04-20 中南大学 Image defogging method based on a derivative-map fusion strategy
CN111639542A (en) * 2020-05-06 2020-09-08 中移雄安信息通信科技有限公司 License plate recognition method, device, equipment and medium
CN113706395A (en) * 2020-05-21 2021-11-26 无锡科美达医疗科技有限公司 Image defogging method based on an adversarial neural network
CN111882495B (en) * 2020-07-05 2022-08-02 东北林业大学 Image highlight processing method based on user-defined fuzzy logic and GAN
CN112150379A (en) * 2020-09-22 2020-12-29 武汉工程大学 Image defogging method and device based on a perception-discrimination-enhanced generative adversarial network
CN113610787B (en) * 2021-07-27 2024-11-08 广东省科技基础条件平台中心 Training method and device for image defect detection model and computer equipment
CN117575946A (en) * 2023-10-19 2024-02-20 南京诺源医疗器械有限公司 Image processing method, apparatus, electronic device, and computer-readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106127702A * 2016-06-17 2016-11-16 兰州理工大学 An image dehazing algorithm based on deep learning
CN106910175A * 2017-02-28 2017-06-30 武汉大学 A single-image defogging algorithm based on deep learning
CN107798669A * 2017-12-08 2018-03-13 北京小米移动软件有限公司 Image defogging method, device and computer-readable recording medium
CN108615226A * 2018-04-18 2018-10-02 南京信息工程大学 An image defogging method based on a generative adversarial network
CN108665432A * 2018-05-18 2018-10-16 百年金海科技有限公司 A single-image defogging method based on a generative adversarial network

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11080591B2 (en) * 2016-09-06 2021-08-03 Deepmind Technologies Limited Processing sequences using convolutional neural networks
US10510146B2 (en) * 2016-10-06 2019-12-17 Qualcomm Incorporated Neural network for image processing
CN107909621A * 2017-11-16 2018-04-13 深圳市唯特视科技有限公司 A medical image synthesis method based on twin generative adversarial networks
CN108171762B (en) * 2017-12-27 2021-10-12 河海大学常州校区 Deep learning compressed sensing same-class image rapid reconstruction system and method
CN108596026B (en) * 2018-03-16 2020-06-30 中国科学院自动化研究所 Cross-view gait recognition device and training method based on double-flow generation countermeasure network


Also Published As

Publication number Publication date
CN109509156A (en) 2019-03-22

Similar Documents

Publication Publication Date Title
CN109509156B (en) Image defogging method based on a generative adversarial model
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN109410144B (en) End-to-end image defogging processing method based on deep learning
CN106910175B (en) Single image defogging algorithm based on deep learning
US20180231871A1 (en) Depth estimation method for monocular image based on multi-scale CNN and continuous CRF
CN112102182B (en) Single image reflection removing method based on deep learning
CN110379020B (en) Laser point cloud coloring method and device based on generation countermeasure network
CN116229461A (en) Indoor scene image real-time semantic segmentation method based on multi-scale refinement
WO2023212997A1 (en) Knowledge distillation based neural network training method, device, and storage medium
CN109919073B (en) Pedestrian re-identification method with illumination robustness
CN115861380B (en) Method and device for tracking visual target of end-to-end unmanned aerial vehicle under foggy low-illumination scene
US11361534B2 (en) Method for glass detection in real scenes
CN111768415A (en) Image instance segmentation method without quantization pooling
CN108447060B (en) Foreground and background separation method based on RGB-D image and foreground and background separation device thereof
CN111582074A (en) Monitoring video leaf occlusion detection method based on scene depth information perception
CN116310098A (en) Multi-view three-dimensional reconstruction method based on attention mechanism and variable convolution depth network
CN109657538B (en) Scene segmentation method and system based on context information guidance
CN112164010A (en) Multi-scale fusion convolution neural network image defogging method
CN111008979A (en) Robust night image semantic segmentation method
Yang et al. [Retracted] A Method of Image Semantic Segmentation Based on PSPNet
CN116310095A (en) Multi-view three-dimensional reconstruction method based on deep learning
CN109859222A (en) Edge extracting method and system based on cascade neural network
CN110751271B (en) Image traceability feature characterization method based on deep neural network
Babu et al. An efficient image dehazing using Googlenet based convolution neural networks
CN110738624B (en) Area-adaptive image defogging system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant