CN113920018A - Improved MSCNN underwater image defogging method
Improved MSCNN underwater image defogging method
- Publication number
- CN113920018A (application CN202111066603.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- mscnn
- algorithm
- defogging
- underwater
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T5/73 — Image enhancement or restoration: Deblurring; Sharpening
- G06N3/045 — Neural networks: Combinations of networks
- G06N3/08 — Neural networks: Learning methods
- G06T5/40 — Image enhancement or restoration using histogram techniques
- G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T5/90 — Dynamic range modification of images or parts thereof
- G06T7/11 — Region-based segmentation
- G06T7/90 — Determination of colour characteristics
- G06T2207/10024 — Color image
- G06T2207/20221 — Image fusion; Image merging
- G06T2207/30181 — Earth observation
Abstract
The invention proposes an improved MSCNN underwater image defogging method. The foggy image is first subjected to illumination-equalization processing based on Retinex and CLAHE, the processed image is then preprocessed with white balance correction, contrast enhancement and the like, the result is fed into a trained MSCNN model to estimate the transmittance of the image, the background light intensity is obtained with the dark channel prior method, and the defogged image is finally recovered. The method combines Retinex and CLAHE to balance the brightness and improve the contrast of the underwater image, so it is better suited to the complex underwater conditions of low illumination, uneven illumination and a pronounced Rayleigh scattering phenomenon, and it solves the prior-art problem that halos and overexposure readily appear when restoring underwater images with uneven illumination. Compared with existing methods, the underwater image defogging effect is greatly improved and all objective evaluation indexes rise.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an improved MSCNN underwater image defogging method.
Background
Underwater images are an important carrier and presentation form of underwater information and play an important role in the exploration, development and utilization of ocean resources. However, owing to the limitations of the imaging environment and equipment, underwater image quality is generally poor, with degradation such as low contrast, blurred detail and color deviation, which severely restricts the development of related fields. How to enhance and restore degraded underwater images through post-processing algorithms has therefore attracted growing research interest. In recent years, with the rapid development of deep learning, underwater image enhancement and restoration based on deep learning has made great progress.
Amid the recent wave of machine learning, many image defogging algorithms based on machine learning techniques have emerged. Tang et al. systematically studied different fog-relevant features in a learning framework to determine the best feature combination for image defogging, and recovered clear images by using random forests to regress the atmospheric transmittance of foggy images from a variety of prior features. The approach proposed by Levin et al. of generating training data by synthesis is also used by many later machine-learning-based methods. Zhu et al. proposed defogging from a single input foggy image with a color attenuation prior: the scene depth of the blurred image is modeled linearly under the new prior and the model parameters are learned by supervised learning, which recovers depth information well. Machine-learning-based image algorithms improve considerably on traditional algorithms, but compared with image reconstruction tasks such as super-resolution, single image defogging still has great room for progress in both subjective and objective evaluation, and many problems remain unoptimized and unsolved.
The atmospheric scattering model shows that the key to image defogging under active illumination is to solve correctly for the atmospheric light and the transmittance. Image defogging with deep learning can achieve fairly good results, but because the foggy training data sets of such methods are synthesized artificially, the model fitted by the network differs somewhat from real scenes; moreover, the loss function takes the form of a mean squared error, which means the network can only reduce the overall pixel-value difference and cannot guarantee region-wise similarity between the defogged and foggy images. Deep-learning-based methods therefore suffer from difficult training and from a defogging effect that depends heavily on the quality of the training data set. Meanwhile, traditional methods such as Retinex cannot process images with large sky regions effectively, and their transmittance estimates are low.
Disclosure of Invention
In view of this, the present invention provides an improved method for defogging an underwater image of MSCNN, which can effectively improve the defogging effect.
An image defogging algorithm comprising the steps of:
step 1, extracting three channels of R, G and B of an image, carrying out normalization processing, processing each channel with the CLAHE algorithm, and then synthesizing the processed three channels into one image;
step 2, extracting three channels of R, G and B of the image, processing each channel with the MSRCR algorithm of the Retinex family, and then synthesizing the processed three channels into one image;
step 3, performing weighted fusion on the images processed in the step 1 and the step 2 to obtain a fused image, and setting the fused image as an image D;
step 4, preprocessing the image D;
step 5, training the MSCNN network, and then importing the processed image D into the trained MSCNN model to estimate the transmittance t(x) of the image;
step 6, estimating the maximum illumination value of the image D by using a dark channel prior method to obtain a background light value C;
and step 7, substituting into the formula I(x) = J(x)t(x) + C[1 − t(x)], and recovering the fog-free image J(x).
Preferably, in the MSCNN network training, a CNN network capable of estimating the depth of field d(x) of an image is first used, and the transmittance of the input clear image is estimated with the formula t(x) = e^(−βd(x)), where β is the atmospheric absorption coefficient; the model then randomly generates a background light value A, synthesizes a foggy image with the formula I(x) = J(x)t(x) + A[1 − t(x)], and sends the synthesized foggy image into the MSCNN network for training.
Preferably, in step 1, an 8 × 8 rectangular block is used to divide the image into a plurality of different sub-regions,
preferably, in the step 1, the Clip-Limit is 0.01.
Preferably, in step 2, the scales adopted for the R, G and B three-channel images during fusion are Gaussian functions with σ = 20, σ = 100 and σ = 200, respectively.
Preferably, in step 3, the fused image is obtained by adjusting the weighted fusion coefficient according to the formula:
D = tA + (1 − t)B
where D represents the final fused image, A represents the image after CLAHE processing, B represents the image after MSRCR processing, and t is the fusion weighting coefficient, taken as 0.95.
Preferably, in the step 4, the preprocessing includes white balance.
Preferably, in the step 4, the preprocessing includes contrast enhancement.
Preferably, in the step 4, the preprocessing includes gamma enhancement.
Preferably, in the step 4, the gamma value used in the gamma enhancement is 0.9.
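For orientation, the seven steps above can be strung together as in the following minimal Python sketch. All helper names (clahe_rgb, msrcr, fuse, preprocess, estimate_background_light, recover) and the mscnn_model callable are illustrative assumptions, not part of the claims; concrete sketches of each step appear in the detailed description below.

```python
def defog(img, mscnn_model):
    """Illustrative end-to-end driver for steps 1-7 (helper names assumed)."""
    A = clahe_rgb(img)                               # step 1: CLAHE per channel
    B = msrcr(img)                                   # step 2: MSRCR per channel
    D = fuse(A, B, t=0.95)                           # step 3: weighted fusion
    D = preprocess(D, gamma=0.9)                     # step 4: WB / contrast / gamma
    t_map = mscnn_model(D)                           # step 5: estimate t(x)
    C = estimate_background_light(D, t_map) / 255.0  # step 6: dark channel prior
    return recover(D, t_map, C)                      # step 7: invert the imaging model
```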
The invention has the following beneficial effects:
the invention has proposed a modified MSCNN underwater image defogging method, carry on the illumination based on processing uniform picture of Retinex and CLAHE to the fogging picture first, carry on pretreatment such as white balance correction, contrast enhancement, etc. to the image after processing, then input the image into MSCNN model trained to estimate out the transmissivity of the image, use the method of dark channel prior to get the background light intensity at the same time, get the image after defogging finally; the method combines Retinex and CLAHE to carry out brightness balance and contrast improvement on the underwater image, so that the method has more advantages aiming at the complex conditions of low illumination, uneven illumination, obvious Rayleigh scattering phenomenon and the like of the underwater environment, and solves the problems that halos and overexposure are easy to generate in the restoration process of the underwater image with uneven illumination in the prior art; compared with the existing method, the method has the advantages that the defogging effect of the underwater image is greatly improved, and various objective evaluation indexes are improved.
Drawings
FIG. 1 is a general operational flow diagram of the present invention.
Fig. 2 is a flow chart of the MSCNN defogging algorithm used in the present invention.
FIG. 3(a) is an original underwater foggy image; fig. 3(b) is a defogged image obtained by using the defogging method according to the present invention.
Detailed Description
The invention is described in detail below by way of example with reference to the accompanying drawings.
1. Principle of underwater imaging
In an underwater environment, the image obtained by an imaging system consists mainly of two parts: target reflected light attenuated by absorption and scattering by water-body particles, and background light formed by water-body particle scattering. The mathematical expression is:
I_total(x,y) = D(x,y) + B(x,y) (1)
where I_total is the original image obtained by the imaging system, i.e. the underwater blurred image; D(x,y) is the intensity of light reflected by the scene surface that reaches the imaging system after water-body attenuation and forward scattering; and B(x,y) is the underwater background light intensity produced by backscattering from underwater particles.
The expression for D(x,y) is:
D(x,y) = J(x,y)·t(x,y)
where J(x,y) is the effective radiation information of the underwater target, t(x,y) = e^(−β(x,y)ρ(x,y)) is the transmission rate of light under water, ρ(x,y) is the relative distance between the light source and the imaging system, and β(x,y) is the total attenuation coefficient of the water body for the absorption and scattering of the transmitted light.
The expression for B(x,y) is:
B(x,y) = ∫_Θ B(Θ)_approx dΘ = B_∞[1 − t(x,y)] (5)
where B(Θ)_approx is the backscattering volume element function, Θ represents the set of all possible scattering angles for a given water volume element, and B_∞ is the background light intensity at infinite underwater distance.
2. Non-uniform illumination image enhancement based on Retinex and CLAHE
1) Multi-scale Retinex algorithm with color restoration
Retinex theory was first proposed by the American physicist Edwin Land and is based on the theory of color constancy. Color constancy means that human judgment of an object's color does not change with the illumination conditions. Retinex theory holds that the perceived color of an object is not influenced by external illumination but is related only to the object's reflective properties. The theory can therefore be used to remove the color cast of underwater blurred images and achieve color fidelity. Retinex theory regards an image S(x,y) as the product of an illumination component L(x,y), determined by environmental factors, and a reflection component R(x,y), determined by the intrinsic properties of the object, as follows:
S(x,y)=L(x,y)·R(x,y) (6)
the illumination component L (x, y) determines the dynamic range of the image S (x, y), and the reflection component L (x, y) reflects the reflective properties of the object itself. The essence of the Retinex algorithm is to take the image as the sum of the illumination component and the reflection component, remove the illumination component, eliminate the influence of the external conditions on the image, and obtain the reflection component, i.e. the original attribute of the object.
The principle of the multi-scale Retinex algorithm with color restoration is as follows:
R_MSRCR_k(x,y) = a_k(x,y)·R_MSR_k(x,y), with a_k(x,y) = β·log[ N·I_k(x,y) / Σ_{j=1..N} I_j(x,y) ]
where R_MSR_k(x,y) is the multi-scale Retinex output of the k-th band, a_k(x,y) represents the nonlinear adjustment coefficient, N represents the number of bands, I_k(x,y) denotes the image of the k-th band, and β denotes a gain adjustment coefficient. Compared with the general multi-scale Retinex algorithm, the color-restoring version adds the adjustment factor a_k(x,y), whose function is to adjust the R, G and B components of the image enhanced by the multi-scale Retinex algorithm according to the proportions of the R, G and B components of the original image, thereby avoiding color distortion.
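By way of illustration, a minimal NumPy/OpenCV sketch of the MSRCR step just described, using the patent's Gaussian scales σ = 20, 100, 200; the constants alpha and beta and the final min-max stretch are common MSRCR defaults assumed here, not values from the patent.

```python
import cv2
import numpy as np

def msrcr(img_bgr, sigmas=(20, 100, 200), alpha=125.0, beta=46.0):
    """Multi-scale Retinex with color restoration (illustrative sketch)."""
    img = img_bgr.astype(np.float32) + 1.0            # +1 avoids log(0)
    # multi-scale Retinex: average log-ratio of image to its Gaussian blur
    msr = np.zeros_like(img)
    for sigma in sigmas:
        blur = cv2.GaussianBlur(img, (0, 0), sigma)
        msr += np.log(img) - np.log(blur)
    msr /= len(sigmas)
    # color restoration factor a_k(x, y): rescales each band by its
    # proportion in the original image to avoid color distortion
    a = beta * (np.log(alpha * img) - np.log(img.sum(axis=2, keepdims=True)))
    out = a * msr
    # stretch the result back to a displayable 8-bit range
    out = (out - out.min()) / (out.max() - out.min() + 1e-6)
    return (out * 255).astype(np.uint8)
```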
2) CLAHE histogram equalization
The histogram of a low-illumination underwater image is narrow and concentrated at low values. CLAHE redistributes the histogram so that the pixel levels spread across 0 to 255 and become more uniform, which removes color cast to a certain extent, improves contrast and reveals more detail. The concrete implementation steps of CLAHE are as follows:
the lost person image is divided into M × N non-repeating subregions.
b) The gray-level histogram of each sub-region is computed, and the pixels are evenly distributed over the gray levels; the average number of pixels per gray level is
N_avg = (N_x · N_y) / N_gray
where N_gray represents the total number of gray levels in the sub-region, N_x represents the number of pixels along the X axis of each sub-region, and N_y represents the number of pixels along the Y axis of each sub-region.
c) Each sub-region histogram is clipped at the threshold T = K · N_avg, where K represents the clipping coefficient; the total number of clipped pixels is computed as ΣN_v, and the number of clipped pixels reassigned to each gray level is then
N_d = ΣN_v / N_gray
d) After steps a), b) and c) are completed for each sub-region, histogram equalization is performed on the new clipped histogram of each sub-region, and the new gray values are obtained with the transformation function.
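Steps a) to d) correspond to what OpenCV's CLAHE implementation performs internally; a short sketch of the per-channel application used in step 1 of the method follows. Note that OpenCV's clipLimit is not on the same scale as the MATLAB-style 0.01 clip limit quoted later in the patent, so clip=2.0 here is an illustrative stand-in.

```python
import cv2

def clahe_rgb(img_bgr, tiles=(8, 8), clip=2.0):
    """Apply CLAHE to each color channel over an 8x8 sub-region grid."""
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=tiles)
    b, g, r = cv2.split(img_bgr)
    return cv2.merge([clahe.apply(c) for c in (b, g, r)])
```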
3) Novel idea based on fusing CLAHE and multi-scale Retinex
To handle underwater images affected by uneven illumination, the invention takes into account the defect of the Retinex algorithm, namely that images with large illumination changes exhibit halo phenomena, and the colors of a color picture become whitened and distorted. The CLAHE algorithm, by contrast, can enhance the local contrast of the image through histogram clipping and thereby reduce the effect of illumination changes.
In general, the fused image should add as much detail information as possible while retaining the advantages of the source images. Data-level fusion is the basis of higher-level image fusion: it operates directly on the data acquired by the sensor in multiple scenes, preserves the original scene as far as possible, and provides texture information that higher levels cannot. It is also the common approach in current fusion-image research, with methods such as weighted averaging and wavelet transforms. The weighted average (WA) method can directly fuse several images of different quality; it is simple, intuitive and fast, and it can improve the signal-to-noise ratio of the image. The invention therefore adopts pixel-wise weighted averaging to fuse the images processed by CLAHE and by the multi-scale Retinex algorithm, respectively.
3. Image defogging algorithm based on multi-scale convolutional network MSCNN
The performance of existing image defogging methods is limited by hand-crafted features, such as dark channels, color differences and maximum contrast, and by complex fusion schemes. Ren et al. trained the MSCNN network on synthetic data by learning the mapping between foggy images and their corresponding transmittance maps; the trained network estimates the transmittance directly from a foggy image. In the defogging stage, the trained transmittance estimation model yields the transmittance of the foggy image, and the formula I(x) = J(x)t(x) + A[1 − t(x)] is then inverted to solve for the restored image. The transmittance network designed by Ren et al. has two parts, a coarse transmittance estimation network and a refinement network; the input of the second convolutional layer of the refinement network incorporates the output of the coarse estimation network, and the two sub-networks are trained separately. Experiments show that MSCNN outperforms state-of-the-art methods on both synthetic and real-world images in quality and speed.
1) Coarse scale convolutional neural network
The task of the coarse-scale network is to estimate a holistic transmittance map of the scene. The coarse-scale network comprises four operations: convolution, max pooling, upsampling and linear combination.
Convolutional layer. The network takes the RGB image as input, and the response of each convolutional layer is given by
f_n^(l+1) = σ( Σ_m f_m^l ∗ k_(m,n)^(l+1) + b_n^(l+1) )
where f_m^l and f_n^(l+1) are the feature maps of layer l and layer l+1, respectively; k is a convolution kernel whose index (m, n) denotes the mapping from the m-th map of the current layer to the n-th map of the next layer; ∗ denotes the convolution operator; the function σ(·) represents the rectified linear unit (ReLU) applied to the filter response; and b is the bias.
Max pooling layer. Each convolutional layer is followed by a max pooling layer with a sampling factor of 2.
Upsampling layer. The ground-truth transmittance map has the same size as the input throughout the framework, whereas the feature maps are halved in size after max pooling; an upsampling layer is therefore added to ensure that the output transmittance map and the input hazy image are equal in size. Although removing both the max pooling layer and the upsampling layer could achieve the same goal, that approach would reduce the nonlinearity of the network and be less effective. The upsampling layer follows the pooling layer to recover the sub-sampled feature map size while preserving the nonlinearity of the network. The response of each upsampling layer may be defined as
f^(l+1)(2x + i, 2y + j) = f^l(x, y), i, j ∈ {0, 1}
where l denotes the layer index and (x, y) the two-dimensional spatial coordinates of the image; in essence, the equation copies the pixel value at (x, y) into its corresponding 2 × 2 region. The linear combination layer, in turn, is simply the result of convolving all feature maps of the penultimate layer with a kernel of spatial size 1 × 1.
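The pixel-copy operation above is plain nearest-neighbour 2× upsampling; a one-line NumPy sketch:

```python
import numpy as np

def upsample_2x(feature_map):
    """Copy each value at (x, y) into its 2x2 region, as in the equation above."""
    return np.repeat(np.repeat(feature_map, 2, axis=0), 2, axis=1)

# a 2x2 map becomes 4x4, undoing the halving caused by 2x2 max pooling
print(upsample_2x(np.array([[1, 2], [3, 4]])))
```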
Structurally, this is an end-to-end regression network for estimating the transmittance of a fog image, and the output of the coarse estimation network is fed into the fine estimation network as an important feature map, which fully embodies the multi-scale character of the network. However, because of the way the upsampling layers are connected, the local receptive field of each layer in the network is small, which limits the network's ability to learn global features of the fog image; in addition, using synthetic data based on indoor images as training data also affects, to a certain extent, the network's estimation of the fog image transmittance.
2) Fine scale convolutional neural network:
The task of the fine-scale convolutional neural network is to refine the coarse transmittance map output by the coarse-scale network: feeding the foggy image together with the coarse transmittance map into the fine-scale network yields a more accurate transmittance map. The fine-scale model contains convolutional layers at three different scales, with kernel sizes 7 × 7, 5 × 5 and 3 × 3, respectively, and its pooling kernels are 2 × 2, consistent with those of the coarse-scale network. The output of the model is the refined transmittance map.
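A minimal PyTorch sketch of the fine-scale network as described: the kernel sizes 7 × 7, 5 × 5, 3 × 3 and the 2 × 2 pooling follow the text, while the channel widths and the exact point at which the coarse transmittance map is injected (here, concatenated before the second convolution, following Ren et al.) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FineScaleNet(nn.Module):
    """Sketch of the fine-scale transmittance-refinement network."""
    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(                  # 7x7 conv on the hazy image
            nn.Conv2d(3, 4, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool2d(2), nn.Upsample(scale_factor=2, mode="nearest"))
        self.block2 = nn.Sequential(                  # 5x5 conv after injecting t_coarse
            nn.Conv2d(5, 5, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2), nn.Upsample(scale_factor=2, mode="nearest"))
        self.block3 = nn.Sequential(                  # 3x3 conv
            nn.Conv2d(5, 10, kernel_size=3, padding=1), nn.ReLU())
        self.out = nn.Conv2d(10, 1, kernel_size=1)    # 1x1 linear combination

    def forward(self, hazy, t_coarse):
        x = self.block1(hazy)
        x = torch.cat([x, t_coarse], dim=1)           # coarse map enters at layer 2
        x = self.block3(self.block2(x))
        return torch.sigmoid(self.out(x))             # refined t(x) in (0, 1)
```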
The estimation of the atmospheric illumination in the MSCNN algorithm relies on the dark channel prior. After the refined transmittance map is obtained, the darkest pixels in the transmittance map are selected, the intensities of the corresponding pixels in the original fog image are found, and the maximum of these intensities is taken as the atmospheric illumination value. Substituting the transmittance map output by the MSCNN model and the solved atmospheric illumination into the atmospheric scattering model formula then yields the defogged image.
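A hedged NumPy sketch of this atmospheric-light step; the fraction of darkest transmittance pixels examined (0.1 %) is an illustrative choice, not a value stated in the patent.

```python
import numpy as np

def estimate_background_light(hazy, t_map, percent=0.1):
    """Pick the brightest hazy pixel among the darkest `percent`% of t(x)."""
    n = max(1, int(t_map.size * percent / 100.0))
    ys, xs = np.unravel_index(np.argsort(t_map, axis=None)[:n], t_map.shape)
    candidates = hazy[ys, xs].astype(np.float32)        # pixels with lowest t(x)
    return candidates[candidates.sum(axis=1).argmax()]  # background light value C
```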
In order to integrate the advantages of the multi-scale-Retinex-and-CLAHE-based algorithm with those of the MSCNN algorithm, the invention proposes an improved MSCNN image defogging algorithm.
As shown in fig. 1, the specific process is as follows:
Step 1, extract the R, G and B channels of the image, carry out normalization processing, process each channel with the CLAHE algorithm, and synthesize the three processed channels into one image. An 8 × 8 rectangular block divides the image into sub-regions, and the Clip-Limit is set to 0.01.
Step 2, extract the R, G and B channels of the image, process each channel with the MSRCR algorithm of the Retinex family, and synthesize the three processed channels into one image. The scales adopted for the R, G and B three-channel images during fusion are Gaussian functions with σ = 20, σ = 100 and σ = 200, respectively.
Step 3, carry out weighted fusion on the images processed in steps 1 and 2. A better visual effect is obtained by adjusting the weighting fusion coefficient according to:
D = tA + (1 − t)B (13)
where D represents the final fused image, A represents the image after CLAHE processing, B represents the image after MSRCR processing, and t is the fusion weighting coefficient; a large number of experiments give the optimal fusion coefficient t = 0.95.
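The pixel-wise weighted average of equation (13) is a one-liner; a minimal sketch with the patent's t = 0.95 (the array names are illustrative):

```python
import numpy as np

def fuse(clahe_img, msrcr_img, t=0.95):
    """D = t*A + (1-t)*B, equation (13)."""
    A = clahe_img.astype(np.float32)
    B = msrcr_img.astype(np.float32)
    return np.clip(t * A + (1.0 - t) * B, 0, 255).astype(np.uint8)
```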
Step 4, preprocess the fused image D with white balance, contrast enhancement, gamma enhancement and the like; the gamma value is set to 0.9 through repeated experiments.
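A minimal sketch of step 4; the gray-world white balance and the min-max contrast stretch are illustrative stand-ins for the unspecified white-balance and contrast-enhancement operations, while gamma = 0.9 follows the patent.

```python
import numpy as np

def preprocess(img, gamma=0.9):
    """White balance, contrast stretch and gamma enhancement (step 4)."""
    x = img.astype(np.float32)
    x *= x.mean() / (x.mean(axis=(0, 1), keepdims=True) + 1e-6)  # gray-world WB
    x = (x - x.min()) / (x.max() - x.min() + 1e-6)               # contrast stretch
    x = np.power(x, gamma)                                       # gamma enhancement
    return (np.clip(x, 0, 1) * 255).astype(np.uint8)
```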
Step 5, as shown in fig. 2, train the MSCNN network, and then import the preprocessed image D into the trained MSCNN model to estimate its transmittance t(x).
When the MSCNN network is trained, a CNN network capable of estimating the depth of field d(x) of an image is first used, and the transmittance of the input clear image is estimated with the formula t(x) = e^(−βd(x)), where β is the atmospheric absorption coefficient; the model then randomly generates a background light value A, synthesizes a foggy image with the formula I(x) = J(x)t(x) + A[1 − t(x)], and sends the synthesized foggy image into the MSCNN network for training.
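The training-pair synthesis just described can be sketched as follows; the depth map d(x) is assumed given (e.g. predicted by the depth CNN), and the U(0.7, 1.0) range for the random background light A is an illustrative assumption.

```python
import numpy as np

def synthesize_hazy(clear, depth, beta=1.0, rng=None):
    """t(x) = exp(-beta*d(x)); I(x) = J(x)*t(x) + A*(1 - t(x))."""
    if rng is None:
        rng = np.random.default_rng()
    J = clear.astype(np.float32) / 255.0
    t = np.exp(-beta * depth)[..., None]      # broadcast t(x) over the RGB channels
    A = rng.uniform(0.7, 1.0)                 # randomly generated background light
    I = J * t + A * (1.0 - t)
    return (I * 255).astype(np.uint8), t.squeeze(-1)
```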
Step 6, estimate the maximum illumination value of image D with the dark channel prior method to obtain the background light value C.
Step 7, substitute into the formula I(x) = J(x)t(x) + C[1 − t(x)] and recover the fog-free image J(x).
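Step 7 inverts the imaging model; a minimal sketch, with a lower bound on t(x) added as a common numerical safeguard that the patent does not mention. C is the background light scaled to [0, 1].

```python
import numpy as np

def recover(hazy, t_map, C, t_min=0.1):
    """J(x) = (I(x) - C*(1 - t(x))) / t(x), from I = J*t + C*(1 - t)."""
    I = hazy.astype(np.float32) / 255.0
    t = np.maximum(t_map, t_min)[..., None]   # guard against division by ~0
    J = (I - C * (1.0 - t)) / t
    return (np.clip(J, 0, 1) * 255).astype(np.uint8)
```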
The effect of the defogging method of the invention is shown in fig. 3, and the evaluation results are listed in Table 1:
TABLE 1 Underwater image quality evaluation results
 | Information entropy | Mean gradient | UCIQE | Visible edge gradient ratio
---|---|---|---|---
Original image | 7.0339 | 1.1158 | 0.2312 | 1
Method of the invention | 7.0870 | 5.5931 | 0.3264 | 5.0402
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. An image defogging algorithm is characterized by comprising the following steps:
step 1, extracting three channels of R, G and B of an image, carrying out normalization processing, respectively processing by using a CLAHE algorithm, and then synthesizing the processed three channels of the image into an image;
step 2, extracting three channels of R, G and B of the image, processing the three-channel image by using an MSRCR algorithm in a Retinex algorithm respectively, and then synthesizing the processed three-channel image into an image;
step 3, performing weighted fusion on the images processed in the step 1 and the step 2 to obtain a fused image, and setting the fused image as an image D;
step 4, preprocessing the image D;
step 5, training the MSCNN network, and then importing the processed image D into the trained MSCNN model to estimate the transmittance t(x) of the image;
step 6, estimating the maximum illumination value of the image D by using a dark channel prior method to obtain a background light value C;
and step 7, substituting into the formula I(x) = J(x)t(x) + C[1 − t(x)], and recovering the fog-free image J(x).
2. The image defogging algorithm according to claim 1, wherein in the training of the MSCNN network, a CNN network capable of estimating the depth of field d(x) is first used, the transmittance of the input clear image is estimated with the formula t(x) = e^(−βd(x)), the model then randomly generates a background light value A and synthesizes a foggy image with the formula I(x) = J(x)t(x) + A[1 − t(x)], and the synthesized foggy image is sent into the MSCNN network for training, wherein β is the atmospheric absorption coefficient.
3. An image defogging algorithm according to claim 1 or 2 wherein in step 1, an 8 x 8 rectangular block is used to divide the image into a plurality of different subregions.
4. An image defogging algorithm according to claim 1 or 2, wherein in the step 1, the Clip-Limit is 0.01.
5. An image defogging algorithm according to claim 1 or 2, wherein in the step 2, the scales adopted by the R, G and B three-channel images during fusion are Gaussian functions with σ = 20, 100 and 200, respectively.
6. The image defogging algorithm according to claim 1 or 2, wherein in the step 3, the fused image is:
D=tA+(1-t)B
where D represents the final fused image, A represents the image after CLAHE processing, B represents the image after MSRCR processing, and t is the fusion weighting coefficient, taken as 0.95.
7. An image defogging algorithm according to claim 1 or 2 wherein in step 4, the preprocessing includes white balancing.
8. The image defogging algorithm according to claim 7, wherein in step 4, the preprocessing comprises contrast enhancement.
9. The image defogging algorithm according to claim 8, wherein in step 4, the preprocessing comprises gamma enhancement.
10. The image defogging algorithm according to claim 8, wherein in step 4, the gamma value adopted in the gamma enhancement is 0.9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111066603.8A CN113920018A (en) | 2021-09-13 | 2021-09-13 | Improved MSCNN underwater image defogging method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111066603.8A CN113920018A (en) | 2021-09-13 | 2021-09-13 | Improved MSCNN underwater image defogging method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113920018A | 2022-01-11
Family
ID=79234858
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111066603.8A | Improved MSCNN underwater image defogging method | 2021-09-13 | 2021-09-13
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113920018A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105761227A (en) * | 2016-03-04 | 2016-07-13 | 天津大学 | Underwater image enhancement method based on dark channel prior algorithm and white balance |
CN106530246A (en) * | 2016-10-28 | 2017-03-22 | 大连理工大学 | Image dehazing method and system based on dark channel and non-local prior |
CN106548463A (en) * | 2016-10-28 | 2017-03-29 | 大连理工大学 | Based on dark and the sea fog image automatic defogging method and system of Retinex |
CN112508814A (en) * | 2020-12-07 | 2021-03-16 | 重庆邮电大学 | Image tone restoration type defogging enhancement method based on unmanned aerial vehicle at low altitude view angle |
CN112950589A (en) * | 2021-03-03 | 2021-06-11 | 桂林电子科技大学 | Dark channel prior defogging algorithm of multi-scale convolution neural network |
Non-Patent Citations (1)
Title |
---|
HAOYUE WANG; XIANGNING CHEN; BIJIE XU; SHUHAN DU; YINAN LI: "An improved MSCNN method for underwater image defogging", 2021 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INDUSTRIAL DESIGN (AIID), 3 September 2021 (2021-09-03) * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114511480A (en) * | 2022-01-25 | 2022-05-17 | 江苏科技大学 | Underwater image enhancement method based on fractional order convolution neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |