CN114782247A - Image super-resolution reconstruction method - Google Patents
- Publication number: CN114782247A
- Application number: CN202210363002.1A
- Authority: CN (China)
- Prior art keywords: loss, image, charbon, reconstruction method, network
- Prior art date
- Legal status (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed): Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a super-resolution model (SRPUGAN-Charbon) based on a positive-unlabeled GAN with a Charbonnier loss. The model comprises a generator network that synthesizes super-resolution (SR) images and a discriminator network trained to distinguish SR images from true high-resolution (HR) images. The image super-resolution reconstruction method provided by the invention uses the Charbonnier loss to handle outliers in the SR image while retaining its low-frequency characteristics, and applies positive-unlabeled (PU) classification within the generative adversarial network (GAN) so that the discriminator is trained appropriately, which further improves training stability. Extensive experiments on three benchmark datasets (Set5, Set14, and BSDS500) show that the proposed SRPUGAN-Charbon method outperforms state-of-the-art methods in PSNR, SSIM, and visual quality.
Description
Technical Field
The invention relates to a super-resolution reconstruction method, in particular to an image super-resolution reconstruction method.
Background
Recently, generative adversarial networks (GANs) have developed rapidly and are widely used in super-resolution (SR) reconstruction because of their ability to synthesize realistic high-frequency image details. However, GAN training is unstable. The main reason is that the discriminator applies fixed positive/negative (real/fake) criteria to generated samples throughout the learning process, ignoring that the quality of generated samples gradually improves and that some become as realistic as real samples.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide an image super-resolution reconstruction method with a more stable training process.
In order to achieve the above purpose, the invention provides the following technical scheme: an image super-resolution reconstruction method, characterized by comprising the following steps:
Step 1: input an LR image x into the generator network, using α-Charbonnier regularization to preserve good, clearly distinguishable local structures of the generated image, to obtain the corresponding reconstructed image G(x); then compute the content loss between the real HR image y and the reconstructed image G(x) with the Charbonnier penalty function;
Step 2: feed the real HR image y and the reconstructed image G(x) into a VGG network, extract their high-level features φ(y) and φ(G(x)), and compute the content loss between φ(y) and φ(G(x)) with the Charbonnier penalty function;
Step 3: input the extracted high-level features φ(y) and φ(G(x)) into the discriminator network, obtain the adversarial loss based on PU-classification regularization, and take the final objective loss function as the weighted sum of the content losses and the adversarial loss;
Step 4: perform network back-propagation with the adaptive α-Charbonnier method and PU-classification regularization, compute the gradient of each layer, and, following the training strategy, update the discriminator and generator parameters θ_d and θ_G to iteratively optimize the networks;
Step 5: repeat Steps 1-4 until the loss function value is minimal, then stop.
As a further improvement of the invention, the PU framework in Step 3 is:

min_D π·E_{x∼Pdata}[f1(D(x))] + max(0, E_{z∼Pz}[f2(D(G(z)))] − π·E_{x∼Pdata}[f2(D(x))]),

where Pdata represents the distribution of real samples, z is random noise sampled from the prior distribution Pz, D(x) is the probability that x is predicted by the discriminator to be real data, f1(·) is the loss of classifying the input as a real sample, f2(·) is the loss of classifying the input as a generated sample, and π is the prior knowledge, i.e., the proportion of high-quality samples among the generated samples.
As a further improvement of the invention, the adaptive evolution equation of α-Charbonnier in Step 4 is:

∂y/∂t = −Σ_k b_kᵀ(b_k y − x_k) + γ1·div((|∇y|² + ε²)^((α−2)/2) ∇y) − γ2(y − y0),

where y is the estimated high-resolution image, b_k is the transformation matrix representing the warping, blurring, and decimation operations, x_k is the sequence of low-resolution images, γ1 is a regularization parameter, γ2(y − y0) is the data fidelity term, and div((|∇y|² + ε²)^((α−2)/2) ∇y) is the α-Charbonnier regularization term.
As a further improvement of the present invention, the objective loss function in Step 3 is:

l = l_Charbon + 0.008·l_VGG-Charbon + 2×10⁻⁶·l_α-Charbon + 10⁻³·l_PU-Gen,

where l_Charbon is the content loss, l_VGG-Charbon is the improved VGG loss, l_α-Charbon is the α-Charbonnier loss, and l_PU-Gen denotes the adversarial loss.
As a further improvement of the invention, the content loss is defined as:

l_Charbon = (1/(r²WH)) Σ_{i=1}^{rW} Σ_{j=1}^{rH} ρ(y_{i,j} − G(x)_{i,j}),

where r is the upsampling factor, W and H are the width and height of the LR image, ρ(m) = √(m² + ε²) is the Charbonnier penalty function with ε = 10⁻³, x is the LR image, and y is the original HR image.
As a further improvement of the present invention, the improved VGG loss is defined as:

l_VGG-Charbon = (1/(W_{5,4}H_{5,4})) Σ_{i=1}^{W_{5,4}} Σ_{j=1}^{H_{5,4}} ρ(φ_{5,4}(y)_{i,j} − φ_{5,4}(G(x))_{i,j}),

where φ_{5,4} denotes the feature map obtained after the 4th convolution (after ReLU) and before the 5th max-pooling layer of the VGG network, and W_{5,4} and H_{5,4} are the width and height of the corresponding feature map.
As a further improvement of the invention, the α-Charbonnier loss l_α-Charbon is defined with the generator parameters θ_G and 0 ≤ α ≤ 2.
As a further improvement of the invention, in the adversarial loss, ε = 10⁻² prevents the logarithmic term from being 0, θ_d represents the parameters of the discriminator, π represents the class prior knowledge, i.e., the proportion of positive data among the unlabeled data, n represents the number of training samples, and λ is a regularization parameter.
The beneficial effects of the invention are:
the discriminators of the models are designed based on the Charbonnier penalty function as the loss function, and the training stability of the discriminators is improved.
The generated SR image samples are processed as unlabeled samples, focusing the generator on improving the generated low quality SR image samples to improve the performance of the generator.
A new perception loss set is proposed, and the real texture and the background outline details of a reconstructed image are enhanced as much as possible through the weighted sum of content loss, feature loss, texture loss and Charbonier relative resistance loss.
Drawings
Fig. 1 is a schematic diagram of a network architecture of the image super-resolution reconstruction method of the present invention.
Detailed Description
The invention will now be described in further detail with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, the image super-resolution reconstruction method of this embodiment performs single-image super-resolution reconstruction with a positive-unlabeled GAN that uses a Charbonnier loss function. The goal is to generate super-resolution images that are visually plausible and have perceptually convincing texture details. The specific contents are as follows:
the whole training process can be divided into five steps:
(1) Input the LR image x into the generator network, using α-Charbonnier regularization to preserve good, clearly distinguishable local structures of the generated image, to obtain the corresponding reconstructed image G(x); then compute the content loss between the real HR image y and G(x) with the Charbonnier penalty function.
(2) Feed the real HR image y and the reconstructed image G(x) into a VGG network and extract their high-level features φ(y) and φ(G(x)); likewise, compute the content loss between φ(y) and φ(G(x)) with the Charbonnier penalty function.
(3) Input the extracted high-level features φ(y) and φ(G(x)) into the discriminator network and obtain the adversarial loss through PU-classification regularization; the final objective loss function is a weighted sum of the content losses and the adversarial loss.
(4) Perform network back-propagation with the adaptive α-Charbonnier method and PU-classification regularization, compute the gradient of each layer, and, following the training strategy, update the discriminator and generator parameters θ_d and θ_G to iteratively optimize the networks.
(5) Repeat the above steps until the loss function value is minimal; the network training is then complete.
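The control flow of steps (1)-(5) can be sketched as a schematic training loop. Everything below is a toy stand-in (a nearest-neighbour "generator", a mean-based "discriminator", and no actual back-propagation), intended only to illustrate how the pieces fit together, not the patent's networks:

```python
import math

def generator(lr):
    # Toy stand-in for G (step 1): nearest-neighbour 2x upsampling.
    return [[v for v in row for _ in (0, 1)] for row in lr for _ in (0, 1)]

def discriminator(img):
    # Toy stand-in for D: squashes the mean pixel value into (0, 1).
    m = sum(sum(row) for row in img) / (len(img) * len(img[0]))
    return 1.0 / (1.0 + math.exp(-m))

def charbonnier(m, eps=1e-3):
    # Charbonnier penalty rho(m) = sqrt(m^2 + eps^2).
    return math.sqrt(m * m + eps * eps)

hr = [[0.1 * (i + j) for j in range(4)] for i in range(4)]  # real HR image y
lr = [row[::2] for row in hr[::2]]                          # LR input x

history = []
for _ in range(3):                              # step (5): iterate
    sr = generator(lr)                          # step (1): reconstruct G(x)
    content = sum(charbonnier(hr[i][j] - sr[i][j])
                  for i in range(4) for j in range(4)) / 16.0
    # step (2) would extract VGG features phi(y) and phi(G(x)) and compare
    # them with the same Charbonnier penalty; omitted in this toy loop.
    adv = -math.log(discriminator(sr) + 1e-2)   # step (3): adversarial term
    total = content + 1e-3 * adv                # step (3): weighted objective
    # step (4): back-propagation and updates of theta_d and theta_G would
    # go here; this stand-in has no trainable parameters.
    history.append(total)
```

In a real implementation the generator and discriminator are deep CNNs and step (4) updates their parameters by gradient descent on the weighted objective.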
The PU model is obtained as follows:
The discriminator D is allowed to treat high-quality generated samples as real data and to focus on the low-quality generated samples; D must learn to distinguish high-quality samples from the remaining low-quality ones. Identifying high-quality samples among the generated samples under the guidance of real samples closely resembles the positive-unlabeled classification problem, so the discriminator is trained appropriately, staying in a correct state that is neither too strong nor too weak, which further improves its stability. The general framework of PUGAN is:

min_D π·E_{x∼Pdata}[f1(D(x))] + max(0, E_{z∼Pz}[f2(D(G(z)))] − π·E_{x∼Pdata}[f2(D(x))]),

where Pdata represents the distribution of real samples, z is random noise sampled from the prior distribution Pz (e.g., a Gaussian distribution), D(x) is the probability that x is predicted by the discriminator to be real data, f1(·) is the loss of classifying the input as a real sample, f2(·) is the loss of classifying the input as a generated sample, and π is the prior knowledge, i.e., the proportion of high-quality samples among the generated samples.
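The framework above can be sketched as a non-negative PU risk estimate over a minibatch of discriminator outputs. The logistic choices f1(t) = −log t and f2(t) = −log(1 − t) below are illustrative assumptions, not necessarily the patent's exact losses:

```python
import math

def f1(d):
    # Loss for classifying the input as a real (positive) sample.
    return -math.log(d + 1e-8)

def f2(d):
    # Loss for classifying the input as a generated sample.
    return -math.log(1.0 - d + 1e-8)

def mean(values):
    return sum(values) / len(values)

def pu_discriminator_loss(d_real, d_fake, pi):
    # d_real: D(x) on real samples, treated as labeled positives.
    # d_fake: D(G(z)) on generated samples, treated as unlabeled data.
    # pi:     prior proportion of high-quality samples among generated ones.
    positive_risk = pi * mean([f1(d) for d in d_real])
    # Risk on the unlabeled data with the estimated positive contribution
    # subtracted, clipped at zero (the non-negative correction).
    unlabeled_risk = max(0.0, mean([f2(d) for d in d_fake])
                         - pi * mean([f2(d) for d in d_real]))
    return positive_risk + unlabeled_risk
```

A discriminator that scores real samples near 1 and low-quality fakes near 0 incurs a smaller risk than an undecided one, while high-quality fakes are not penalized once the clipped term reaches zero.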
The basic feature of the α-Charbonnier adaptive SR method is that the regularization term switches automatically according to the image structure; moreover, the model-controlling parameter α is determined automatically by the program. The proposed α-Charbonnier adaptive SR evolution equation is shown in Equation (2):

∂y/∂t = −Σ_k b_kᵀ(b_k y − x_k) + γ1·div((|∇y|² + ε²)^((α−2)/2) ∇y) − γ2(y − y0),   (2)

where y is the estimated high-resolution image, b_k is the transformation matrix representing the warping, blurring, and decimation operations, x_k is the sequence of low-resolution images, γ1 is a regularization parameter, γ2(y − y0) is the data fidelity term, and div((|∇y|² + ε²)^((α−2)/2) ∇y) is the α-Charbonnier regularization term.
The objective loss function comprises the content losses and the adversarial loss, as shown in Equation (3):

l = l_Charbon + 0.008·l_VGG-Charbon + 2×10⁻⁶·l_α-Charbon + 10⁻³·l_PU-Gen,   (3)

where the content losses comprise the Charbonnier loss l_Charbon, the improved VGG loss l_VGG-Charbon, and the α-Charbonnier loss l_α-Charbon; l_PU-Gen denotes the adversarial loss.
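The weighted sum of Equation (3) is straightforward to express; the constants 1, 0.008, 2×10⁻⁶, and 10⁻³ are those given above, and the four loss values are assumed to be computed elsewhere:

```python
def objective_loss(l_charbon, l_vgg_charbon, l_alpha_charbon, l_pu_gen):
    # Weighted sum of Equation (3): content, perceptual, alpha-Charbonnier
    # regularization, and adversarial terms.
    return (l_charbon
            + 0.008 * l_vgg_charbon
            + 2e-6 * l_alpha_charbon
            + 1e-3 * l_pu_gen)
```

The small weights keep the pixel-level Charbonnier term dominant while the perceptual and adversarial terms nudge the solution toward realistic textures.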
Content loss
Although the mean squared error (MSE) loss widely used in single-image super-resolution reconstruction can improve PSNR, it is not robust to outliers; a robust loss function l_Charbon is therefore used here to handle outliers. l_Charbon is defined in Equation (4):

l_Charbon = (1/(r²WH)) Σ_{i=1}^{rW} Σ_{j=1}^{rH} ρ(y_{i,j} − G(x)_{i,j}),   (4)

where r is the upsampling factor, W and H are the width and height of the LR image, ρ(m) = √(m² + ε²) is the Charbonnier penalty function [32] with ε = 10⁻³, x is the LR image, and y is the original HR image.
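As a concrete sketch, the Charbonnier penalty and the content loss can be written as follows (plain Python for illustration; a real implementation would operate on tensors in a deep-learning framework):

```python
import math

def charbonnier(m, eps=1e-3):
    # Charbonnier penalty rho(m) = sqrt(m^2 + eps^2): a smooth,
    # outlier-robust variant of the L1 loss.
    return math.sqrt(m * m + eps * eps)

def content_loss(hr, sr):
    # Mean Charbonnier distance between the HR image y and the
    # reconstruction G(x); averaging over all rW x rH pixels gives the
    # 1/(r^2 * W * H) normalization of Equation (4).
    height, width = len(hr), len(hr[0])
    total = sum(charbonnier(hr[i][j] - sr[i][j])
                for i in range(height) for j in range(width))
    return total / (height * width)
```

For identical images the loss reduces to ε = 10⁻³, the value of ρ at zero residual; unlike MSE, large residuals grow only linearly, which is what makes the penalty robust to outliers.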
VGG loss is introduced into SR so that the loss is measured not only at the pixel level but also perceptually. Here, the l_VGG-Charbon loss is defined in Equation (5):

l_VGG-Charbon = (1/(W_{5,4}H_{5,4})) Σ_{i=1}^{W_{5,4}} Σ_{j=1}^{H_{5,4}} ρ(φ_{5,4}(y)_{i,j} − φ_{5,4}(G(x))_{i,j}),   (5)

where φ_{5,4} denotes the feature map obtained after the 4th convolution (after ReLU) and before the 5th max-pooling layer of the VGG network, and W_{5,4} and H_{5,4} are the width and height of the corresponding feature map.
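Given feature maps already extracted from a pretrained VGG network (the extraction itself is outside the scope of this sketch), the Charbonnier distance of Equation (5) is computed exactly like the pixel-level content loss, only over the feature grid:

```python
import math

def vgg_charbonnier_loss(feat_hr, feat_sr, eps=1e-3):
    # Charbonnier distance between the feature maps phi_{5,4}(y) and
    # phi_{5,4}(G(x)), averaged over the W_{5,4} x H_{5,4} spatial grid.
    # feat_hr and feat_sr are assumed to be same-shaped 2D feature maps
    # produced elsewhere by a pretrained VGG network.
    h, w = len(feat_hr), len(feat_hr[0])
    total = sum(math.sqrt((feat_hr[i][j] - feat_sr[i][j]) ** 2 + eps ** 2)
                for i in range(h) for j in range(w))
    return total / (h * w)
```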
We further use the α-Charbonnier regularization technique to preserve good, clearly distinguishable local structures of the generated image, such as edges. The l_α-Charbon regularization term is defined in Equation (6), where θ_G represents the parameters of the generator and 0 ≤ α ≤ 2.
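As an illustration, one common α-Charbonnier-style regularizer applies the potential (|∇y|² + ε²)^(α/2) to finite-difference image gradients; the sketch below assumes this form, which may differ from the patent's exact Equation (6):

```python
def alpha_charbonnier_reg(img, alpha=1.0, eps=1e-3):
    # Assumed form: sum over pixels of (|grad|^2 + eps^2)^(alpha/2),
    # using forward finite differences, with 0 <= alpha <= 2.
    h, w = len(img), len(img[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            dy = img[i + 1][j] - img[i][j] if i + 1 < h else 0.0
            dx = img[i][j + 1] - img[i][j] if j + 1 < w else 0.0
            total += (dx * dx + dy * dy + eps * eps) ** (alpha / 2.0)
    return total
```

For a constant image the gradients vanish and the sum reduces to N·ε^α; smaller α values penalize strong gradients less, which preserves edges.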
Adversarial loss
The generator loss is added to the perceptual loss to encourage the network to favor solutions residing on the natural-image manifold by trying to fool D. An unknown-prior penalty term is added to the generator loss, and generated samples are treated as unlabeled samples, so that the generator focuses on low-quality generated samples; this effectively optimizes the generator network and further improves its performance.
In the adversarial loss function, ε = 10⁻² prevents the logarithmic term from being 0, θ_d represents the parameters of the discriminator, π represents the class prior knowledge, i.e., the proportion of positive (high-quality) data among the unlabeled data, n represents the number of training samples, and λ is a regularization parameter.
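The text above fixes ε = 10⁻² inside a logarithm but does not fully determine l_PU-Gen. The sketch below therefore assumes a simple non-saturating form, −(1/n)·Σ log(D(G(x_i)) + ε), and omits the π and λ terms of the full PU formulation:

```python
import math

def pu_gen_loss(d_fake, eps=1e-2):
    # Assumed non-saturating generator term: push the discriminator's
    # scores on generated samples toward 1. eps = 1e-2 keeps the
    # logarithm finite, as described; the class prior pi and the
    # regularization weight lambda are omitted in this sketch.
    n = len(d_fake)
    return -sum(math.log(d + eps) for d in d_fake) / n
```

Samples the discriminator already rates as real contribute little, so the gradient signal concentrates on the low-quality generated samples.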
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiment; all technical solutions within the idea of the present invention fall within its protection scope. It should be noted that modifications and adaptations made by those skilled in the art without departing from the principles of the present invention are also within the protection scope of the present invention.
Claims (8)
1. An image super-resolution reconstruction method, characterized by comprising the following steps:
Step 1: input an LR image x into the generator network, using α-Charbonnier regularization to preserve good, clearly distinguishable local structures of the generated image, to obtain the corresponding reconstructed image G(x); then compute the content loss between the real HR image y and the reconstructed image G(x) with the Charbonnier penalty function;
Step 2: feed the real HR image y and the reconstructed image G(x) into a VGG network, extract their high-level features φ(y) and φ(G(x)), and compute the content loss between φ(y) and φ(G(x)) with the Charbonnier penalty function;
Step 3: input the extracted high-level features φ(y) and φ(G(x)) into the discriminator network, obtain the adversarial loss based on PU-classification regularization, and take the final objective loss function as the weighted sum of the content losses and the adversarial loss;
Step 4: perform network back-propagation with the adaptive α-Charbonnier method and PU-classification regularization, compute the gradient of each layer, and, following the training strategy, update the discriminator and generator parameters θ_d and θ_G to iteratively optimize the networks;
Step 5: repeat Steps 1-4 until the loss function value is minimal, then stop.
2. The image super-resolution reconstruction method according to claim 1, characterized in that the PU framework in Step 3 is:

min_D π·E_{x∼Pdata}[f1(D(x))] + max(0, E_{z∼Pz}[f2(D(G(z)))] − π·E_{x∼Pdata}[f2(D(x))]),

where Pdata represents the distribution of real samples, z is random noise sampled from the prior distribution Pz, D(x) is the probability that x is predicted by the discriminator to be real data, f1(·) is the loss of classifying the input as a real sample, f2(·) is the loss of classifying the input as a generated sample, and π is the prior knowledge, i.e., the proportion of high-quality samples among the generated samples.
3. The image super-resolution reconstruction method according to claim 1 or 2, characterized in that the adaptive evolution equation of α-Charbonnier in Step 4 is:

∂y/∂t = −Σ_k b_kᵀ(b_k y − x_k) + γ1·div((|∇y|² + ε²)^((α−2)/2) ∇y) − γ2(y − y0),

where y is the estimated high-resolution image, b_k is the transformation matrix representing the warping, blurring, and decimation operations, x_k is the sequence of low-resolution images, γ1 is a regularization parameter, γ2(y − y0) is the data fidelity term, and div((|∇y|² + ε²)^((α−2)/2) ∇y) is the α-Charbonnier regularization term.
4. The image super-resolution reconstruction method according to claim 1 or 2, characterized in that the objective loss function in Step 3 is:

l = l_Charbon + 0.008·l_VGG-Charbon + 2×10⁻⁶·l_α-Charbon + 10⁻³·l_PU-Gen,

where l_Charbon is the content loss, l_VGG-Charbon is the improved VGG loss, l_α-Charbon is the α-Charbonnier loss, and l_PU-Gen denotes the adversarial loss.
5. The image super-resolution reconstruction method according to claim 4, characterized in that the content loss is defined as:

l_Charbon = (1/(r²WH)) Σ_{i=1}^{rW} Σ_{j=1}^{rH} ρ(y_{i,j} − G(x)_{i,j}),

where r is the upsampling factor, W and H are the width and height of the LR image, ρ(m) = √(m² + ε²) is the Charbonnier penalty function with ε = 10⁻³, x is the LR image, and y is the original HR image.
6. The image super-resolution reconstruction method according to claim 5, characterized in that the improved VGG loss is defined as:

l_VGG-Charbon = (1/(W_{5,4}H_{5,4})) Σ_{i=1}^{W_{5,4}} Σ_{j=1}^{H_{5,4}} ρ(φ_{5,4}(y)_{i,j} − φ_{5,4}(G(x))_{i,j}),

where φ_{5,4} denotes the feature map obtained after the 4th convolution (after ReLU) and before the 5th max-pooling layer of the VGG network, and W_{5,4} and H_{5,4} are the width and height of the corresponding feature map.
8. The image super-resolution reconstruction method according to claim 4, characterized in that, in the adversarial loss, ε = 10⁻² prevents the logarithmic term from being 0, θ_d represents the parameters of the discriminator, π represents the class prior knowledge, i.e., the proportion of positive data among the unlabeled data, n represents the number of training samples, and λ is a regularization parameter.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210363002.1A | 2022-04-06 | 2022-04-06 | Image super-resolution reconstruction method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN114782247A | 2022-07-22 |
Family
ID=82426557
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210363002.1A (Pending) | Image super-resolution reconstruction method | 2022-04-06 | 2022-04-06 |
Country Status (1)
| Country | Link |
|---|---|
| CN | CN114782247A (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180075581A1 * | 2016-09-15 | 2018-03-15 | Twitter, Inc. | Super resolution using a generative adversarial network |
| CN109978762A * | 2019-02-27 | 2019-07-05 | Nanjing University of Information Science and Technology | Super-resolution reconstruction method based on a conditional generative adversarial network |
| CN110189253A * | 2019-04-16 | 2019-08-30 | Zhejiang University of Technology | Image super-resolution reconstruction method based on an improved generative adversarial network |
| CN111507898A * | 2020-03-16 | 2020-08-07 | Xuzhou Institute of Technology | Image super-resolution reconstruction method based on adaptive adjustment |
- 2022-04-06: Application CN202210363002.1A filed (CN), published as CN114782247A; status: Pending
Non-Patent Citations (3)
- Maiseli et al., "Adaptive Charbonnier superresolution method with robust edge preservation capabilities," Journal of Electronic Imaging, vol. 22, no. 4, pp. 1-12, 16 December 2013. *
- Shuhua Xu et al., "A Positive-Unlabeled Generative Adversarial Network for Super-Resolution Image Reconstruction Using a Charbonnier Loss," Traitement du Signal, vol. 39, no. 3, pp. 1061-1069, 30 June 2022. *
- Tianyu Guo et al., "On positive-unlabeled classification in GAN," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8382-8390, 5 August 2020. *
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination