US20210264568A1 - Super resolution using a generative adversarial network - Google Patents

Super resolution using a generative adversarial network

Info

Publication number
US20210264568A1
Authority
US
United States
Prior art keywords
image
neural network
convolutional neural
loss
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US17/302,537
Inventor
Wenzhe Shi
Christian Ledig
Zehan Wang
Lucas Theis
Ferenc Huszar
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Twitter Inc
Original Assignee
Twitter Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Twitter Inc
Priority to US17/302,537
Assigned to TWITTER, INC. Assignment of assignors interest (see document for details). Assignors: LEDIG, Christian, SHI, WENZHE, THEIS, Lucas, HUSZAR, Ferenc, WANG, Zehan
Publication of US20210264568A1
Assigned to MORGAN STANLEY SENIOR FUNDING, INC. Security interest (see document for details). Assignors: TWITTER, INC.

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0454
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046 Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • This disclosure relates to machine learning to process visual data using a plurality of datasets.
  • Machine learning is the field of study where a computer or set of computers learn to perform classes of tasks using feedback generated from the experience the machine learning process gains from computer performance of those tasks.
  • Supervised machine learning is concerned with a computer learning one or more rules or functions to map between example inputs and desired outputs as predetermined by an operator or programmer, usually where a dataset containing the inputs is labelled.
  • Unsupervised learning may be concerned with determining a structure for input data, for example, when performing pattern recognition, and typically uses unlabeled datasets.
  • Semi-supervised machine learning makes use of externally provided labels and objective functions as well as any implicit data relationships.
  • the machine learning algorithm can be provided with some training data or a set of training examples, in which each example is typically a pair of an input signal/vector and a desired output value, label (or classification) or signal.
  • the machine learning algorithm analyzes the training data and produces a generalized function that can be used with unseen datasets to produce desired output values or signals for the unseen input vectors/signals.
  • the user determines what type of data is to be used as the training data and also prepares a representative real-world set of data. However, the user must take care to ensure that the training data contains enough information to accurately predict desired output values.
  • the machine learning algorithm must be provided with enough data to be able to correctly learn and model for the dimensionality of the problem that is to be solved, without providing too many features (which can result in too many dimensions being considered by the machine learning process during training).
  • the user also can determine the desired structure of the learned or generalized function, for example, whether to use support vector machines or decision trees.
  • For example, where machine learning is used for image enhancement with dictionary representations of images, the techniques are generally referred to as dictionary learning.
  • In dictionary learning, where sufficient representations, or atoms, are not available in a dictionary to enable accurate representation of an image, machine learning techniques can be employed to tailor dictionary atoms such that they can more accurately represent the image features and thus obtain more accurate representations.
  • a training process can be used to find optimal representations that can best represent a given signal or labelling (where the labelling can be externally provided such as in supervised or semi-supervised learning or where the labelling is implicit within the data as for unsupervised learning), subject to predetermined initial conditions such as a level of sparsity.
  • MSE least squares error
  • x is a low-resolution image
  • y is a high-resolution image
  • $\hat{y}$ is an estimate of the high-resolution image generated by a neural network with the parameters $\theta$.
  • Least squares methods struggle when there are multiple equivalently probable solutions to the problem. For example, a low-resolution image may provide enough detail to determine the content of the image, but not enough detail to precisely determine the location of each object within a high-resolution version of the image.
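The averaging behavior described above can be seen on a toy example. The sketch below uses hypothetical one-dimensional data (two equally plausible signals whose sharp edge sits one sample apart) and shows that the estimate minimizing the expected squared error is their per-sample mean, which smears the edge. It is an illustration only, not the patent's method.

```python
import numpy as np

# Two equally plausible high-resolution signals (hypothetical toy data):
# the same sharp edge, located one sample apart.
candidate_a = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
candidate_b = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
candidates = np.stack([candidate_a, candidate_b])


def expected_mse(estimate, candidates):
    """Mean squared error of one estimate, averaged over all candidates."""
    return float(np.mean((candidates - estimate) ** 2))


# The estimate that minimizes the expected squared error is the per-sample mean.
mse_optimal = candidates.mean(axis=0)

print(mse_optimal)                             # the edge sample becomes 0.5
print(expected_mse(mse_optimal, candidates))   # lower error, blurred edge
print(expected_mse(candidate_a, candidates))   # higher error, sharp edge
```

The blurred estimate scores better under squared error than either sharp candidate, even though neither candidate contains a half-intensity sample.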
  • a method for training an algorithm to process at least a section of received low resolution visual data to estimate a high resolution version of the low resolution visual data using a training dataset and a reference dataset includes: (a) generating a set of training data (e.g., by using the generator neural network of (b)); (b) training a generator neural network by comparing one or more characteristics of the training data to one or more characteristics of at least a section of the reference dataset, wherein the first network is trained to generate super-resolved image data from low resolution image data and wherein the training includes optimizing processed visual data based on the comparison between the one or more characteristics of the training data and the one or more characteristics of the reference dataset; and (c) training a discriminator neural network by comparing one or more characteristics of the generated super-resolved image data to one or more characteristics of at least a section of the reference dataset, wherein the second network is trained to discriminate super-resolved image data from real image data.
  • Implementations can include one or more of the following features, alone or in any combination with each other.
  • the steps (a), (b), and (c) can be iterated over and the training data can be updated during an iteration.
  • the order of the steps (a), (b), and (c) can be selected to achieve different goals.
  • performing (a) after (b) can result in training the discriminator with an updated (and improved) generator.
  • the generator neural network and/or the discriminator neural network can be convolutional neural networks.
  • the generator neural network and/or the discriminator neural network can be parameterized by weights and biases.
  • the weights and biases that parameterize the generator and the discriminator networks can be the same or they can differ.
  • the training dataset can include a plurality of visual data.
  • the reference dataset can include a plurality of visual data.
  • the plurality of visual data of the reference dataset may or may not be increased quality versions of the visual data of the training dataset.
  • the estimated high-resolution version can be used for any of: removing compression artifacts, dynamic range enhancement, image generation and synthesis, image inpainting, image de-mosaicing, and denoising.
  • the discriminating of the super-resolved image data from real image data can include using a binary classifier that discriminates between the super-resolved image data and reference data.
  • Comparing the one or more characteristics of the training data to the one or more characteristics of at least a section of the reference dataset can include assessing the similarity between one or more characteristics of an input of the algorithm and one or more characteristics of an output of the algorithm.
  • the algorithm can be hierarchical and can include a plurality of layers. The layers can potentially be arbitrarily connected with each other or any of sequential, recurrent, recursive, branching or merging.
  • FIG. 1A is an example original image of a high-resolution image.
  • FIG. 1B is an example image generated from a 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1B is generated using bi-cubic interpolation techniques on data in the downsampled image.
  • FIG. 1C is an example image generated from the 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1C is generated from data in the downsampled image using a deep residual network optimized for MSE.
  • FIG. 1D is an example image generated from a 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1D is generated from data in the downsampled image using a deep residual generative adversarial network optimized for a loss more sensitive to human perception.
  • FIG. 2A is an example high resolution image.
  • FIG. 2B is a super-resolved image created using the techniques described herein from a 4× downsampled version of the image shown in FIG. 2A.
  • FIG. 3 is a schematic illustration of patches from the natural image manifold and super-resolved patches obtained with mean square error and generative adversarial networks.
  • FIG. 4 is a schematic diagram of an example GAN framework for obtaining super resolution images.
  • FIG. 5 is a schematic diagram of the generator network.
  • FIG. 6 is a schematic diagram of a discriminator network.
  • FIG. 7 is a flow chart of a process used to train a network.
  • a super-resolution generative adversarial network provides a framework that is based on a generative adversarial network (GAN) and is capable of recovering photo-realistic images from 4× downsampled images.
  • a perceptual loss function that consists of an adversarial loss function and a content loss function is proposed.
  • the adversarial loss pushes the solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images.
  • a content loss function motivated by perceptual similarity instead of similarity in pixel space is used. Trained on a large number (e.g., tens or hundreds of thousands) of images using the perceptual loss function, the deep residual network can recover photo-realistic textures from heavily downsampled images on public benchmarks.
  • a major difficulty when estimating the HR image is the ambiguity of solutions to the underdetermined SR problem.
  • the ill-posed nature of the SR problem is particularly pronounced for high downsampling factors, for which texture detail in the reconstructed SR images is typically absent.
  • Assumptions about the data must be made to approximate the HR image, such as exploiting image redundancies or employing specifically trained feature models.
  • simple image features (e.g., edges)
  • statistical image priors
  • example-based methods detected and exploited patch correspondences within a training database or calculated optimized dictionaries allowing for high-detail data representation. While of good accuracy, the involved optimization procedures for both patch detection and sparse coding are computationally intensive.
  • More advanced methods formulate image-based SR as a regression problem that can be tackled, for example, with Random Forests.
  • CNNs convolutional neural networks
  • MSE mean squared error
  • PSNR peak signal to noise ratio
  • As FIGS. 1A, 1B, 1C, and 1D illustrate, the highest PSNR does not necessarily reflect the perceptually better SR result.
  • FIG. 1A is an example original image of a high-resolution image.
  • FIG. 1B is an example image generated from a 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1B is generated using bi-cubic interpolation techniques on data in the downsampled image.
  • the image in FIG. 1B has a PSNR of 21.69 dB.
  • FIG. 1C is an example image generated from the 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1C is generated from data in the downsampled image using a deep residual network optimized for MSE.
  • the image in FIG. 1C has a PSNR of 23.62 dB.
  • FIG. 1D is an example image generated from a 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1D is generated from data in the downsampled image using a deep residual generative adversarial network optimized for a loss more sensitive to human perception.
  • the image in FIG. 1D has a PSNR of 21.
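For reference, PSNR values such as those quoted above are derived directly from the MSE, which is why optimizing for MSE also maximizes PSNR. The following is a minimal sketch of the standard definition; the function name `psnr` and the default peak value of 255 (8-bit images) are assumptions made for illustration and are not taken from the patent.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Hypothetical data: an estimate that is off by 10% of the peak value everywhere.
reference = np.zeros((4, 4))
estimate = np.full((4, 4), 25.5)
print(psnr(reference, estimate))
```

Because the measure depends only on per-pixel differences, two reconstructions with very different texture quality can receive similar scores.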
  • a perceptual difference between a super-resolved version of a downsampled image and an original version of the image means that the super-resolved images are not generally considered as photo-realistic, at least in terms of the level of image fidelity/details expected for a given resolution of the image.
  • SRGAN super-resolution generative adversarial network
  • VGG Visual Geometry Group
  • An example of a photo-realistic image that was super-resolved from a 4× downsampling factor using SRGAN is shown in FIG. 2B, which is an SR image created using the techniques described herein from a 4× downsampled version of the original image shown in FIG. 2A.
  • SISR single image super-resolution
  • CNN Convolutional Neural Networks
  • learning upscaling filters can be beneficial both in terms of speed and accuracy, and can offer an improvement over using data-independent, bicubic interpolation to upscale the LR observation before feeding the image to the CNN.
  • the gain in speed can be used to employ a deep residual network (ResNet) to increase accuracy.
  • pixel-wise loss functions such as MSE struggle to handle the uncertainty inherent in recovering lost high-frequency details such as texture: minimizing MSE encourages finding pixel-wise averages of plausible solutions which are typically blurry, overly-smooth, and thus have poor perceptual quality.
  • Example reconstructions of varying perceptual quality are shown with their corresponding PSNR in FIGS. 1A, 1B, 1C, and 1D.
  • the problem of minimizing pixel-wise MSE is illustrated in FIG. 3, in which multiple potential solutions with high texture details are averaged to create a smooth reconstruction. As can be seen from FIG. 3, the generative adversarial network (GAN) approach can converge to a different solution 302 than the pixel-wise MSE approach 304, and the GAN approach can often result in a more photo-realistic solution than the MSE approach.
  • the MSE-based solution appears overly smooth due to the pixel-wise averaging of possible solutions in the pixel space, while the GAN approach drives the reconstruction towards the natural image manifold, producing a perceptually more convincing solution.
  • Generative Adversarial Networks can be used to tackle the problem of image super resolution.
  • GANs can be used to learn a mapping from one manifold to another for style transfer, and for inpainting.
  • high-level features extracted from a pretrained VGG network can be used instead of low-level pixel-wise error measures.
  • a loss function based on the Euclidean distance between feature maps extracted from the VGG19 network can be used to obtain perceptually superior results for both super-resolution and artistic style-transfer.
  • FIG. 4 is a schematic diagram of an example GAN system 400 for obtaining super resolution images.
  • GANs can provide a powerful framework for generating plausible-looking natural images with high perceptual quality.
  • the GAN system 400 can include one or more computing devices that include one or more processors 402 and memory 404 storing instructions that are executable by the processors.
  • a generator neural network 406 and a discriminator neural network 408 can be trained together (e.g., jointly, interactively, alternately, etc.) but with competing goals.
  • the discriminator network 408 can be trained to distinguish natural and synthetically generated images, while the generator network 406 can learn to generate images that are indistinguishable from natural images by the best discriminator.
  • the GAN system 400 encourages the generated synthetic samples to move towards regions of the search space with high probability and thus closer to the natural image manifold.
  • the SRGAN system 400 and the techniques described herein set a new state of the art for image SR from a high downsampling factor (4×) as measured by human subjects using MOS tests. Specifically, fast learning in low resolution (LR) space and batch normalization are first employed to robustly train a network of a plurality (e.g., 15) of residual blocks for better accuracy.
  • LR low resolution
  • With the GAN system 400, it is possible to recover photo-realistic SR images from high downsampling factors (e.g., 4×) by using a combination of content loss and adversarial loss as perceptual loss functions.
  • the adversarial loss is driven by the discriminator network 408 to encourage solutions from the natural image domain, while the content loss function ensures that the super-resolved images have the same content as their low-resolution counterparts.
  • the MSE-based content loss function can be replaced with the Euclidean distance between the last convolutional feature maps of a neural network, where the similarities of the feature maps/feature spaces of the neural network are consistent with human notions of content similarity and can be more invariant to changes in pixel space.
  • the VGG network can be used, as linear interpolation in the VGG feature space corresponds to intuitive, meaningful interpolation between the contents of two images.
  • Although the VGG network is trained for object classification, here it can be used to solve the task of image super-resolution.
  • Other neural networks also can be used for image super-resolution; for example, a network trained on a specific dataset (e.g., face recognition) may work well for super-resolution of images containing faces.
  • $I^{LR}$ is the low-resolution input image corresponding to its high-resolution counterpart $I^{HR}$.
  • the high-resolution images can be provided during training of the networks 406, 408.
  • $I^{LR}$ can be obtained by applying a Gaussian filter to $I^{HR}$ followed by a downsampling operation with a downsampling factor $r$.
  • $I^{LR}$ can be described by a real-valued tensor of size $W \times H \times C$, and $I^{HR}$, $I^{SR}$ can be described by real-valued tensors of size $rW \times rH \times C$.
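The low-resolution construction described above (Gaussian filtering followed by downsampling with factor r) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function names, the Gaussian width `sigma`, the kernel radius of three standard deviations, and the reflective border handling are all hypothetical choices, since the text does not specify them. Arrays are laid out as (height, width, channels).

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def make_low_resolution(hr_image, r, sigma=1.0):
    """Blur an (rH, rW, C) image with a separable Gaussian, then keep every r-th pixel."""
    hr = np.asarray(hr_image, dtype=np.float64)
    radius = int(3 * sigma)  # assumed kernel extent
    kernel = gaussian_kernel_1d(sigma, radius)
    padded = np.pad(hr, ((radius, radius), (radius, radius), (0, 0)), mode="reflect")
    # Separable filtering: first along the vertical axis, then along the horizontal axis.
    blurred = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, padded)
    blurred = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, blurred)
    return blurred[::r, ::r, :]


hr = np.full((8, 12, 3), 0.5)        # hypothetical rH x rW x C image with r = 4
lr = make_low_resolution(hr, r=4)
print(lr.shape)                       # (2, 3, 3), i.e. H x W x C
```

Blurring before subsampling limits aliasing; a constant image stays constant because the kernel is normalized.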
  • a generating function G can be trained such that G estimates, for a given LR image, the corresponding HR counterpart image of the LR image.
  • the generator network 406 can be trained as a feed-forward CNN, $G_{\theta_G}$, which is parameterized by $\theta_G$.
  • $\theta_G = \{W_{1:L}; b_{1:L}\}$ denotes the weights and biases of an L-layer deep network and is obtained by optimizing an SR-specific loss function $l^{SR}$.
  • a perceptual loss $l^{SR}$ is specifically designed as a weighted combination of several loss components that model distinct desirable characteristics of the recovered SR image.
  • the individual loss functions are described in more detail below.
  • This formulation therefore allows training a generative model G with the goal of fooling a differentiable discriminator D that was trained to distinguish super-resolved images from real images.
  • the generator can learn to create solutions that are highly similar to real images and thus difficult to classify by D. Eventually this encourages perceptually superior solutions residing in the subspace or the manifold of natural images. This is in contrast to SR solutions obtained by minimizing pixel-wise error measurements, such as the MSE.
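The adversarial formulation discussed above is not reproduced in this text. As an assumption, it presumably takes the form of the standard GAN min-max problem, written here with the generator $G_{\theta_G}$ and a discriminator $D_{\theta_D}$ parameterized by $\theta_D$; the distribution names $p_{\mathrm{train}}$ and $p_G$ are notation introduced for this sketch.

```latex
\min_{\theta_G} \max_{\theta_D} \;
\mathbb{E}_{I^{HR} \sim p_{\mathrm{train}}(I^{HR})}
  \left[ \log D_{\theta_D}(I^{HR}) \right]
+ \mathbb{E}_{I^{LR} \sim p_G(I^{LR})}
  \left[ \log \left( 1 - D_{\theta_D}\!\left( G_{\theta_G}(I^{LR}) \right) \right) \right]
```

The discriminator maximizes this expression by assigning high probability to real images and low probability to generated ones, while the generator minimizes it by producing images the discriminator accepts as real.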
  • FIG. 5 is a schematic diagram of the generator network 500, which is also referred to herein as G.
  • the generator network 500 can include B residual blocks 502 with identical layout.
  • a residual block that uses two convolutional layers 504 with small 3×3 kernels and 64 feature maps can be used to stabilize, and allow the optimization of, a particularly deep neural network. Residual blocks are described in K. He, X. Zhang, S. Ren, and J.
  • the residual block layer(s) can be followed by batch-normalization layers 506 and Rectified Linear Unit (ReLU) or parametric Rectified Linear Unit (PReLU) layers 508 as activation function to enable the network to learn complex, nonlinear functions.
  • ReLU Rectified Linear Unit
  • PReLU parametric Rectified Linear Unit
  • the trained network thus can more effectively exploit network parameters for modeling complex nonlinear transformations.
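A residual block of the kind described above can be sketched in plain NumPy. This is a simplified illustration, not the patent's implementation: the layer ordering (convolution, batch normalization, ReLU, convolution, batch normalization, skip connection) is an assumption, the batch normalization uses statistics of a single image and has no learned scale or shift, and all function names are hypothetical.

```python
import numpy as np


def conv3x3(x, weights, bias):
    """'Same' 3x3 convolution. x: (H, W, C_in); weights: (3, 3, C_in, C_out); bias: (C_out,)."""
    height, width, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # zero padding keeps the spatial size
    out = np.zeros((height, width, weights.shape[-1]))
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + height, dx:dx + width, :]
            out += window @ weights[dy, dx]        # (H, W, C_in) @ (C_in, C_out)
    return out + bias


def batch_norm(x, eps=1e-5):
    """Normalize each channel to zero mean and unit variance (simplified, no learned parameters)."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def residual_block(x, w1, b1, w2, b2):
    """Two 3x3 convolutions with an identity skip connection around them."""
    y = np.maximum(batch_norm(conv3x3(x, w1, b1)), 0.0)  # conv -> BN -> ReLU
    y = batch_norm(conv3x3(y, w2, b2))                   # conv -> BN
    return x + y                                          # skip connection


x = np.random.default_rng(0).normal(size=(5, 6, 4))      # hypothetical feature maps
zero_w, zero_b = np.zeros((3, 3, 4, 4)), np.zeros(4)
print(np.allclose(residual_block(x, zero_w, zero_b, zero_w, zero_b), x))
```

With all weights at zero the block reduces to the identity mapping, which is one reason stacks of such blocks are comparatively easy to optimize.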
  • the resolution of the input image can be increased with a trained deconvolution layer that increases the spatial resolution of feature maps while reducing the number of feature channels.
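One way to trade feature channels for spatial resolution, as the upscaling layer above does, is the sub-pixel rearrangement used in efficient sub-pixel convolutional networks. The sketch below shows only that rearrangement step; it is an assumption about how such a layer could be realized, not a statement of the patent's deconvolution layer, and the channel ordering chosen here is one of several possible conventions.

```python
import numpy as np


def pixel_shuffle(features, r):
    """Rearrange (H, W, C * r * r) feature maps into (H * r, W * r, C)."""
    height, width, channels = features.shape
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c_out = channels // (r * r)
    x = features.reshape(height, width, r, r, c_out)
    x = x.transpose(0, 2, 1, 3, 4)                 # (H, r, W, r, C)
    return x.reshape(height * r, width * r, c_out)


features = np.arange(2 * 3 * 8).reshape(2, 3, 8)   # hypothetical 2 x 3 map with 8 channels
upscaled = pixel_shuffle(features, r=2)
print(upscaled.shape)                               # (4, 6, 2)
```

Each group of r * r channels at one low-resolution position is spread over an r x r patch of the output, so the spatial size grows by r in each direction while the channel count shrinks by r squared.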
  • FIG. 6 is a schematic diagram of a discriminator network 600 .
  • the discriminator network 600 can be trained to solve the maximization problem in Equation 2.
  • LeakyReLU activation 602 can be used, and max-pooling can be avoided throughout the network.
  • the discriminator network 600 can include eight convolutional layers with an increasing number of filter kernels, increasing by a factor of 2 with each layer from 64 to 512 kernels, as in the VGG network.
  • the spatial resolution of feature maps can be reduced each time the number of feature channels is doubled. Reducing the spatial resolution of feature maps can be achieved by specific network layers such as, for example, max-pooling or strided-convolutions.
  • the last convolutional layer can have a larger number of feature maps, e.g., 512.
  • To obtain a final probability for sample classification those numerous feature maps can be collapsed into a single scalar by employing one or more dense layers that accumulate each individual feature into a single scalar. This scalar can be converted into a probability for sample classification by applying a bounded activation function such as a sigmoid function.
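The final classification step described above can be sketched as a single dense layer followed by a sigmoid. The function name, the use of one dense layer, and the shapes are assumptions for illustration; the text allows one or more dense layers.

```python
import numpy as np


def classify(feature_maps, dense_weights, dense_bias):
    """Collapse feature maps into one scalar with a dense layer, then squash it to (0, 1)."""
    logit = float(feature_maps.reshape(-1) @ dense_weights + dense_bias)
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid: bounded activation


feature_maps = np.ones((2, 2, 512))                 # hypothetical final feature maps
dense_weights = np.zeros(feature_maps.size)         # untrained (all-zero) weights
print(classify(feature_maps, dense_weights, 0.0))   # 0.5: no preference either way
```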
  • The definition of the perceptual loss function $l^{SR}$ influences the performance of the generator network and thus the SR algorithm. While $l^{SR}$ is commonly modeled based on the MSE, here a loss function that can assess the quality of a solution with respect to perceptually relevant characteristics is used instead.
  • the perceptual loss function can include a content loss function, an adversarial loss function, and a regularization loss function, as explained in further detail below.
  • the pixel-wise MSE loss can be calculated as:
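The equation itself is missing from this text. A form consistent with the tensor sizes defined earlier (an SR output of size $rW \times rH$ for a downsampling factor $r$) would be the following; the symbol $l^{SR}_{MSE}$ and the index names are assumptions.

```latex
l^{SR}_{MSE} = \frac{1}{r^{2} W H}
\sum_{x=1}^{rW} \sum_{y=1}^{rH}
\left( I^{HR}_{x,y} - G_{\theta_G}(I^{LR})_{x,y} \right)^{2}
```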
  • Reconstruction quality has commonly been assessed on a pixel-level in image space.
  • this generally means optimizing for the mean (e.g., mean-squared-error, L2 loss) or median (L1 loss) of several, equally likely possible solutions.
  • a loss function that is closer to perceptual similarity can be used.
  • this loss can be calculated in a more abstract feature space.
  • the feature space representation of a given input image can be described by its feature activations in a network layer of a pre-trained convolutional neural network, such as, for example, the VGG19 network.
  • a feature space can be explicitly or implicitly defined such that it provides valuable feature representations for optimization problems. For example, in image reconstruction problems losses calculated in feature space may not penalize perceptually important details (e.g., textures, high frequency information) of solutions, while at the same time, ensuring that overall similarity is retained.
  • the feature map obtained by the jth convolution before the ith maxpooling layer within the VGG19 network can be represented by $\phi_{i,j}$.
  • the VGG loss can be defined as the Euclidean distance between the feature representations of a reconstructed image $G_{\theta_G}(I^{LR})$ and a reference image $I^{HR}$ that the reconstructed image represents:
  • $W_{i,j}$ and $H_{i,j}$ describe the dimensions of the respective feature maps within the VGG network.
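The VGG loss equation is missing from this text. Under the definitions above, with $\phi_{i,j}$ denoting the VGG19 feature map obtained by the jth convolution before the ith maxpooling layer, a consistent form would be the following; the symbol $l^{SR}_{VGG/i,j}$ is an assumed name.

```latex
l^{SR}_{VGG/i,j} = \frac{1}{W_{i,j} H_{i,j}}
\sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}}
\left( \phi_{i,j}(I^{HR})_{x,y}
     - \phi_{i,j}\!\left( G_{\theta_G}(I^{LR}) \right)_{x,y} \right)^{2}
```

This is the same squared-difference structure as a pixel-wise loss, evaluated on feature activations rather than on pixel values.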
  • the generative loss $l^{SR}_{Gen}$ can be defined based on the probabilities of the discriminator $D_{\theta_D}(G_{\theta_G}(I^{LR}))$ over all training samples as:
  • $D_{\theta_D}(G_{\theta_G}(I^{LR}))$ is the estimated probability that the reconstructed image $G_{\theta_G}(I^{LR})$ is a natural HR image. Note that, in some implementations, for better gradient behavior, the term $-\log D_{\theta_D}(G_{\theta_G}(I^{LR}))$ can be minimized rather than the term $\log[1 - D_{\theta_D}(G_{\theta_G}(I^{LR}))]$.
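The remark about gradient behavior can be checked numerically. The sketch below assumes the discriminator output is a sigmoid of a logit z, sums the loss as -log D(G(I_LR)) over samples, and compares the gradients of the two alternative terms with respect to z. Function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def generative_loss(d_probabilities):
    """Sum of -log D(G(I_LR)) over training samples (assumed form of the generative loss)."""
    return float(-np.sum(np.log(d_probabilities)))


def grad_log_one_minus_d(z):
    """d/dz of log(1 - sigmoid(z)), where z is the discriminator logit."""
    return -sigmoid(z)


def grad_neg_log_d(z):
    """d/dz of -log(sigmoid(z))."""
    return -(1.0 - sigmoid(z))


z = -6.0  # the discriminator confidently rejects the generated image
print(grad_log_one_minus_d(z))  # close to 0: almost no learning signal
print(grad_neg_log_d(z))        # close to -1: strong learning signal
```

Early in training the generator's outputs are easy to reject, so the first term saturates while the second still provides a usable gradient.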
  • a regularizer based on the total variation can be employed to encourage spatially coherent solutions.
  • the regularization loss, $l_{TV}$, can be calculated as:
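The total-variation equation is not reproduced in this text. As an illustration of the idea only, the sketch below computes a common anisotropic variant: the sum of absolute differences between neighboring pixels, divided by the number of pixels. The choice of the L1 form and of this normalization are assumptions.

```python
import numpy as np


def total_variation(image):
    """Sum of absolute horizontal and vertical pixel differences, divided by the pixel count.

    image: array of shape (H, W, C). Anisotropic (L1) variant, chosen for illustration.
    """
    img = np.asarray(image, dtype=np.float64)
    dy = np.abs(img[1:, :, :] - img[:-1, :, :]).sum()   # vertical neighbors
    dx = np.abs(img[:, 1:, :] - img[:, :-1, :]).sum()   # horizontal neighbors
    return float((dx + dy) / (img.shape[0] * img.shape[1]))


flat = np.full((4, 4, 1), 0.3)                           # spatially constant image
edge = np.array([[0.0, 1.0], [0.0, 1.0]]).reshape(2, 2, 1)
print(total_variation(flat))   # 0.0
print(total_variation(edge))   # 0.5
```

Penalizing this quantity favors piecewise-smooth outputs, which is what encourages spatially coherent solutions.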
  • All networks were trained on an NVIDIA Tesla M40 GPU using a random sample of a large number (e.g., tens or hundreds of thousands) of images from the ImageNet database. These images were distinct from the Set5, Set14 and BSD testing images.
  • the SRRES networks were trained with a learning rate of $10^{-4}$ and $10^{6}$ update iterations.
  • the pre-trained MSE-based SRRES network was used as an initialization for the generator when training the actual GAN to avoid undesired local optima. All SRGAN network variants were trained with 100,000 update iterations at a learning rate of $10^{-4}$, and another 100,000 iterations at a lower learning rate of 10-. We alternate updates to the generator and discriminator network.
  • the network architecture for the generator network 406 of GAN system 400 can combine the effectiveness of the efficient sub-pixel convolutional neural network (ESPCN) and the high performance of the ResNet.
  • the performance of the generator network 406 for $l^{SR} = l^{SR}_{MSE}$ without any adversarial component, which can be referred to as SRResNet, was compared to bicubic interpolation and four state of the art frameworks: the super-resolution CNN (SRCNN), a method based on transformed self-exemplars (SelfExSR), a deeply-recursive convolutional network (DRCN), and the efficient sub-pixel CNN (ESPCN) allowing real-time video SR. Quantitative results confirmed that SRResNet sets a new state of the art on the three benchmark datasets.
  • $$ l^{SR} = \underbrace{\underbrace{l^{SR}_{X}}_{\text{content loss}} + \underbrace{10^{-3}\, l^{SR}_{Gen}}_{\text{adversarial loss}}}_{\text{perceptual loss (for VGG based content losses)}} \qquad (7) $$
  • $l^{SR}_{X}$ can represent different content losses, such as, for example, the standard MSE content loss, a loss defined on feature maps that represent lower-level features, a loss defined on feature maps of higher-level features from deeper network layers with more potential to focus on the content of the images, etc. It was determined that, even when combined with the adversarial loss, although MSE may provide solutions with relatively high PSNR values, the results achieved with a loss component more sensitive to visual perception are perceptually superior. This is caused by competition between the MSE-based content loss and the adversarial loss. In general, the further away the content loss is from pixel space, the perceptually better the result of the GAN system. Thus, better texture detail was observed using the higher-level VGG feature maps as compared with lower-level feature maps.
  • the experiments suggest superior perceptual performance of the proposed framework purely based on visual comparison.
  • Standard quantitative measures such as PSNR and SSIM clearly fail to capture and accurately assess image quality with respect to the human visual system.
  • the presented model can be extended to provide video SR in real-time, e.g., by performing SR techniques on frames of video data.
  • the techniques described herein have a wide variety of applications in which increasing the resolution of a visual image would be helpful.
  • the resolution of still, or video, images can be enhanced, where the images are uploaded to a social media site, where the images are provided to a live streaming application or platform, where the images are presented in a video game or media stream, where the images are rendered in a virtual reality application or where the images are part of a spherical video or 360-degree video/image, where the images are formed by a microscope or a telescope, etc.
  • the resolution of visual images based on invisible radiation (e.g., X-rays, MRI images, infrared images, etc.) also can be enhanced.
  • aspects and/or implementations of the techniques described herein can improve the effectiveness of synthesizing content using machine learning techniques. Certain aspects and/or implementations seek to provide techniques for generating hierarchical algorithms that can be used to enhance visual data based on received input visual data and a plurality of pieces of training data. Other aspects and/or implementations seek to provide techniques for machine learning.
  • a least-squares method picks an average of all possible solutions, thereby resulting in an output which may not accurately represent a higher quality version of the inputted visual data.
  • the techniques described herein select a most probable output when compared to a training dataset and an output that is most realistic, as determined by the discriminator.
  • Further implementations may use this approach to generate high quality versions of inputted low quality visual data by training an algorithm so that the generating function is optimized.
  • only low-quality data is required along with a high-quality reference dataset that may contain unrelated visual data.
  • FIG. 7 shows a flow chart of a process used to train a network 710 .
  • training the network 710 includes increasing the quality of the input visual data 720 .
  • the visual data can be processed in many ways, such as by creating photorealistic outputs, removing noise from received visual data, and generating or synthesizing new images.
  • the network 710 receives at least one section of low-quality visual data 720 used to initialize the network 710 with a set of parameters 715 .
  • the network 710 may also receive a low-quality visual data training set 730 .
  • the low-quality visual data training set 730 may be a selection of low-quality images, frames of video data or rendered frames, although other types of low-quality visual data may be received by the network 710 .
  • the low-quality images or frames can include downsampled versions of high-quality images or frames.
  • the low-quality visual data training set 730 may be received by the network 710 from an external source, such as the Internet or may be stored in a memory of a computing device.
  • the low-quality visual data 720 can be used as a training dataset and can be provided to the network 710 that, using the parameters 715 , seeks to produce an estimated enhanced quality visual dataset 740 corresponding to the low-quality visual data training set 730 .
  • only a subset of the low-quality visual data 720 may be used when producing the estimated enhanced quality visual dataset 740 .
  • the estimated enhanced quality visual dataset 740 may include a set of visual data representing enhanced quality versions of the corresponding lower quality visual data from a subset of the low-quality visual data training set 730 .
  • the entire low-quality visual data training set 730 may be used.
  • the enhanced quality visual dataset 740 may be used as an input to a comparison network 760 , along with a high quality visual data reference set 750 .
  • the high-quality visual data reference set 750 may be received by the network 710 , from an external source, such as the Internet, or may be stored in a memory of a computing device that is used to train the network 710 .
  • the comparison network 760 may use a plurality of characteristics determined from the high-quality visual data reference set 750 and the estimated enhanced quality visual dataset 740 to determine similarities and differences between the two datasets 740 , 750 .
  • the comparison may be made between empirical probability distributions of visual data.
  • the plurality of characteristics used may include sufficient statistics computed across subsets of visual data.
  • the comparison network 760 may utilize an adversarial training procedure such as the one used to train a Generative Adversarial Network (GAN) that includes, for example, a generating network and a discriminating network.
  • a comparison network 760 may use a discriminator trained to discriminate between data items sampled from the high-quality visual data reference set 750 and those sampled from the estimated enhanced quality visual dataset 740 . The classification accuracy of this discriminator may then form the basis of the comparison.
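  • As a minimal sketch of how classification accuracy could serve as the basis of the comparison, the snippet below scores a discriminator on samples from the two datasets. The function name, the 0.5 threshold, and the example scores are hypothetical and are not taken from the disclosure. An accuracy near 1.0 suggests the two datasets are easy to tell apart, while an accuracy near 0.5 suggests they are hard to distinguish.

```python
import numpy as np

def discriminator_accuracy(scores_reference, scores_generated, threshold=0.5):
    """Fraction of samples the discriminator labels correctly.

    Scores are assumed to be the discriminator's estimated probability
    that a sample comes from the high-quality reference set.
    """
    correct_reference = np.sum(scores_reference >= threshold)
    correct_generated = np.sum(scores_generated < threshold)
    total = scores_reference.size + scores_generated.size
    return float((correct_reference + correct_generated) / total)

# Hypothetical scores: first a discriminator that separates the sets well,
# then one that is fooled about half of the time.
easy = discriminator_accuracy(np.array([0.9, 0.8]), np.array([0.1, 0.2]))
hard = discriminator_accuracy(np.array([0.9, 0.4]), np.array([0.6, 0.1]))
```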
  • the comparison network 760 can produce updated parameters 765 that can be used to replace the parameters 715 of the network 710 .
  • the method may iterate, seeking to reduce the differences between the plurality of characteristics determined from the high-quality visual data reference set 750 and the estimated enhanced quality visual data 740 , each time using the updated parameters 765 produced by the comparison network 760 .
  • the method continues to iterate until the network 710 produces an estimated enhanced quality visual data 740 representative of high quality visual data corresponding to the low-quality visual data training set 730 .
  • an enhanced quality visual data 770 may be output, where the enhanced quality visual data 770 corresponds to an enhanced quality version of the at least one section of low-quality visual data 720 .
  • the method may be used to apply a style transfer to the input visual data.
  • input visual data may include a computer graphics rendering.
  • the method may be used to process the computer graphics rendering.
  • the output of the network 710 may appear to have photo-real characteristics to represent a photo-real version of the computer graphics rendering.
  • the trained network 710 may be used to recover information from corrupted, downsampled, compressed, or lower-quality input visual data, by using a reference dataset to recover estimates of the corrupted, downsampled, compressed, or lower-quality input visual data.
  • the trained network may be used for the removal of compression artifacts, dynamic range inference, image inpainting, image de-mosaicing, and denoising, from corrupted, downsampled, compressed, or lower-quality input visual data, thus allowing for a range of visual data to be processed, each with different quality degrading characteristics. It will be appreciated other characteristics that affect the quality of the visual data may be enhanced by the network. Furthermore, in some implementations, the network may be configured to process the visual data consisting of one or more of the above-mentioned quality characteristics.
  • implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof.
  • These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
  • the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer.
  • Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
  • the systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components.
  • the components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.
  • the computing system can include clients and servers.
  • a client and server are generally remote from each other and typically interact through a communication network.
  • the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

A neural network is trained to process received visual data to estimate a high-resolution version of the visual data using a training dataset and reference dataset. A set of training data is generated, and a generator convolutional neural network parameterized by first weights and biases is trained by comparing characteristics of the training data to characteristics of the reference dataset. The first network is trained to generate super-resolved image data from low-resolution image data and the training includes modifying first weights and biases to optimize processed visual data based on the comparison between the characteristics of the training data and the characteristics of the reference dataset. A discriminator convolutional neural network parameterized by second weights and biases is trained by comparing characteristics of the generated super-resolved image data to characteristics of the reference dataset, and where the second network is trained to discriminate super-resolved image data from real image data.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a divisional of, and claims priority to, U.S. patent application Ser. No. 15/706,428, filed on Sep. 15, 2017, entitled “Super Resolution Using a Generative Adversarial Network”, which claims priority to U.S. Provisional Patent Application No. 62/395,186, filed on Sep. 15, 2016, entitled “Super Resolution Using a Generative Adversarial Network,” and U.S. Provisional Patent Application No. 62/422,012, filed on Nov. 14, 2016, entitled “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network,” the disclosures of which are incorporated by reference herein in their entirety.
  • TECHNICAL FIELD
  • This disclosure relates to machine learning to process visual data using a plurality of datasets.
  • BACKGROUND
  • Machine learning is the field of study where a computer or set of computers learns to perform classes of tasks using feedback generated from the experience the machine learning process gains from computer performance of those tasks. Supervised machine learning is concerned with a computer learning one or more rules or functions to map between example inputs and desired outputs as predetermined by an operator or programmer, usually where a dataset containing the inputs is labelled. Unsupervised learning may be concerned with determining a structure for input data, for example, when performing pattern recognition, and typically uses unlabeled datasets. Semi-supervised machine learning makes use of externally provided labels and objective functions as well as any implicit data relationships.
  • When initially configuring a machine learning system, particularly when using a supervised machine learning approach, the machine learning algorithm can be provided with some training data or a set of training examples, in which each example is typically a pair of an input signal/vector and a desired output value, label (or classification) or signal. The machine learning algorithm analyzes the training data and produces a generalized function that can be used with unseen datasets to produce desired output values or signals for the unseen input vectors/signals. Generally, the user determines what type of data is to be used as the training data and also prepares a representative real-world set of data. However, the user must take care to ensure that the training data contains enough information to accurately predict desired output values. The machine learning algorithm must be provided with enough data to be able to correctly learn and model for the dimensionality of the problem that is to be solved, without providing too many features (which can result in too many dimensions being considered by the machine learning process during training). The user also can determine the desired structure of the learned or generalized function, for example, whether to use support vector machines or decision trees.
  • Unsupervised or semi-supervised machine learning approaches are often used when labelled data is not readily available, or where the system generates new labelled data from unknown data given some initial seed labels.
  • For example, where machine learning is used for image enhancement with dictionary representations of images, the techniques are generally referred to as dictionary learning. In dictionary learning, where sufficient representations, or atoms, are not available in a dictionary to enable accurate representation of an image, machine learning techniques can be employed to tailor dictionary atoms such that they can more accurately represent the image features and thus obtain more accurate representations.
  • When using machine learning where there is an objective function and optimization process, for example, when using sparse coding principles, a training process can be used to find optimal representations that can best represent a given signal or labelling (where the labelling can be externally provided such as in supervised or semi-supervised learning or where the labelling is implicit within the data as for unsupervised learning), subject to predetermined initial conditions such as a level of sparsity.
  • Many current methods of neural-network super resolution use a least squares objective or a variant thereof, such as peak signal-to-noise ratio (PSNR). Generally, the training objective of minimizing the mean squared error (MSE) is represented by:
  • $$\min_{\theta} \; \mathbb{E}_{x,y} \left\| y - \hat{y}(x; \theta) \right\|_2^2$$
  • where x is a low-resolution image, y is a high-resolution image, and ŷ is an estimate of the high-resolution image generated by a neural network with the parameters of θ.
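  • As an illustration of the objective above, the following sketch evaluates the expectation as an average over a small batch. The function name and the toy arrays are hypothetical, and the estimates ŷ are supplied directly rather than produced by a network with parameters θ.

```python
import numpy as np

def mse_objective(y, y_hat):
    """Batch average of the squared L2 norm ||y - y_hat||_2^2.

    y and y_hat have shape (batch, height, width); the first axis plays
    the role of the expectation over (x, y) pairs.
    """
    diff = (y - y_hat).reshape(y.shape[0], -1)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

# Hypothetical batch of two 2x2 single-channel images and their estimates.
y = np.array([[[1.0, 2.0], [3.0, 4.0]],
              [[0.0, 0.0], [0.0, 0.0]]])
y_hat = np.array([[[1.0, 2.0], [3.0, 2.0]],
                  [[1.0, 0.0], [0.0, 0.0]]])
loss = mse_objective(y, y_hat)  # (4 + 1) / 2
```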
  • Least squares methods struggle when there are multiple equivalently probable solutions to the problem. For example, a low-resolution image may provide enough detail to determine the content of the image, but not enough detail to precisely determine the location of each object within a high-resolution version of the image.
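  • The effect can be seen in a toy example, sketched below under the assumption of two equally probable one-dimensional "images" that differ only in the position of a sharp edge. The arrays and helper name are hypothetical. The estimate with the lowest expected squared error is the average of the two candidates, which contains an intermediate value that appears in neither candidate, i.e., a blurred edge.

```python
import numpy as np

# Two equally probable high-resolution candidates: a sharp edge that
# starts at index 1 or at index 2.
candidate_a = np.array([0.0, 1.0, 1.0, 1.0])
candidate_b = np.array([0.0, 0.0, 1.0, 1.0])
candidates = np.stack([candidate_a, candidate_b])

def expected_squared_error(estimate, candidates):
    """Average squared L2 error of one estimate over all candidates."""
    return float(np.mean(np.sum((candidates - estimate) ** 2, axis=1)))

mean_estimate = candidates.mean(axis=0)          # [0.0, 0.5, 1.0, 1.0]
err_mean = expected_squared_error(mean_estimate, candidates)
err_sharp = expected_squared_error(candidate_a, candidates)
```

  • Here the averaged estimate has a lower expected error than either sharp candidate, even though it is not itself a plausible sharp signal.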
  • Also, despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, a central problem remains largely unsolved: how to recover lost texture detail from large downsampling factors. During image downsampling, information is lost, making super-resolution a highly ill-posed inverse problem with a large set of possible solutions. The behavior of optimization-based super-resolution methods is therefore principally driven by the choice of objective function. Recent work has largely focused on minimizing the mean squared reconstruction error (MSE). The resulting estimates can have a high peak signal-to-noise ratio (PSNR), but they are often blurry or overly smooth and lack high-frequency detail, making them perceptually unsatisfying.
  • SUMMARY
  • In a general aspect, a method for training an algorithm to process at least a section of received low resolution visual data to estimate a high resolution version of the low resolution visual data using a training dataset and a reference dataset includes: (a) generating a set of training data (e.g., by using the generator neural network of (b)); (b) training a generator neural network by comparing one or more characteristics of the training data to one or more characteristics of at least a section of the reference dataset, wherein the first network is trained to generate super-resolved image data from low resolution image data and wherein the training includes optimizing processed visual data based on the comparison between the one or more characteristics of the training data and the one or more characteristics of the reference dataset; and (c) training a discriminator neural network by comparing one or more characteristics of the generated super-resolved image data to one or more characteristics of at least a section of the reference dataset, wherein the second network is trained to discriminate super-resolved image data from real image data.
  • Implementations can include one or more of the following features, alone or in any combination with each other. For example, the steps (a), (b), and (c) can be iterated over and the training data can be updated during an iteration. The order of the steps (a), (b), and (c) can be selected to achieve different goals. For example, performing (a) after (b) can result in training the discriminator with an updated (and improved) generator. The generator neural network and/or the discriminator neural network can be convolutional neural networks. The generator neural network and/or the discriminator neural network can be parameterized by weights and biases. The weights and biases that parameterize the generator and the discriminator networks can be the same or they can differ. The training dataset can include a plurality of visual data. The reference dataset can include a plurality of visual data. The plurality of visual data of the reference dataset may or may not be increased quality versions of the visual data of the training dataset. The estimated high-resolution version can be used for any of: removing compression artifacts, dynamic range enhancement, image generation and synthesis, image inpainting, image de-mosaicing, and denoising. The discriminating of the super-resolved image data from real image data can include using a binary classifier that discriminates between the super-resolved image data and reference data. Comparing the one or more characteristics of the training data to the one or more characteristics of at least a section of the reference dataset can include assessing the similarity between one or more characteristics of an input of the algorithm and one or more characteristics of an output of the algorithm. The algorithm can be hierarchical and can include a plurality of layers. The layers can be arbitrarily connected with each other, or can be any of sequential, recurrent, recursive, branching, or merging.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
  • FIG. 1A is an example original image of a high-resolution image.
  • FIG. 1B is an example image generated from a 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1B is generated using bi-cubic interpolation techniques on data in the downsampled image.
  • FIG. 1C is an example image generated from the 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1C is generated from data in the downsampled image using a deep residual network optimized for MSE.
  • FIG. 1D is an example image generated from a 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1D is generated from data in the downsampled image using a deep residual generative adversarial network optimized for a loss more sensitive to human perception.
  • FIG. 2A is an example high resolution image.
  • FIG. 2B is a super-resolved image created using the techniques described herein from a 4× downsampled version of the image shown in FIG. 2A.
  • FIG. 3 is a schematic illustration of patches from the natural image manifold and super-resolved patches obtained with mean square error and generative adversarial networks.
  • FIG. 4 is a schematic diagram of an example GAN framework for obtaining super resolution images.
  • FIG. 5 is a schematic diagram of the generator network.
  • FIG. 6 is a schematic diagram of a discriminator network.
  • FIG. 7 is a flow chart of a process used to train a network.
  • DETAILED DESCRIPTION
  • As described herein, a super-resolution generative adversarial network (SRGAN) provides a framework that is based on a generative adversarial network (GAN) and is capable of recovering photo-realistic images from 4× downsampled images. With SRGAN, a perceptual loss function that consists of an adversarial loss function and a content loss function is proposed. The adversarial loss pushes the solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, a content loss function motivated by perceptual similarity instead of similarity in pixel space is used. Trained on a large number (e.g., tens or hundreds of thousands) of images using the perceptual loss function, the deep residual network can recover photo-realistic textures from heavily downsampled images on public benchmarks.
  • The highly challenging task of estimating a high-resolution (HR), ideally perceptually superior image from its low-resolution (LR) counterpart is referred to as super-resolution (SR). Despite the difficulty of the problem, research into SR has received substantial attention from within the computer vision community. The wide range of applications includes face recognition in surveillance videos, video streaming and medical applications.
  • A major difficulty when estimating the HR image is the ambiguity of solutions to the underdetermined SR problem. The ill-posed nature of the SR problem is particularly pronounced for high downsampling factors, for which texture detail in the reconstructed SR images is typically absent. Assumptions about the data must be made to approximate the HR image, such as exploiting image redundancies or employing specifically trained feature models.
  • Recently, substantial advances have been made in image SR, with early methods based on interpolation, simple image features (e.g., edges) or statistical image priors. Later example-based methods detected and exploited patch correspondences within a training database or calculated optimized dictionaries allowing for high-detail data representation. While of good accuracy, the involved optimization procedures for both patch detection and sparse coding are computationally intensive. More advanced methods formulate image-based SR as a regression problem that can be tackled, for example, with Random Forests. The recent rise of convolutional neural networks (CNNs) also has impacted image SR, not only improving the state of the art with respect to accuracy but also computational speed, enabling real-time SR for 2D video frames.
  • The optimization target of supervised SR algorithms often is the minimization of the mean squared error (MSE) of the recovered HR image with respect to the ground truth. This is convenient as minimizing MSE also maximizes the peak signal-to-noise ratio (PSNR), which is a common measure used to evaluate and compare SR algorithms. However, the ability of MSE (and PSNR) to capture perceptually relevant differences, such as high texture detail, good contrast, and defined edges, is very limited as they are defined based on pixel-wise image differences. For example, as shown in FIGS. 1A, 1B, 1C, and 1D, the highest PSNR does not necessarily reflect the perceptually better SR result. FIG. 1A is an example original image of a high-resolution image. FIG. 1B is an example image generated from a 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1B is generated using bi-cubic interpolation techniques on data in the downsampled image. The image in FIG. 1B has a PSNR of 21.69 dB. FIG. 1C is an example image generated from the 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1C is generated from data in the downsampled image using a deep residual network optimized for MSE. The image in FIG. 1C has a PSNR of 23.62 dB. FIG. 1D is an example image generated from a 4× downsampled version of the image of FIG. 1A, where the image in FIG. 1D is generated from data in the downsampled image using a deep residual generative adversarial network optimized for a loss more sensitive to human perception. The image in FIG. 1D has a PSNR of 21.10 dB.
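  • The link between MSE and PSNR can be made concrete with the standard definition PSNR = 10·log10(MAX² / MSE), sketched below. The function name, the default peak value of 255 (8-bit images), and the test arrays are assumptions for illustration. Because PSNR is a decreasing function of the per-pixel MSE, lowering MSE always raises PSNR.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB for images with peak value max_value."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Hypothetical 4x4 images with uniform errors of 10 and 20 gray levels.
reference = np.zeros((4, 4))
psnr_small_error = psnr(reference, np.full((4, 4), 10.0))
psnr_large_error = psnr(reference, np.full((4, 4), 20.0))
```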
  • A perceptual difference between a super-resolved version of a downsampled image and an original version of the image means that the super-resolved images are not generally considered as photo-realistic, at least in terms of the level of image fidelity/details expected for a given resolution of the image.
  • In the techniques described herein, we propose a super-resolution generative adversarial network (SRGAN) for which we employ a deep residual network and diverge from MSE as the sole optimization target. Different from previous works, we define a novel perceptual loss using high-level feature maps of the Visual Geometry Group (VGG) network combined with a discriminator that encourages solutions perceptually hard to distinguish from the HR reference images. An example of a photo-realistic image that was super-resolved from a 4× downsampling factor using SRGAN is shown in FIG. 2B, which is an SR image created using the techniques described herein from a 4× downsampled version of the original image shown in FIG. 2A.
  • The techniques described herein are described in connection with single image super-resolution (SISR) but are also applicable to recovering high resolution images from multiple images, such as object images acquired from varying viewpoints or temporal sequences of image frames (e.g., recorded, or live video data).
  • Design of Convolutional Neural Networks
  • The state of the art for many computer vision problems can be expressed by specifically designed Convolutional Neural Network (CNN) architectures. Although deeper network architectures can be difficult to train, they have the potential to substantially increase the network's accuracy as they allow modeling mappings of very high complexity. To efficiently train these deeper network architectures, batch normalization can be used to counteract the internal covariate shift. Deeper network architectures have also been shown to increase performance for SISR, e.g., using a recursive CNN. Another powerful design choice that eases the training of deep CNNs is the concept of residual blocks and skip-connections. Skip-connections relieve the network architecture of modeling the identity mapping that is trivial in nature, but that is, however, potentially non-trivial to represent with convolutional kernels.
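  • The role of the skip-connection can be sketched as follows. For brevity, this example assumes a single dense layer with a ReLU as a stand-in for the convolutional layers of a residual block; the function and variable names are hypothetical. With the skip-connection, the block computes x + F(x), so the identity mapping is obtained simply by driving the learned part F to zero.

```python
import numpy as np

def residual_block(x, weights, bias):
    """Return x + F(x), where F is a one-layer transform followed by ReLU.

    The dense layer is a simplified stand-in for convolutional layers.
    """
    learned_part = np.maximum(0.0, x @ weights + bias)
    return x + learned_part

x = np.array([[1.0, -2.0, 3.0]])

# With all-zero parameters, F(x) is zero and the block is the identity.
identity_out = residual_block(x, np.zeros((3, 3)), np.zeros(3))
```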
  • In the context of SISR, learning upscaling filters can be beneficial both in terms of speed and accuracy, and can offer an improvement over using data-independent, bicubic interpolation to upscale the LR observation before feeding the image to the CNN. In addition, by extracting the feature maps in LR space, the gain in speed can be used to employ a deep residual network (ResNet) to increase accuracy.
  • As mentioned above, pixel-wise loss functions such as MSE struggle to handle the uncertainty inherent in recovering lost high-frequency details such as texture: minimizing MSE encourages finding pixel-wise averages of plausible solutions which are typically blurry, overly-smooth, and thus have poor perceptual quality. Example reconstructions of varying perceptual quality are shown with corresponding PSNR in FIGS. 1A, 1B, 1C, and 1D. The problem of minimizing pixel-wise MSE is illustrated in FIG. 3, in which multiple potential solutions with high texture details are averaged to create a smooth reconstruction. As can be seen from FIG. 3, the generative adversarial network (GAN) approach can converge to a different solution 302 than the pixel-wise MSE approach 304, and the GAN approach can often result in a more photo-realistic solution than the MSE approach. For example, in FIG. 3, the MSE-based solution appears overly smooth due to the pixel-wise averaging of possible solutions in the pixel space, while the GAN approach drives the reconstruction towards the natural image manifold, producing a perceptually more convincing solution.
  • Thus, Generative Adversarial Networks (GANs) can be used to tackle the problem of image super resolution. GANs can be used to learn a mapping from one manifold to another for style transfer, and for inpainting. In some implementations, high-level features extracted from a pretrained VGG network can be used instead of low-level pixel-wise error measures. In one implementation, a loss function based on the Euclidean distance between feature maps extracted from the VGG19 network can be used to obtain perceptually superior results for both super-resolution and artistic style-transfer.
  • FIG. 4 is a schematic diagram of an example GAN system 400 for obtaining super resolution images. GANs can provide a powerful framework for generating plausible-looking natural images with high perceptual quality. The GAN system 400 can include one or more computing devices that include one or more processors 402 and memory 404 storing instructions that are executable by the processors. A generator neural network 406 and a discriminator neural network 408 can be trained together (e.g., jointly, interactively, alternately, etc.) but with competing goals. The discriminator network 408 can be trained to distinguish natural and synthetically generated images, while the generator network 406 can learn to generate images that are indistinguishable from natural images by the best discriminator. In effect, the GAN system 400 encourages the generated synthetic samples to move towards regions of the search space with high probability and thus closer to the natural image manifold.
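  • The competing goals of the two networks can be sketched with the widely used binary cross-entropy GAN losses. This particular formulation, the function names, and the example scores are assumptions for illustration and are not stated in the passage above. The discriminator outputs are taken to be estimated probabilities that an image is natural.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Low when natural images score near 1 and generated images near 0."""
    return float(-np.mean(np.log(d_real + eps))
                 - np.mean(np.log(1.0 - d_fake + eps)))

def generator_adversarial_loss(d_fake, eps=1e-12):
    """Low when the discriminator scores generated images near 1."""
    return float(-np.mean(np.log(d_fake + eps)))

# Hypothetical discriminator scores.
d_real = np.array([0.9, 0.8])
d_fake_detected = np.array([0.1, 0.2])  # generated images are spotted
d_fake_fooling = np.array([0.6, 0.7])   # generated images fool the discriminator

g_loss_detected = generator_adversarial_loss(d_fake_detected)
g_loss_fooling = generator_adversarial_loss(d_fake_fooling)
d_loss_detected = discriminator_loss(d_real, d_fake_detected)
d_loss_fooling = discriminator_loss(d_real, d_fake_fooling)
```

  • As the generated images become harder to detect, the generator's loss falls while the discriminator's loss rises, which is the competition the passage describes.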
  • The SRGAN system 400 and the techniques described herein set a new state of the art for image SR from a high downsampling factor (4×) as measured by human subjects using MOS tests. Specifically, we first employ fast learning in low resolution (LR) space and batch normalization to robustly train a network of a plurality (e.g., 15) of residual blocks for better accuracy.
  • With the GAN system 400 it is possible to recover photo-realistic SR images from high downsampling factors (e.g., 4×) by using a combination of content loss and adversarial loss as perceptual loss functions. For example, the adversarial loss is driven by the discriminator network 408 to encourage solutions from the natural image domain, while the content loss function ensures that the super-resolved images have the same content as their low-resolution counterparts. In addition, in some implementations, the MSE-based content loss function can be replaced with the Euclidean distance between the last convolutional feature maps of a neural network, where the similarities of the feature maps/feature spaces of the neural network are consistent with human notions of content similarity and can be more invariant to changes in pixel space. In one implementation, the VGG network can be used, as linear interpolation in the VGG feature space corresponds to intuitive, meaningful interpolation between the contents of two images. Although the VGG network is trained for object classification, here it can be used to solve the task of image super-resolution. Other neural networks also can be used for image super-resolution; for example, a network trained on a specific dataset (e.g., face recognition) may work well for super-resolution of images containing faces.
  • The approaches described herein can be validated using images from publicly available benchmark datasets and compared against previous works, including SRCNN and DRCN, to confirm the potential of the GAN system 400 to compute photo-realistic image reconstructions under 4× downsampling factors as compared to conventional methods. In the following, the network architecture and the perceptual loss are described. In addition, quantitative evaluations on public benchmark datasets as well as visual illustrations are provided.
  • In SISR, a goal is to estimate a high-resolution, super-resolved image ISR from a low-resolution input image ILR. Here, ILR is the low-resolution version of its high-resolution counterpart IHR. The high-resolution images can be provided during training of the networks 406, 408. In some implementations, when training the networks 406, 408, ILR can be obtained by applying a Gaussian filter to IHR followed by a downsampling operation with a downsampling factor r. For an image with C color channels, ILR can be described by a real-valued tensor of size W×H×C, and IHR and ISR can each be described by a real-valued tensor of size rW×rH×C.
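  • The LR-image construction described above (Gaussian filter followed by downsampling by a factor r) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the kernel width `sigma`, the 3-sigma kernel radius, and the edge padding are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def make_low_res(hr, r, sigma=1.0):
    """Blur an (rH, rW, C) image and keep every r-th pixel -> (H, W, C).

    sigma and the 3-sigma radius are illustrative assumptions.
    """
    hr = np.asarray(hr, dtype=np.float64)
    radius = int(3 * sigma)
    k = gaussian_kernel_1d(sigma, radius)
    # Edge padding keeps the blurred image the same size as the input.
    padded = np.pad(hr, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    # Separable blur: filter along rows (axis 0), then columns (axis 1).
    blurred = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, padded)
    blurred = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, blurred)
    # Downsample by the factor r.
    return blurred[::r, ::r, :]
```

  • With r=4, an 8×12×3 input yields a 2×3×3 output, matching the W×H×C versus rW×rH×C relationship stated above.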
  • A generating function G can be trained such that G estimates, for a given LR image, the corresponding HR counterpart image. To achieve this, the generator network 406 can be trained as a feed-forward CNN, Gθ G , which is parameterized by θG. Here, θG={W1:L;b1:L} denotes the weights and biases of an L-layer deep network and is obtained by optimizing an SR-specific loss function lSR. For given training images In HR, n=1, . . . , N, with corresponding In LR, n=1, . . . , N, the following equation can be solved:
  • $$\hat{\theta}_G = \arg\min_{\theta_G} \frac{1}{N} \sum_{n=1}^{N} l^{SR}\left(G_{\theta_G}(I_n^{LR}),\, I_n^{HR}\right) \tag{1}$$
  • Here, a perceptual loss lSR is specifically designed as a weighted combination of several loss components that model distinct desirable characteristics of the recovered SR image. The individual loss functions are described in more detail below.
  • We can define a discriminator network, Dθ D , 408 in FIG. 4, which can be optimized alternating with Gθ G to solve the adversarial min-max problem:
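  • $$\min_{\theta_G} \max_{\theta_D} \; \mathbb{E}_{I^{HR} \sim p_{\text{train}}(I^{HR})}\left[\log D_{\theta_D}(I^{HR})\right] + \mathbb{E}_{I^{LR} \sim p_G(I^{LR})}\left[\log\left(1 - D_{\theta_D}(G_{\theta_G}(I^{LR}))\right)\right] \tag{2}$$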
  • This formulation therefore allows training a generative model G with the goal of fooling a differentiable discriminator D that was trained to distinguish super-resolved images from real images. With this approach, the generator can learn to create solutions that are highly similar to real images and thus difficult to classify by D. Eventually this encourages perceptually superior solutions residing in the subspace or the manifold of natural images. This is in contrast to SR solutions obtained by minimizing pixel-wise error measurements, such as the MSE.
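  • As a small numerical illustration of the min-max objective in Equation 2, the sketch below evaluates the objective from discriminator outputs, replacing the expectations with sample averages. The function name and the list-based inputs are assumptions made for the example.

```python
import math

def gan_value(d_real, d_fake):
    """Sample estimate of the Equation 2 objective.

    d_real: discriminator probabilities D(I_HR) on real HR images.
    d_fake: discriminator probabilities D(G(I_LR)) on super-resolved images.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# The discriminator tries to increase this value; the generator tries to decrease it.
confident = gan_value([0.9, 0.95], [0.1, 0.05])
fooled = gan_value([0.5, 0.5], [0.5, 0.5])
```

  • When the discriminator cannot tell the two sets apart and outputs 0.5 everywhere, the value is 2·log(0.5), which is lower than the value a confident discriminator achieves.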
  • FIG. 5 is a schematic diagram of the generator network 500, which is also referred to herein as G. The generator network 500 can include B residual blocks 502 with identical layout. In some implementations, a residual block that uses two convolutional layers 504 with small 3×3 kernels and 64 feature maps can be used to stabilize, and allow the optimization of, a particularly deep neural network. Residual blocks are described in K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770-778, 2016, which is incorporated herein by reference, and can be used, e.g., to learn nontrivial parts of the transformation in the residual block, while allowing other parts of the transformation to be modeled elsewhere, e.g., via a skip connection. The residual block layer(s) can be followed by batch-normalization layers 506 and Rectified Linear Unit (ReLU) or parametric Rectified Linear Unit (PReLU) layers 508 as activation functions to enable the network to learn complex, nonlinear functions. In PReLU, all activations smaller than zero can be scaled with a learnable parameter and all activations larger than zero can be retained, as in ReLU.
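  • The PReLU activation and the skip connection of a residual block described above can be sketched as follows. The names `prelu`, `residual_block`, and `transform` are illustrative; `transform` stands in for the convolution, batch-normalization, and activation layers of a real block, which are not implemented here.

```python
import numpy as np

def prelu(x, alpha):
    """Keep positive activations; scale negative ones by the learnable alpha."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(x > 0, x, alpha * x)

def residual_block(x, transform):
    """Add the block input back to the output of the learned transform.

    transform is a placeholder for the conv/batch-norm/activation stack.
    """
    return x + transform(x)

activations = prelu([-2.0, 0.0, 3.0], alpha=0.25)
# If the learned transform outputs zeros, the block reduces to the identity.
identity_out = residual_block(np.ones(3), lambda v: np.zeros_like(v))
```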
  • We can further introduce a skip-connection over all residual blocks to relieve the network of modeling simple transformations (e.g., the identity transformation). The trained network thus can more effectively exploit network parameters for modeling complex nonlinear transformations. The resolution of the input image can be increased with a trained deconvolution layer that increases the spatial resolution of feature maps while reducing the number of feature channels.
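  • One way to trade feature channels for spatial resolution, as the upscaling layer above does, is the depth-to-space rearrangement used by sub-pixel convolution approaches such as the ESPCN referenced elsewhere in this description. The sketch below shows only that rearrangement and is an assumption for illustration; it is not the trained deconvolution layer itself, and the function name is hypothetical.

```python
import numpy as np

def depth_to_space(x, r):
    """Rearrange an (H, W, C * r * r) tensor into (H * r, W * r, C)."""
    h, w, c_in = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c = c_in // (r * r)
    # Split channels into an r x r sub-pixel grid, then interleave that grid
    # with the spatial axes.
    return x.reshape(h, w, r, r, c).transpose(0, 2, 1, 3, 4).reshape(h * r, w * r, c)

upscaled = depth_to_space(np.zeros((3, 5, 64)), r=2)
```

  • Here a 3×5 map with 64 channels becomes a 6×10 map with 16 channels: the spatial resolution doubles in each dimension while the channel count drops by a factor of four.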
  • FIG. 6 is a schematic diagram of a discriminator network 600. To discriminate real HR images from generated SR samples, the discriminator network 600 can be trained to solve the maximization problem in Equation 2. In one implementation, LeakyReLU activation 602 can be used and max-pooling can be avoided throughout the network. In one implementation, the discriminator network 600 can include eight convolutional layers with an increasing number of filter kernels, increasing by a factor of 2 from 64 to 512 kernels, as in the VGG network. The spatial resolution of feature maps can be reduced each time the number of feature channels is doubled. Reducing the spatial resolution of feature maps can be achieved by specific network layers such as, for example, max-pooling or strided convolutions. The last convolutional layer can have a larger number of feature maps, e.g., 512. To obtain a final probability for sample classification, those numerous feature maps can be collapsed into a single scalar by employing one or more dense layers that accumulate each individual feature into a single scalar. This scalar can be converted into a probability for sample classification by applying a bounded activation function such as a sigmoid function.
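  • The shape bookkeeping and the final dense-plus-sigmoid step of the discriminator can be sketched as follows. The eight-entry channel list is an assumption consistent with "eight convolutional layers ... from 64 to 512", and the single dense layer with hypothetical `weights` and `bias` stands in for the one or more dense layers; no convolutions are implemented.

```python
import numpy as np

def feature_map_plan(input_size, channels=(64, 64, 128, 128, 256, 256, 512, 512)):
    """Return (channels, spatial size) per layer, halving the size whenever
    the channel count doubles. The channel list is an illustrative assumption."""
    plan = []
    size = input_size
    previous = channels[0]
    for c in channels:
        if c == 2 * previous:
            size //= 2
        plan.append((c, size))
        previous = c
    return plan

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminator_head(feature_maps, weights, bias):
    """Collapse all features into one scalar, then squash it to a probability."""
    flat = np.asarray(feature_maps, dtype=np.float64).ravel()
    return float(sigmoid(flat @ weights + bias))

plan = feature_map_plan(96)
```

  • For a 96×96 input under these assumptions, the last layer has 512 feature maps of size 12×12.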
  • Perceptual Loss Function
  • The definition of the perceptual loss function lSR influences the performance of the generator network and thus the SR algorithm. While lSR is commonly modeled based on the MSE, here a loss function that can assess the quality of a solution with respect to perceptually relevant characteristics is used instead.
  • Given weighting parameters γi, i=1, . . . , K, the perceptual loss function lSR can be defined as the weighted sum of individual loss functions: $l^{SR} = \sum_{i=1}^{K} \gamma_i\, l_i^{SR}$. In particular, the perceptual loss function can include a content loss function, an adversarial loss function, and a regularization loss function, as explained in further detail below.
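  • The weighted sum of individual loss components can be sketched in a few lines. The function name and the example values are illustrative assumptions only.

```python
def perceptual_loss(losses, weights):
    """Weighted sum of K individual loss values with weights gamma_i."""
    if len(losses) != len(weights):
        raise ValueError("need exactly one weight per loss component")
    return sum(g * l for g, l in zip(weights, losses))

# Example: a content loss of 2.0 weighted by 1 and an adversarial loss of 3.0
# weighted by 1e-3 (hypothetical values).
total = perceptual_loss([2.0, 3.0], [1.0, 1e-3])
```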
  • Content Loss
  • The pixel-wise MSE loss can be calculated as:
  • $$l_{MSE}^{SR} = \frac{1}{r^2 W H} \sum_{x=1}^{rW} \sum_{y=1}^{rH} \left(I_{x,y}^{HR} - G_{\theta_G}(I^{LR})_{x,y}\right)^2 \tag{3}$$
  • which is a widely used optimization target for image SR on which many previous approaches rely. However, although achieving particularly high PSNR, solutions of MSE optimization problems often lack high-frequency content, which results in perceptually unsatisfying, overly smooth solutions, as can be seen from a comparison of FIGS. 1A, 1B, 1C, and 1D.
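  • A minimal sketch of the pixel-wise MSE loss of Equation 3 follows. The function name is an assumption; for multi-channel images the mean below also averages over the color channels, which goes slightly beyond the per-pixel normalization written in the equation.

```python
import numpy as np

def mse_loss(hr, sr):
    """Mean squared pixel difference between a reference and a reconstruction."""
    hr = np.asarray(hr, dtype=np.float64)
    sr = np.asarray(sr, dtype=np.float64)
    if hr.shape != sr.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((hr - sr) ** 2))
```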
  • Reconstruction quality has commonly been assessed on a pixel level in image space. For under-determined optimization problems, such as image super-resolution or artifact removal, this generally means optimizing for the mean (e.g., mean-squared-error, L2 loss) or median (L1 loss) of several equally likely possible solutions. When optimizing for the average of a large number of possible solutions, the obtained result generally appears overly smooth and thus perceptually not convincing.
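  • The smoothing effect of optimizing for the mean can be seen in a toy example. Two equally plausible textured candidates (a checkerboard and its inverse, both hypothetical) are averaged; the average is perfectly flat, yet it has a lower expected MSE than either textured candidate.

```python
import numpy as np

def expected_mse(estimate, candidates):
    """Average MSE of one estimate against equally likely candidate solutions."""
    return float(np.mean([(estimate - c) ** 2 for c in candidates]))

# Two equally plausible high-texture solutions: a 0/1 checkerboard and its inverse.
checker = np.indices((4, 4)).sum(axis=0) % 2
candidates = np.stack([checker, 1 - checker]).astype(np.float64)

# The MSE-optimal single answer is the pixel-wise mean, which has no texture.
mean_estimate = candidates.mean(axis=0)
mse_of_mean = expected_mse(mean_estimate, candidates)
mse_of_textured = expected_mse(candidates[0], candidates)
```

  • In this toy case the flat average scores an expected MSE of 0.25, while committing to either textured solution scores 0.5, even though the textured solution is the one that looks like a plausible image.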
  • Instead of relying on such pixel-wise losses, a loss function that is closer to perceptual similarity can be used. In one implementation, this loss can be calculated in a more abstract feature space. The feature space representation of a given input image can be described by its feature activations in a network layer of a pre-trained convolutional neural network, such as, for example, the VGG19 network. A feature space can be explicitly or implicitly defined such that it provides valuable feature representations for optimization problems. For example, in image reconstruction problems losses calculated in feature space may not penalize perceptually important details (e.g., textures, high frequency information) of solutions, while at the same time, ensuring that overall similarity is retained.
  • In a particular example, the feature map obtained by the jth convolution before the ith maxpooling layer within the VGG19 network can be represented by ϕi,j. Then, the VGG loss can be defined as the Euclidean distance between the feature representations of a reconstructed image Gθ G (ILR) and a reference image (IHR) that the reconstructed image represents:
  • $$l_{VGG/i,j}^{SR} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left(\phi_{i,j}(I^{HR})_{x,y} - \phi_{i,j}(G_{\theta_G}(I^{LR}))_{x,y}\right)^2 \tag{4}$$
  • where Wi,j and Hi,j describe the dimensions of the respective feature maps within the VGG network.
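  • The structure of the feature-space loss in Equation 4 can be sketched with any feature extractor plugged in for ϕi,j. In the sketch below, `toy_features` is a hypothetical stand-in (2×2 average pooling) and is not the VGG19 network; it is used only to show that two images can differ strongly in pixel space while having identical feature representations.

```python
import numpy as np

def toy_features(img):
    """Hypothetical stand-in for phi_{i,j}: 2x2 average pooling (NOT VGG19)."""
    h, w = img.shape
    cropped = img[: h - h % 2, : w - w % 2]
    return cropped.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def feature_loss(phi, hr, sr):
    """Mean squared difference between feature maps, as in Equation 4."""
    f_hr = phi(np.asarray(hr, dtype=np.float64))
    f_sr = phi(np.asarray(sr, dtype=np.float64))
    return float(np.mean((f_hr - f_sr) ** 2))

checker = (np.indices((4, 4)).sum(axis=0) % 2).astype(np.float64)
pixel_mse = float(np.mean((checker - (1 - checker)) ** 2))
feat = feature_loss(toy_features, checker, 1 - checker)
```

  • The checkerboard and its inverse have a pixel-wise MSE of 1 but a feature loss of 0 under this toy extractor, illustrating how a loss computed in feature space can be more invariant to changes in pixel space.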
  • Adversarial Loss
  • In addition to the content losses described so far, the generative component of the GAN can be added to the perceptual loss. This encourages the network to favor solutions that reside on the manifold of natural images by trying to fool the discriminator network. The generative loss lGen SR can be defined based on the probabilities of the discriminator Dθ D (Gθ G (ILR)) over all training samples as:

  • $$l_{Gen}^{SR} = \sum_{n=1}^{N} -\log D_{\theta_D}\left(G_{\theta_G}(I^{LR})\right) \tag{5}$$
  • where Dθ D (Gθ G (ILR)) is the estimated probability that the reconstructed image Gθ G (ILR) is a natural HR image. Note that, in some implementations, for better gradient behavior, the term −log Dθ D (Gθ G (ILR)) can be minimized rather than the term log [1−Dθ D (Gθ G (ILR))].
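  • The remark about gradient behavior can be made concrete with a short worked example. Assuming the discriminator output is D = sigmoid(z) for some pre-activation z (an assumption for this sketch), the derivative of −log D with respect to z is −(1 − D), while the derivative of log(1 − D) is −D. Early in training, when the discriminator confidently rejects generated images (D close to 0), the second gradient nearly vanishes while the first stays close to −1.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_neg_log_d(z):
    """d/dz of -log(sigmoid(z)), the term minimized in Equation 5."""
    return -(1.0 - sigmoid(z))

def grad_log_one_minus_d(z):
    """d/dz of log(1 - sigmoid(z)), the generator term from Equation 2."""
    return -sigmoid(z)

# Discriminator is confident the sample is generated: D is about 0.0067.
z = -5.0
strong = abs(grad_neg_log_d(z))
weak = abs(grad_log_one_minus_d(z))
```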
  • Regularization Loss
  • In addition, a regularizer based on the total variation can be employed to encourage spatially coherent solutions. The regularization loss, lTV, can be calculated as:
  • $$l_{TV}^{SR} = \frac{1}{r^2 W H} \sum_{x=1}^{rW} \sum_{y=1}^{rH} \left\lVert \nabla G_{\theta_G}(I^{LR})_{x,y} \right\rVert \tag{6}$$
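  • A total-variation regularizer of this kind can be sketched as the average gradient magnitude of the reconstructed image. The forward-difference approximation of the gradient and the replicated border are assumptions of this sketch.

```python
import numpy as np

def total_variation_loss(img):
    """Average gradient magnitude per pixel, using forward differences."""
    img = np.asarray(img, dtype=np.float64)
    # Replicating the last row/column makes the border differences zero.
    dx = np.diff(img, axis=0, append=img[-1:])
    dy = np.diff(img, axis=1, append=img[:, -1:])
    grad_norm = np.sqrt(dx ** 2 + dy ** 2)
    return float(grad_norm.sum() / (img.shape[0] * img.shape[1]))

flat = total_variation_loss(np.full((4, 4), 7.0))
ramp = total_variation_loss(np.tile(np.arange(4.0), (4, 1)))
```

  • A constant image has zero total variation, while an image with spatial changes is penalized, which is what encourages spatially coherent solutions.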
  • Experiments
  • Data and Similarity Measures
  • To test the performance of the techniques and systems described herein, experiments were performed on the three widely used benchmark datasets Set5 (described in M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel, “Low-complexity single-image super-resolution based on nonnegative neighbor embedding,” BMVC (2012), which is incorporated herein by reference), Set14 (described in R. Zeyde, M. Elad, and M. Protter, “On single image scale-up using sparse-representations,” In Curves and Surfaces, pages 711-730, Springer, (2012), which is incorporated herein by reference), and BSD100 (described in D. Martin, C. Fowlkes, D. Tal, and J. Malik, “A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics,” IEEE International Conference on Computer Vision (ICCV), volume 2, pages 416-423, 2001, which is incorporated herein by reference). All experiments were performed with a downsampling factor of 4× applied to the original images in the datasets. For fair quantitative comparison, all reported PSNR [dB] and SSIM measures were calculated on the y-channel using the daala package available at github.com/xiph/daala. Super-resolved images for the reference methods bicubic, SRCNN and SelfExSR were obtained from github.com/jbhuang0604/SelfExSR and for DRCN from cv.snu.ac.kr/research/DRCN/.
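  • For orientation, a PSNR computed on a luminance channel can be sketched as below. This is not the daala implementation used for the reported numbers; the BT.601 luma weights, the full-range 8-bit peak of 255, and the function names are assumptions of the sketch.

```python
import numpy as np

def rgb_to_y(rgb):
    """Luma from RGB using BT.601 weights (an assumption for this sketch)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def psnr_y(ref_rgb, test_rgb, peak=255.0):
    """PSNR in dB between the luma channels of two RGB images."""
    mse = np.mean((rgb_to_y(ref_rgb) - rgb_to_y(test_rgb)) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))
```

  • A black image compared with a white image gives a PSNR of approximately 0 dB, and identical images give an infinite PSNR.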
  • Training Details and Parameters
  • All networks were trained on an NVIDIA Tesla M40 GPU using a random sample of a large number (e.g., tens or hundreds of thousands) of images from the ImageNet database. These images were distinct from the Set5, Set14 and BSD100 testing images. The LR images were obtained by downsampling the HR images using a bicubic kernel with downsampling factor r=4. For each minibatch, 16 random 96×96 sub-images of distinct training images were cropped. Note that the generator model can be applied to images of arbitrary size, as it is fully convolutional. For optimization, Adam with β1=0.9 was used. The SRRES networks were trained with a learning rate of 10⁻⁴ and 10⁶ update iterations. We used the pre-trained MSE-based SRRES network as an initialization for the generator when training the actual GAN to avoid undesired local optima. All SRGAN network variants were trained with 100,000 update iterations at a learning rate of 10⁻⁴, and another 100,000 iterations at a lower learning rate of 10-. We alternated updates to the generator and discriminator networks. In one implementation, the generator network 406 can have 15 identical (B=15) residual blocks. With the help of the pretraining and the content loss function, the training of the generator and discriminator networks can be relatively stable.
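  • The minibatch construction described above (16 random 96×96 sub-images cropped from distinct training images) can be sketched as follows. The function name, the use of a NumPy random generator, and the in-memory list of images are assumptions of the example.

```python
import numpy as np

def sample_minibatch(images, rng, batch_size=16, crop=96):
    """Crop one random crop x crop patch from each of batch_size distinct images."""
    # replace=False guarantees that the chosen training images are distinct.
    chosen = rng.choice(len(images), size=batch_size, replace=False)
    crops = []
    for i in chosen:
        img = images[i]
        h, w = img.shape[:2]
        top = rng.integers(0, h - crop + 1)
        left = rng.integers(0, w - crop + 1)
        crops.append(img[top:top + crop, left:left + crop])
    return np.stack(crops)

rng = np.random.default_rng(0)
# Hypothetical training images of varying height, each at least 96 pixels per side.
training_images = [np.zeros((100 + i, 120, 3)) for i in range(20)]
batch = sample_minibatch(training_images, rng)
```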
  • The network architecture for the generator network 406 of GAN system 400 can combine the effectiveness of the efficient sub-pixel convolutional neural network (ESPCN) and the high performance of the ResNet. The performance of the generator network 406 for lSR=lMSE SR without any adversarial component, which can be referred to as SRResNet, was compared to bicubic interpolation and four state of the art frameworks: the super-resolution CNN (SRCNN), a method based on transformed self-exemplars (SelfExSR), a deeply-recursive convolutional network (DRCN), and the efficient sub-pixel CNN (ESPCN) allowing real-time video SR. Quantitative results confirmed that SRResNet sets a new state of the art on the three benchmark datasets.
  • Investigation of Content Loss
  • The effect of different content loss choices in the perceptual loss for the GAN-based networks, which can be referred to as SRGAN, was also investigated. Specifically, the following losses were investigated:
  • $$l^{SR} = \underbrace{\underbrace{l_X^{SR}}_{\text{content loss}} + \underbrace{10^{-3}\, l_{Gen}^{SR}}_{\text{adversarial loss}}}_{\text{perceptual loss (for VGG based content losses)}} \tag{7}$$
  • The term lX SR can represent different content losses, such as, for example, the standard MSE content loss, a loss defined on feature maps that represent lower-level features, a loss defined on feature maps of higher-level features from deeper network layers with more potential to focus on the content of the images, etc. It was determined that, even when combined with the adversarial loss, although MSE may provide solutions with relatively high PSNR values, the results achieved with a loss component more sensitive to visual perception are perceptually superior. This is caused by competition between the MSE-based content loss and the adversarial loss. In general, the further away the content loss is from pixel space, the perceptually better the result of the GAN system. Thus, we observed better texture detail using the higher-level VGG feature maps as compared with lower-level feature maps.
  • The experiments suggest superior perceptual performance of the proposed framework purely based on visual comparison. Standard quantitative measures such as PSNR and SSIM clearly fail to capture and accurately assess image quality with respect to the human visual system. The presented model can be extended to provide video SR in real-time, e.g., by performing SR techniques on frames of video data. The techniques described herein have a wide variety of applications in which increasing the resolution of a visual image would be helpful. For example, the resolution of still, or video, images can be enhanced, where the images are uploaded to a social media site, where the images are provided to a live streaming application or platform, where the images are presented in a video game or media stream, where the images are rendered in a virtual reality application or where the images are part of a spherical video or 360-degree video/image, where the images are formed by a microscope or a telescope, etc. In addition, visual images based on invisible radiation (e.g., X-rays, MRI images, infrared images, etc.) also can be enhanced with the techniques described herein.
  • To generate photo-realistic solutions to the SR problem, a content loss defined on feature maps of higher-level features from deeper network layers, which have more potential to focus on the content of the images, was found to yield the perceptually most convincing results, which we attribute to the potential of deeper network layers to represent features of higher abstraction away from pixel space. Feature maps of these deeper layers may focus purely on the content, while leaving the adversarial loss to focus on texture details that are the main difference between the super-resolved images without the adversarial loss and photo-realistic images. The development of loss functions that describe image spatial content, but that are orthogonal to the direction of the adversarial loss, can further improve photo-realistic image SR results.
  • Aspects and/or implementations of the techniques described herein can improve the effectiveness of synthesizing content using machine learning techniques. Certain aspects and/or implementations seek to provide techniques for generating hierarchical algorithms that can be used to enhance visual data based on received input visual data and a plurality of pieces of training data. Other aspects and/or implementations seek to provide techniques for machine learning.
  • In some implementations, it is possible to overcome the problem of performing super-resolution techniques based on an MSE approach by using one or more generative adversarial networks, including a generating network and a discriminating network and by using one or more loss functions that are not based only on MSE, but that also can be based on other perceptual loss functions, e.g., content loss, adversarial loss, and a regularization loss. As mentioned above, a least-squares method picks an average of all possible solutions, thereby resulting in an output which may not accurately represent a higher quality version of the inputted visual data. In contrast, the techniques described herein select a most probable output when compared to a training dataset and an output that is most realistic, as determined by the discriminator.
  • Further implementations may use this approach to generate high quality versions of inputted low quality visual data by training an algorithm so that the generating function is optimized. In some implementations, only low-quality data is required along with a high-quality reference dataset that may contain unrelated visual data.
  • An implementation is described in relation to FIG. 7, which shows a flow chart of a process used to train a network 710.
  • In one implementation, training the network 710 includes increasing the quality of the input visual data 720. It will be appreciated that the visual data can be processed in many ways, such as by creating photorealistic outputs, removing noise from received visual data, and generating or synthesizing new images. The network 710 receives at least one section of low-quality visual data 720 used to initialize the network 710 with a set of parameters 715. The network 710 may also receive a low-quality visual data training set 730. In some implementations, the low-quality visual data training set 730 may be a selection of low-quality images, frames of video data or rendered frames, although other types of low-quality visual data may be received by the network 710. The low-quality images or frames can include downsampled versions of high-quality images or frames.
  • The low-quality visual data training set 730 may be received by the network 710 from an external source, such as the Internet or may be stored in a memory of a computing device.
  • The low-quality visual data 720 can be used as a training dataset and can be provided to the network 710 that, using the parameters 715, seeks to produce an estimated enhanced quality visual dataset 740 corresponding to the low-quality visual data training set 730. In some implementations, only a subset of the low-quality visual data 720 may be used when producing the estimated enhanced quality visual dataset 740. The estimated enhanced quality visual dataset 740 may include a set of visual data representing enhanced quality versions of the corresponding lower quality visual data from a subset of the low-quality visual data training set 730. In some implementations, the entire low-quality visual data training set 730 may be used.
  • In some implementations, the enhanced quality visual dataset 740 may be used as an input to a comparison network 760, along with a high quality visual data reference set 750. The high-quality visual data reference set 750 may be received by the network 710, from an external source, such as the Internet, or may be stored in a memory of a computing device that is used to train the network 710.
  • The comparison network 760 may use a plurality of characteristics determined from the high-quality visual data reference set 750 and the estimated enhanced quality visual dataset 740 to determine similarities and differences between the two datasets 740, 750. The comparison may be made between empirical probability distributions of visual data. The plurality of characteristics used may include sufficient statistics computed across subsets of visual data.
  • The comparison network 760 may utilize an adversarial training procedure such as the one used to train a Generative Adversarial Network (GAN) that includes, for example, a generating network and a discriminating network. In some implementations, such a comparison network 760 may use a discriminator trained to discriminate between data items sampled from the high-quality visual data reference set 750 and those sampled from the estimated enhanced quality visual dataset 740. The classification accuracy of this discriminator may then form the basis of the comparison.
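  • How the classification accuracy of a discriminator might serve as the basis of the comparison can be sketched as follows. The function name, the 0.5 decision threshold, and the example probabilities are assumptions for illustration.

```python
def discriminator_accuracy(p_reference, p_estimated, threshold=0.5):
    """Fraction of items the discriminator classifies correctly.

    p_reference: predicted probability of "reference" for items from set 750.
    p_estimated: predicted probability of "reference" for items from set 740.
    """
    correct = sum(p >= threshold for p in p_reference)
    correct += sum(p < threshold for p in p_estimated)
    return correct / (len(p_reference) + len(p_estimated))

easy_to_separate = discriminator_accuracy([0.9, 0.8], [0.1, 0.2])
hard_to_separate = discriminator_accuracy([0.9, 0.4], [0.6, 0.1])
```

  • An accuracy near 1.0 indicates that the two datasets are easy to tell apart, while an accuracy near 0.5 (chance level for balanced sets) indicates that the estimated data has become hard to distinguish from the reference data.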
  • The comparison network 760 can produce updated parameters 765 that can be used to replace the parameters 715 of the network 710. Using the updated parameters 765, the method may iterate, seeking to reduce the differences between the plurality of characteristics determined from the high-quality visual data reference set 750 and the estimated enhanced quality visual data 740, each time using the updated parameters 765 produced by the comparison network 760.
  • The method continues to iterate until the network 710 produces an estimated enhanced quality visual data 740 representative of high quality visual data corresponding to the low-quality visual data training set 730. After training the network 710, an enhanced quality visual data 770 may be output, where the enhanced quality visual data 770 corresponds to an enhanced quality version of the at least one section of low-quality visual data 720.
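  • The iterate-compare-update loop of FIG. 7 can be sketched with a deliberately simple toy. Here the "network" is a single gain parameter, and the "comparison" matches one statistic (the mean) of the estimated data to that of an unpaired reference set. Both are stand-ins chosen for this sketch, as are all names and values; a real system would use the generator and adversarial comparison described above.

```python
def train_gain(low_quality, reference, gain=1.0, lr=0.1, max_iters=200, tol=1e-6):
    """Toy version of the FIG. 7 loop with a one-parameter 'network'."""
    ref_mean = sum(reference) / len(reference)
    lq_mean = sum(low_quality) / len(low_quality)
    for _ in range(max_iters):
        # Network step: produce the estimated enhanced data from the inputs.
        estimated = [gain * x for x in low_quality]
        # Comparison step: difference between one statistic of the two sets.
        diff = sum(estimated) / len(estimated) - ref_mean
        if abs(diff) < tol:
            break
        # Update step: gradient of 0.5 * diff**2 with respect to the gain.
        gain -= lr * diff * lq_mean
    return gain

# The reference values are unpaired with the inputs; only their statistics matter.
learned = train_gain([1.0, 2.0, 3.0], [4.0, 8.0, 6.0, 6.0])
```

  • In this toy the loop converges to a gain of about 3, the value at which the mean of the estimated data matches the mean of the reference data.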
  • In some implementations, the method may be used to apply a style transfer to the input visual data. For example, input visual data may include a computer graphics rendering, and the method may be used to process the computer graphics rendering. Using a photorealistic set of reference data 750, the output of the network 710 may appear to have photo-real characteristics to represent a photo-real version of the computer graphics rendering.
  • In some implementations, the trained network 710 may be used to recover information from corrupted, downsampled, compressed, or lower-quality input visual data, by using a reference dataset to recover estimates of the corrupted, downsampled, compressed, or lower-quality input visual data.
  • In yet further implementations, the trained network may be used for the removal of compression artifacts, dynamic range inference, image inpainting, image de-mosaicing, and denoising, from corrupted, downsampled, compressed, or lower-quality input visual data, thus allowing for a range of visual data to be processed, each with different quality degrading characteristics. It will be appreciated other characteristics that affect the quality of the visual data may be enhanced by the network. Furthermore, in some implementations, the network may be configured to process the visual data consisting of one or more of the above-mentioned quality characteristics.
  • Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
  • These computer programs (also known as programs, software, software applications or code) include machine instructions for a programmable processor and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms “machine-readable medium” and “computer-readable medium” refer to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term “machine-readable signal” refers to any signal used to provide machine instructions and/or data to a programmable processor.
  • To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
  • The systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.
  • The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
  • A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the specification.
  • It will also be understood that when an element is referred to as being on, connected to, electrically connected to, coupled to, or electrically coupled to another element, it may be directly on, connected or coupled to the other element, or one or more intervening elements may be present. In contrast, when an element is referred to as being directly on, directly connected to or directly coupled to another element, there are no intervening elements present. Although the terms directly on, directly connected to, or directly coupled to may not be used throughout the detailed description, elements that are shown as being directly on, directly connected or directly coupled can be referred to as such. The claims of the application may be amended to recite exemplary relationships described in the specification or shown in the figures.
  • While certain features of the described implementations have been illustrated as described herein, many modifications, substitutions, changes, and equivalents will now occur to those skilled in the art, and it should be understood that the implementations described herein have been presented by way of example only, not limitation, and various changes in form and details may be made. Any portion of the apparatus and/or methods described herein may be combined in any combination, except mutually exclusive combinations. The implementations described herein can include various combinations and/or sub-combinations of the functions, components and/or features of the different implementations described.
  • In addition, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. In addition, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems.

Claims (10)

What is claimed is:
1. A method comprising:
receiving an initial image;
generating a super-resolution image from the initial image by using a generator convolutional neural network trained to minimize perceptual loss from the initial image, the generator convolutional neural network being parameterized by first weights and biases selected to optimize processed visual data based on the comparison between the one or more characteristics of visual image training data and the one or more characteristics of a visual image reference dataset and by using a discriminator convolutional neural network that is parameterized by second weights and biases, wherein the discriminator convolutional neural network is trained to discriminate super-resolved image data from real image data; and
storing the generated super-resolution image.
2. The method of claim 1, wherein using the generator convolutional neural network trained to minimize perceptual loss includes using a generator convolutional neural network that minimizes a Euclidean distance between feature representations of an image that is reconstructed from a downsampled version of a reference image and the reference image.
3. The method of claim 1, wherein the generator convolutional neural network uses a perceptual loss function that is a weighted combination of content loss and adversarial loss.
4. The method of claim 1, wherein the generator convolutional neural network uses a perceptual loss function that is a weighted combination of content loss, adversarial loss, and regularization loss.
5. The method of claim 1, wherein using the generator convolutional neural network trained to minimize perceptual loss includes using a visual geometry group neural network.
6. A computer-readable medium storing a generator convolutional neural network and a discriminator convolutional neural network trained to generate an image using a method comprising:
receiving an initial image;
generating a super-resolution image from the initial image by using a generator convolutional neural network trained to minimize perceptual loss from the initial image, the generator convolutional neural network being parameterized by first weights and biases selected to optimize processed visual data based on the comparison between the one or more characteristics of visual image training data and the one or more characteristics of a visual image reference dataset and by using a discriminator convolutional neural network that is parameterized by second weights and biases, wherein the discriminator convolutional neural network is trained to discriminate super-resolved image data from real image data; and
storing the generated super-resolution image.
7. The computer-readable medium of claim 6, wherein using the generator convolutional neural network trained to minimize perceptual loss includes using a generator convolutional neural network that minimizes a Euclidean distance between feature representations of an image that is reconstructed from a downsampled version of a reference image and the reference image.
8. The computer-readable medium of claim 6, wherein the generator convolutional neural network uses a perceptual loss function that is a weighted combination of content loss and adversarial loss.
9. The computer-readable medium of claim 6, wherein the generator convolutional neural network uses a perceptual loss function that is a weighted combination of content loss, adversarial loss, and regularization loss.
10. The computer-readable medium of claim 6, wherein using the generator convolutional neural network trained to minimize perceptual loss includes using a visual geometry group neural network.
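
Claims 2 to 4 describe the perceptual loss as a Euclidean distance between feature representations, combined by weights with an adversarial term and, optionally, a regularization term. The following Python sketch illustrates that combination under stated assumptions. The function names, the choice of total variation as the regularizer, and the default weights are hypothetical and are not taken from the claims. Plain lists of numbers stand in for the feature maps that a feature-extraction network (for example a visual geometry group network) would produce.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def content_loss(feat_sr: Sequence[float], feat_hr: Sequence[float]) -> float:
    """Mean squared Euclidean distance between two feature vectors.

    feat_sr: features of the image reconstructed from a downsampled reference.
    feat_hr: features of the reference image itself.
    """
    if len(feat_sr) != len(feat_hr):
        raise ValueError("feature vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(feat_sr, feat_hr)) / len(feat_sr)


def adversarial_loss(d_probs: Sequence[float], eps: float = 1e-12) -> float:
    """Generator-side adversarial term: sum of -log D(G(x)).

    d_probs holds the discriminator's probability that each super-resolved
    sample is a real image. eps guards against log(0).
    """
    return sum(-math.log(max(p, eps)) for p in d_probs)


def tv_regularization(img: Matrix) -> float:
    """Total variation: sum of absolute differences between neighbouring pixels.

    Assumption: the claims only say "regularization loss"; total variation is
    used here purely as an example of such a term.
    """
    h, w = len(img), len(img[0])
    horiz = sum(abs(img[i][j + 1] - img[i][j]) for i in range(h) for j in range(w - 1))
    vert = sum(abs(img[i + 1][j] - img[i][j]) for i in range(h - 1) for j in range(w))
    return float(horiz + vert)


def perceptual_loss(content: float, adversarial: float, regularization: float = 0.0,
                    w_adv: float = 1e-3, w_reg: float = 2e-8) -> float:
    """Weighted combination of the three terms.

    w_adv and w_reg are illustrative placeholder weights, not values from the claims.
    """
    return content + w_adv * adversarial + w_reg * regularization


if __name__ == "__main__":
    c = content_loss([1.0, 2.0], [1.0, 4.0])
    a = adversarial_loss([0.9, 0.8])
    r = tv_regularization([[0.0, 1.0], [1.0, 0.0]])
    print(perceptual_loss(c, a, r))
```

Leaving `regularization` at its default of zero gives the two-term combination of content loss and adversarial loss. Passing a non-zero value gives the three-term variant.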
US17/302,537 2016-09-15 2021-05-05 Super resolution using a generative adversarial network Abandoned US20210264568A1 (en)

Priority Applications (1)

Application Number Publication Number Priority Date Filing Date Title
US17/302,537 US20210264568A1 (en) 2016-09-15 2021-05-05 Super resolution using a generative adversarial network

Applications Claiming Priority (4)

Application Number Publication Number Priority Date Filing Date Title
US201662395186P 2016-09-15 2016-09-15
US201662422012P 2016-11-14 2016-11-14
US15/706,428 US11024009B2 (en) 2016-09-15 2017-09-15 Super resolution using a generative adversarial network
US17/302,537 US20210264568A1 (en) 2016-09-15 2021-05-05 Super resolution using a generative adversarial network

Related Parent Applications (1)

Application Number Relation Publication Number Priority Date Filing Date Title
US15/706,428 Division US11024009B2 (en) 2016-09-15 2017-09-15 Super resolution using a generative adversarial network

Publications (1)

Publication Number Publication Date
US20210264568A1 (en) 2021-08-26

Family

ID=59955761

Family Applications (2)

Application Number Status Publication Number Priority Date Filing Date Title
US15/706,428 Active US11024009B2 (en) 2016-09-15 2017-09-15 Super resolution using a generative adversarial network
US17/302,537 Abandoned US20210264568A1 (en) 2016-09-15 2021-05-05 Super resolution using a generative adversarial network

Country Status (2)

Country Link
US (2) US11024009B2 (en)
WO (1) WO2018053340A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11416967B2 (en) * 2020-01-03 2022-08-16 Beijing Baidu Netcom Science And Technology Co., Ltd. Video processing method, apparatus, device and storage medium
US20220262106A1 (en) * 2021-02-18 2022-08-18 Robert Bosch Gmbh Device and method for training a machine learning system for generating images
US11748846B2 (en) 2018-07-03 2023-09-05 Nanotronics Imaging, Inc. Systems, devices, and methods for providing feedback on and improving the accuracy of super-resolution imaging
WO2023229589A1 (en) * 2022-05-25 2023-11-30 Innopeak Technology, Inc. Real-time video super-resolution for mobile devices
US20240161365A1 (en) * 2022-11-10 2024-05-16 International Business Machines Corporation Enhancing images in text documents

Families Citing this family (442)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3259920A1 (en) 2015-02-19 2017-12-27 Magic Pony Technology Limited Visual processing using temporal and spatial interpolation
GB201603144D0 (en) 2016-02-23 2016-04-06 Magic Pony Technology Ltd Training end-to-end video processes
GB201604672D0 (en) 2016-03-18 2016-05-04 Magic Pony Technology Ltd Generative methods of super resolution
US10198624B2 (en) * 2016-02-18 2019-02-05 Pinscreen, Inc. Segmentation-guided real-time facial performance capture
EP3298579B1 (en) 2016-04-12 2021-07-21 Magic Pony Technology Limited Visual data processing using energy networks
KR101974261B1 (en) * 2016-06-24 2019-04-30 한국과학기술원 Encoding method and apparatus comprising convolutional neural network(cnn) based in-loop filter, and decoding method and apparatus comprising convolutional neural network(cnn) based in-loop filter
US10366328B2 (en) * 2017-09-19 2019-07-30 Gyrfalcon Technology Inc. Approximating fully-connected layers with multiple arrays of 3x3 convolutional filter kernels in a CNN based integrated circuit
US10339445B2 (en) * 2016-10-10 2019-07-02 Gyrfalcon Technology Inc. Implementation of ResNet in a CNN based digital integrated circuit
US10360470B2 (en) * 2016-10-10 2019-07-23 Gyrfalcon Technology Inc. Implementation of MobileNet in a CNN based digital integrated circuit
KR102271285B1 (en) * 2016-11-09 2021-07-01 삼성전자주식회사 Image processing apparatus and method for processing image
US10121103B2 (en) * 2016-12-09 2018-11-06 Cisco Technologies, Inc. Scalable deep learning video analytics
CN108229508B (en) * 2016-12-15 2022-01-04 富士通株式会社 Training apparatus and training method for training image processing apparatus
KR101871098B1 (en) * 2017-01-12 2018-06-25 포항공과대학교 산학협력단 Apparatus and method for image processing
US10636141B2 (en) * 2017-02-09 2020-04-28 Siemens Healthcare Gmbh Adversarial and dual inverse deep learning networks for medical image analysis
US10303965B2 (en) * 2017-03-06 2019-05-28 Siemens Healthcare Gmbh Defective pixel identification using machine learning
US10600185B2 (en) * 2017-03-08 2020-03-24 Siemens Healthcare Gmbh Automatic liver segmentation using adversarial image-to-image network
US10489887B2 (en) 2017-04-10 2019-11-26 Samsung Electronics Co., Ltd. System and method for deep learning image super resolution
CN108932697B (en) * 2017-05-26 2020-01-17 杭州海康威视数字技术股份有限公司 Distortion removing method and device for distorted image and electronic equipment
US11273553B2 (en) * 2017-06-05 2022-03-15 Autodesk, Inc. Adapting simulation data to real-world conditions encountered by physical processes
CN109218727B (en) * 2017-06-30 2021-06-25 书法报视频媒体(湖北)有限公司 Video processing method and device
EP3649618A1 (en) 2017-07-03 2020-05-13 Artomatix Ltd. Systems and methods for providing non-parametric texture synthesis of arbitrary shape and/or material data in a unified framework
EP3662439A1 (en) * 2017-07-31 2020-06-10 Institut Pasteur Method, device, and computer program for improving the reconstruction of dense super-resolution images from diffraction-limited images acquired by single molecule localization microscopy
US11734955B2 (en) * 2017-09-18 2023-08-22 Board Of Trustees Of Michigan State University Disentangled representation learning generative adversarial network for pose-invariant face recognition
US10482575B2 (en) * 2017-09-28 2019-11-19 Intel Corporation Super-resolution apparatus and method for virtual and mixed reality
US10579785B2 (en) * 2017-09-29 2020-03-03 General Electric Company Automatic authentification for MES system using facial recognition
US10552944B2 (en) * 2017-10-13 2020-02-04 Adobe Inc. Image upscaling with controllable noise reduction using a neural network
JP2019079374A (en) * 2017-10-26 2019-05-23 株式会社Preferred Networks Image processing system, image processing method, and image processing program
EP3499459A1 (en) * 2017-12-18 2019-06-19 FEI Company Method, device and system for remote deep learning for microscopic image reconstruction and segmentation
US10592779B2 (en) 2017-12-21 2020-03-17 International Business Machines Corporation Generative adversarial network medical image generation for training of a classifier
US10540578B2 (en) * 2017-12-21 2020-01-21 International Business Machines Corporation Adapting a generative adversarial network to new data sources for image classification
US10937540B2 (en) 2017-12-21 2021-03-02 International Business Machines Corporation Medical image classification based on a generative adversarial network trained discriminator
US10388002B2 (en) * 2017-12-27 2019-08-20 Facebook, Inc. Automatic image correction using machine learning
CN108062780B (en) * 2017-12-29 2019-08-09 百度在线网络技术(北京)有限公司 Method for compressing image and device
US10699388B2 (en) * 2018-01-24 2020-06-30 Adobe Inc. Digital image fill
US10152970B1 (en) * 2018-02-08 2018-12-11 Capital One Services, Llc Adversarial learning and generation of dialogue responses
WO2019166332A1 (en) * 2018-02-27 2019-09-06 Koninklijke Philips N.V. Ultrasound system with a neural network for producing images from undersampled ultrasound data
CN110232392B (en) * 2018-03-05 2021-08-17 北京大学 Visual optimization method, optimization system, computer device and readable storage medium
US10867195B2 (en) * 2018-03-12 2020-12-15 Microsoft Technology Licensing, Llc Systems and methods for monitoring driver state
CN110276720B (en) * 2018-03-16 2021-02-12 华为技术有限公司 Image generation method and device
CN108537747A (en) * 2018-03-22 2018-09-14 南京大学 A kind of image repair method based on the convolutional neural networks with symmetrical parallel link
CN110309692B (en) * 2018-03-27 2023-06-02 杭州海康威视数字技术股份有限公司 Face recognition method, device and system, and model training method and device
CN108510004B (en) * 2018-04-04 2022-04-08 深圳大学 Cell classification method and system based on deep residual error network
CN108537733B (en) * 2018-04-11 2022-03-11 南京邮电大学 Super-resolution reconstruction method based on multi-path deep convolutional neural network
CN108573479A (en) * 2018-04-16 2018-09-25 西安电子科技大学 The facial image deblurring and restoration methods of confrontation type network are generated based on antithesis
US10956817B2 (en) * 2018-04-18 2021-03-23 Element Ai Inc. Unsupervised domain adaptation with similarity learning for images
CN108710896B (en) * 2018-04-24 2021-10-29 浙江工业大学 Domain learning method based on generative confrontation learning network
US10762337B2 (en) * 2018-04-27 2020-09-01 Apple Inc. Face synthesis using generative adversarial networks
CN108573243A (en) * 2018-04-27 2018-09-25 上海敏识网络科技有限公司 A kind of comparison method of the low quality face based on depth convolutional neural networks
CN108596830B (en) * 2018-04-28 2022-04-22 国信优易数据股份有限公司 Image style migration model training method and image style migration method
CN110473147A (en) * 2018-05-09 2019-11-19 腾讯科技(深圳)有限公司 A kind of video deblurring method and device
CN110472457A (en) * 2018-05-10 2019-11-19 成都视观天下科技有限公司 Low-resolution face image identification, restoring method, equipment and storage medium
CN108629753A (en) * 2018-05-22 2018-10-09 广州洪森科技有限公司 A kind of face image restoration method and device based on Recognition with Recurrent Neural Network
CN109801214B (en) * 2018-05-29 2023-08-29 京东方科技集团股份有限公司 Image reconstruction device, image reconstruction method, image reconstruction device, image reconstruction apparatus, computer-readable storage medium
CN108877832B (en) * 2018-05-29 2022-12-23 东华大学 Audio tone quality restoration system based on GAN
KR102184755B1 (en) * 2018-05-31 2020-11-30 서울대학교 산학협력단 Apparatus and Method for Training Super Resolution Deep Neural Network
CN108830209B (en) * 2018-06-08 2021-12-17 西安电子科技大学 Remote sensing image road extraction method based on generation countermeasure network
US10810460B2 (en) * 2018-06-13 2020-10-20 Cosmo Artificial Intelligence—AI Limited Systems and methods for training generative adversarial networks and use of trained generative adversarial networks
CN110619535B (en) * 2018-06-19 2023-07-14 华为技术有限公司 Data processing method and device
CN108921789A (en) * 2018-06-20 2018-11-30 华北电力大学 Super-resolution image reconstruction method based on recurrence residual error network
CN108921788A (en) * 2018-06-20 2018-11-30 华北电力大学 Image super-resolution method, device and storage medium based on deep layer residual error CNN
US10672174B2 (en) 2018-06-28 2020-06-02 Adobe Inc. Determining image handle locations
CN108877809B (en) * 2018-06-29 2020-09-22 北京中科智加科技有限公司 Speaker voice recognition method and device
US10621764B2 (en) * 2018-07-05 2020-04-14 Adobe Inc. Colorizing vector graphic objects
CN109190750B (en) * 2018-07-06 2021-06-08 国家计算机网络与信息安全管理中心 Small sample generation method and device based on countermeasure generation network
TWI667576B (en) * 2018-07-09 2019-08-01 國立中央大學 Machine learning method and machine learning device
CN109035142B (en) * 2018-07-16 2020-06-19 西安交通大学 Satellite image super-resolution method combining countermeasure network with aerial image prior
CN108921123A (en) * 2018-07-17 2018-11-30 重庆科技学院 A kind of face identification method based on double data enhancing
CN109086779B (en) * 2018-07-28 2021-11-09 天津大学 Attention target identification method based on convolutional neural network
CN109190665B (en) * 2018-07-30 2023-07-04 国网上海市电力公司 Universal image classification method and device based on semi-supervised generation countermeasure network
CN109345604B (en) * 2018-08-01 2023-07-18 深圳大学 Picture processing method, computer device and storage medium
EP3827412A4 (en) * 2018-08-01 2021-08-18 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method and apparatus for image processing
KR20200015095A (en) 2018-08-02 2020-02-12 삼성전자주식회사 Image processing apparatus and operating method for the same
US10706308B2 (en) * 2018-08-07 2020-07-07 Accenture Global Solutions Limited Image processing for automated object identification
CN110544488B (en) * 2018-08-09 2022-01-28 腾讯科技(深圳)有限公司 Method and device for separating multi-person voice
US10785646B2 (en) * 2018-08-24 2020-09-22 International Business Machines Corporation Use of generative adversarial networks (GANs) for robust transmitter authentication
CN110570358A (en) * 2018-09-04 2019-12-13 阿里巴巴集团控股有限公司 Vehicle loss image enhancement method and device based on GAN network
CN110569864A (en) * 2018-09-04 2019-12-13 阿里巴巴集团控股有限公司 Vehicle loss image generation method and device based on GAN network
CN109308461A (en) * 2018-09-06 2019-02-05 广东智媒云图科技股份有限公司 A kind of vehicle picture repairs the generation method of training sample
EP3620989A1 (en) * 2018-09-07 2020-03-11 Panasonic Intellectual Property Corporation of America Information processing method, information processing apparatus, and program
KR102553146B1 (en) * 2018-09-13 2023-07-07 삼성전자주식회사 Image processing apparatus and operating method for the same
CN109461120A (en) * 2018-09-19 2019-03-12 华中科技大学 A kind of microwave remote sensing bright temperature image reconstructing method based on SRGAN
CN109389556B (en) * 2018-09-21 2023-03-21 五邑大学 Multi-scale cavity convolutional neural network super-resolution reconstruction method and device
CN109345448B (en) * 2018-09-25 2023-05-05 广东工业大学 Contour map coloring method and device
CN109118432B (en) * 2018-09-26 2022-09-13 福建帝视信息科技有限公司 Image super-resolution reconstruction method based on rapid cyclic convolution network
CN110956582B (en) * 2018-09-26 2024-02-02 Tcl科技集团股份有限公司 Image processing method, device and equipment
US11514313B2 (en) * 2018-09-27 2022-11-29 Google Llc Sampling from a generator neural network using a discriminator neural network
US10978051B2 (en) * 2018-09-28 2021-04-13 Capital One Services, Llc Adversarial learning framework for persona-based dialogue modeling
EP3859681A4 (en) 2018-09-29 2021-12-15 Zhejiang University Method for generating facial animation from single image
CN109448083B (en) * 2018-09-29 2019-09-13 浙江大学 A method of human face animation is generated from single image
CN109255390B (en) * 2018-09-30 2021-01-29 京东方科技集团股份有限公司 Training image preprocessing method and module, discriminator and readable storage medium
CN109345455B (en) * 2018-09-30 2021-01-26 京东方科技集团股份有限公司 Image authentication method, authenticator and computer-readable storage medium
CN109272048B (en) * 2018-09-30 2022-04-12 北京工业大学 Pattern recognition method based on deep convolutional neural network
CN109360151B (en) * 2018-09-30 2021-03-05 京东方科技集团股份有限公司 Image processing method and system, resolution improving method and readable storage medium
CN109274621B (en) * 2018-09-30 2021-05-14 中国人民解放军战略支援部队信息工程大学 Communication protocol signal identification method based on depth residual error network
US11615505B2 (en) * 2018-09-30 2023-03-28 Boe Technology Group Co., Ltd. Apparatus and method for image processing, and system for training neural network
CN109410135B (en) * 2018-10-02 2022-03-18 复旦大学 Anti-learning image defogging and fogging method
US11232541B2 (en) * 2018-10-08 2022-01-25 Rensselaer Polytechnic Institute CT super-resolution GAN constrained by the identical, residual and cycle learning ensemble (GAN-circle)
CN109284786B (en) * 2018-10-10 2020-05-29 西安电子科技大学 SAR image terrain classification method for generating countermeasure network based on distribution and structure matching
KR102220029B1 (en) * 2018-10-12 2021-02-25 한국과학기술원 Method for processing unmatched low dose x-ray computed tomography image using artificial neural network and apparatus therefor
WO2020081125A1 (en) * 2018-10-17 2020-04-23 Purdue Research Foundation Analyzing complex single molecule emission patterns with deep learning
CN109543674B (en) * 2018-10-19 2023-04-07 天津大学 Image copy detection method based on generation countermeasure network
CN109410974B (en) * 2018-10-23 2021-09-28 百度在线网络技术(北京)有限公司 Voice enhancement method, device, equipment and storage medium
CN109359608B (en) * 2018-10-25 2021-10-19 电子科技大学 Face recognition method based on deep learning model
CN109360153B (en) * 2018-10-26 2023-05-02 北京金山云网络技术有限公司 Image processing method, super-resolution model generation method and device and electronic equipment
CN111105349B (en) * 2018-10-26 2022-02-11 珠海格力电器股份有限公司 Image processing method
US10314477B1 (en) 2018-10-31 2019-06-11 Capital One Services, Llc Systems and methods for dynamically modifying visual content to account for user visual impairment
CN109410239B (en) * 2018-11-07 2021-11-16 南京大学 Text image super-resolution reconstruction method based on condition generation countermeasure network
CN109658347A (en) * 2018-11-14 2019-04-19 天津大学 Data enhancement method for simultaneously generating multiple picture styles
CN111191667B (en) * 2018-11-15 2023-08-18 天津大学青岛海洋技术研究院 Crowd counting method based on multiscale generation countermeasure network
US10885384B2 (en) * 2018-11-15 2021-01-05 Intel Corporation Local tone mapping to reduce bit depth of input images to high-level computer vision tasks
JP6569047B1 (en) * 2018-11-28 2019-09-04 株式会社ツバサファクトリー Learning method, computer program, classifier, and generator
CN109636746B (en) * 2018-11-30 2020-09-08 上海皓桦科技股份有限公司 Image noise removing system, method and equipment
CN109615582B (en) * 2018-11-30 2023-09-01 北京工业大学 Face image super-resolution reconstruction method for generating countermeasure network based on attribute description
US11087170B2 (en) * 2018-12-03 2021-08-10 Advanced Micro Devices, Inc. Deliberate conditional poison training for generative models
CN109741255A (en) * 2018-12-12 2019-05-10 深圳先进技术研究院 PET image super-resolution reconstruction method, device, equipment and medium based on decision tree
TWI705340B (en) * 2018-12-13 2020-09-21 財團法人工業技術研究院 Training method for phase image generator and training method of phase image classifier
CN109685716B (en) * 2018-12-14 2022-12-20 大连海事大学 Image super-resolution reconstruction method for generating countermeasure network based on Gaussian coding feedback
CN109815800A (en) * 2018-12-17 2019-05-28 广东电网有限责任公司 Object detection method and system based on regression algorithm
CN111340879B (en) * 2018-12-17 2023-09-01 台达电子工业股份有限公司 Image positioning system and method based on up-sampling
CN109801228A (en) * 2018-12-18 2019-05-24 合肥阿巴赛信息科技有限公司 A kind of jewelry picture beautification algorithm based on deep learning
US20220027709A1 (en) * 2018-12-18 2022-01-27 Nokia Technologies Oy Data denoising based on machine learning
DE102018222300A1 (en) 2018-12-19 2020-06-25 Leica Microsystems Cms Gmbh Scaling detection
WO2020125505A1 (en) * 2018-12-21 2020-06-25 Land And Fields Limited Image processing system
KR102169242B1 (en) * 2018-12-26 2020-10-23 포항공과대학교 산학협력단 Machine Learning Method for Restoring Super-Resolution Image
RU2697928C1 (en) * 2018-12-28 2019-08-21 Самсунг Электроникс Ко., Лтд. Superresolution of an image imitating high detail based on an optical system, performed on a mobile device having limited resources, and a mobile device which implements
CN109858362A (en) * 2018-12-28 2019-06-07 浙江工业大学 A kind of mobile terminal method for detecting human face based on inversion residual error structure and angle associated losses function
CN111382775B (en) * 2018-12-29 2023-10-20 清华大学 Generating countermeasure network for X-ray image processing and method therefor
CN109509152B (en) * 2018-12-29 2022-12-20 大连海事大学 Image super-resolution reconstruction method for generating countermeasure network based on feature fusion
CN111383187B (en) * 2018-12-29 2024-04-26 Tcl科技集团股份有限公司 Image processing method and device and intelligent terminal
US11196769B2 (en) 2019-01-02 2021-12-07 International Business Machines Corporation Efficient bootstrapping of transmitter authentication and use thereof
CN109949219B (en) * 2019-01-12 2021-03-26 深圳先进技术研究院 Reconstruction method, device and equipment of super-resolution image
CN109903223B (en) * 2019-01-14 2023-08-25 北京工商大学 Image super-resolution method based on dense connection network and generation type countermeasure network
CN109816593B (en) * 2019-01-18 2022-12-20 大连海事大学 Super-resolution image reconstruction method for generating countermeasure network based on attention mechanism
CN109785270A (en) * 2019-01-18 2019-05-21 四川长虹电器股份有限公司 A kind of image super-resolution method based on GAN
CN109903236B (en) * 2019-01-21 2020-12-18 南京邮电大学 Face image restoration method and device based on VAE-GAN and similar block search
CN109492627B (en) * 2019-01-22 2022-11-08 华南理工大学 Scene text erasing method based on depth model of full convolution network
CN109815893B (en) * 2019-01-23 2021-03-26 中山大学 Color face image illumination domain normalization method based on cyclic generation countermeasure network
US11521131B2 (en) * 2019-01-24 2022-12-06 Jumio Corporation Systems and methods for deep-learning based super-resolution using multiple degradations on-demand learning
CN110458765B (en) * 2019-01-25 2022-12-02 西安电子科技大学 Image quality enhancement method based on perception preserving convolution network
CN109785237B (en) * 2019-01-25 2022-10-18 广东工业大学 Terahertz image super-resolution reconstruction method, system and related device
US20200242771A1 (en) * 2019-01-25 2020-07-30 Nvidia Corporation Semantic image synthesis for generating substantially photorealistic images using neural networks
CN111488895B (en) * 2019-01-28 2024-01-30 北京达佳互联信息技术有限公司 Countermeasure data generation method, device, equipment and storage medium
US10380724B1 (en) * 2019-01-28 2019-08-13 StradVision, Inc. Learning method and learning device for reducing distortion occurred in warped image generated in process of stabilizing jittered image by using GAN to enhance fault tolerance and fluctuation robustness in extreme situations
CN109859107B (en) * 2019-02-12 2023-04-07 广东工业大学 Remote sensing image super-resolution method, device, equipment and readable storage medium
JP7354268B2 (en) * 2019-02-20 2023-10-02 サウジ アラビアン オイル カンパニー A method for high-speed calculation of earthquake attributes using artificial intelligence
US10832734B2 (en) 2019-02-25 2020-11-10 International Business Machines Corporation Dynamic audiovisual segment padding for machine learning
CN109978762B (en) * 2019-02-27 2023-06-16 南京信息工程大学 Super-resolution reconstruction method based on condition generation countermeasure network
EP3932318A4 (en) 2019-02-28 2022-04-20 FUJIFILM Corporation Learning method, learning system, learned model, program, and super-resolution image generation device
GB2581991B (en) * 2019-03-06 2022-06-01 Huawei Tech Co Ltd Enhancement of three-dimensional facial scans
US11024013B2 (en) * 2019-03-08 2021-06-01 International Business Machines Corporation Neural network based enhancement of intensity images
JP7504120B2 (en) * 2019-03-18 2024-06-21 グーグル エルエルシー High-resolution real-time artistic style transfer pipeline
CN110020987B (en) * 2019-03-24 2023-06-30 北京工业大学 Medical image super-resolution reconstruction method based on deep learning
CN110084119A (en) * 2019-03-26 2019-08-02 安徽艾睿思智能科技有限公司 Low-resolution face image recognition methods based on deep learning
US11449989B2 (en) * 2019-03-27 2022-09-20 The General Hospital Corporation Super-resolution anatomical magnetic resonance imaging using deep learning for cerebral cortex segmentation
US11455495B2 (en) 2019-04-02 2022-09-27 Synthesis Ai, Inc. System and method for visual recognition using synthetic training data
CN111489290B (en) * 2019-04-02 2023-05-16 长信智控网络科技有限公司 Face image super-resolution reconstruction method and device and terminal equipment
US11120526B1 (en) 2019-04-05 2021-09-14 Snap Inc. Deep feature generative adversarial neural networks
CN109993702B (en) * 2019-04-10 2023-09-26 大连民族大学 Full-text image super-resolution reconstruction method based on generation countermeasure network
CN110009568A (en) * 2019-04-10 2019-07-12 大连民族大学 Generator construction method for Manchu-script image super-resolution reconstruction
CN110189253B (en) * 2019-04-16 2023-03-31 浙江工业大学 Image super-resolution reconstruction method based on improved generation countermeasure network
US11900026B1 (en) 2019-04-24 2024-02-13 X Development Llc Learned fabrication constraints for optimizing physical devices
CN111861878B (en) * 2019-04-30 2023-09-22 达音网络科技(上海)有限公司 Optimizing a supervisory generated countermeasure network through latent spatial regularization
US11048974B2 (en) * 2019-05-06 2021-06-29 Agora Lab, Inc. Effective structure keeping for generative adversarial networks for single image super resolution
CN110276708B (en) * 2019-05-08 2023-04-18 山东浪潮科学研究院有限公司 Image digital watermark generation and identification system and method based on GAN network
CN110197458B (en) * 2019-05-14 2023-08-01 广州视源电子科技股份有限公司 Training method and device for visual angle synthesis network, electronic equipment and storage medium
US11263726B2 (en) * 2019-05-16 2022-03-01 Here Global B.V. Method, apparatus, and system for task driven approaches to super resolution
CN110120024B (en) * 2019-05-20 2021-08-17 百度在线网络技术(北京)有限公司 Image processing method, device, equipment and storage medium
CN111986069A (en) 2019-05-22 2020-11-24 三星电子株式会社 Image processing apparatus and image processing method thereof
CN111986127B (en) * 2019-05-22 2022-03-08 腾讯科技(深圳)有限公司 Image processing method and device, computer equipment and storage medium
KR102410907B1 (en) * 2019-05-22 2022-06-21 삼성전자주식회사 Image processing apparatus and image processing method thereof
CN110166779B (en) * 2019-05-23 2021-06-08 西安电子科技大学 Video compression method based on super-resolution reconstruction
CN110175953B (en) * 2019-05-24 2023-04-18 鹏城实验室 Image super-resolution method and system
CN110136067B (en) * 2019-05-27 2022-09-06 商丘师范学院 Real-time image generation method for super-resolution B-mode ultrasound image
CN110189255B (en) * 2019-05-29 2023-01-17 电子科技大学 Face detection method based on two-stage detection
CN110189276A (en) * 2019-05-31 2019-08-30 华东理工大学 A kind of facial image restorative procedure based on very big radius circle domain
CN111612711B (en) * 2019-05-31 2023-06-09 北京理工大学 Picture deblurring method based on generation of countermeasure network improvement
CN110222758B (en) * 2019-05-31 2024-04-23 腾讯科技(深圳)有限公司 Image processing method, device, equipment and storage medium
US11379633B2 (en) 2019-06-05 2022-07-05 X Development Llc Cascading models for optimization of fabrication and design of a physical device
US12079954B2 (en) 2019-06-10 2024-09-03 Google Llc Modifying sensor data using generative adversarial models
CN110310345A (en) * 2019-06-11 2019-10-08 同济大学 A kind of image generating method generating confrontation network based on hidden cluster of dividing the work automatically
CN112087556B (en) * 2019-06-12 2023-04-07 武汉Tcl集团工业研究院有限公司 Dark light imaging method and device, readable storage medium and terminal equipment
DE102019208496B4 (en) * 2019-06-12 2024-01-25 Siemens Healthcare Gmbh Computer-implemented methods and devices for providing a difference image data set of an examination volume and for providing a trained generator function
CN110322403A (en) * 2019-06-19 2019-10-11 怀光智能科技(武汉)有限公司 A kind of more supervision Image Super-resolution Reconstruction methods based on generation confrontation network
CN110276399B (en) * 2019-06-24 2021-06-04 厦门美图之家科技有限公司 Image conversion network training method and device, computer equipment and storage medium
CN110415309B (en) * 2019-06-26 2023-09-08 公安部第三研究所 Method for automatically generating fingerprint pictures based on generation countermeasure network
CN110310227B (en) * 2019-06-27 2020-09-08 电子科技大学 Image super-resolution reconstruction method based on high-low frequency information decomposition
CN110298791B (en) * 2019-07-08 2022-10-28 西安邮电大学 Super-resolution reconstruction method and device for license plate image
CN112215761A (en) * 2019-07-12 2021-01-12 华为技术有限公司 Image processing method, device and equipment
CN110490968B (en) * 2019-07-18 2022-10-04 西安理工大学 Optical field axial refocusing image super-resolution method based on generation countermeasure network
CN110532871B (en) * 2019-07-24 2022-05-10 华为技术有限公司 Image processing method and device
CN110428378B (en) 2019-07-26 2022-02-08 北京小米移动软件有限公司 Image processing method, device and storage medium
CN110660025B (en) * 2019-08-02 2023-01-17 西安理工大学 Industrial monitoring video image sharpening method based on GAN network
CN110415261B (en) * 2019-08-06 2021-03-16 山东财经大学 Expression animation conversion method and system for regional training
CN112396554B (en) * 2019-08-14 2023-04-25 天津大学青岛海洋技术研究院 Image super-resolution method based on generation of countermeasure network
KR20210020387A (en) * 2019-08-14 2021-02-24 삼성전자주식회사 Electronic apparatus and control method thereof
CN110458133A (en) * 2019-08-19 2019-11-15 电子科技大学 Lightweight method for detecting human face based on production confrontation network
CN110570353B (en) * 2019-08-27 2023-05-12 天津大学 Super-resolution reconstruction method for generating single image of countermeasure network by dense connection
CN110504029B (en) * 2019-08-29 2022-08-19 腾讯医疗健康(深圳)有限公司 Medical image processing method, medical image identification method and medical image identification device
CN110503606B (en) * 2019-08-29 2023-06-20 广州大学 Method for improving face definition
WO2021035629A1 (en) * 2019-08-29 2021-03-04 深圳市大疆创新科技有限公司 Method for acquiring image quality enhancement network, image quality enhancement method and apparatus, mobile platform, camera, and storage medium
CN110634170B (en) * 2019-08-30 2022-09-13 福建帝视信息科技有限公司 Photo-level image generation method based on semantic content and rapid image retrieval
CN110517203B (en) * 2019-08-30 2023-06-23 山东工商学院 Defogging method based on reference image reconstruction
CN110675335B (en) * 2019-08-31 2022-09-06 南京理工大学 Superficial vein enhancement method based on multi-resolution residual error fusion network
US10628931B1 (en) 2019-09-05 2020-04-21 International Business Machines Corporation Enhancing digital facial image using artificial intelligence enabled digital facial image generation
CN110533623B (en) * 2019-09-06 2022-09-30 兰州交通大学 Full convolution neural network multi-focus image fusion method based on supervised learning
CN110533004A (en) * 2019-09-07 2019-12-03 哈尔滨理工大学 A kind of complex scene face identification system based on deep learning
CN110570355B (en) * 2019-09-12 2020-09-01 杭州海睿博研科技有限公司 Multi-scale automatic focusing super-resolution processing system and method
US11152785B1 (en) * 2019-09-17 2021-10-19 X Development Llc Power grid assets prediction using generative adversarial networks
CN110706157B (en) * 2019-09-18 2022-09-30 中国科学技术大学 Face super-resolution reconstruction method for generating confrontation network based on identity prior
CN110689482B (en) * 2019-09-18 2022-09-30 中国科学技术大学 Face super-resolution method based on supervised pixel-by-pixel generation countermeasure network
CN110689483B (en) * 2019-09-24 2022-07-01 重庆邮电大学 Image super-resolution reconstruction method based on depth residual error network and storage medium
CN110852944B (en) * 2019-10-12 2023-11-21 天津大学 Multi-frame self-adaptive fusion video super-resolution method based on deep learning
KR102624027B1 (en) * 2019-10-17 2024-01-11 삼성전자주식회사 Image processing apparatus and method
US11645791B2 (en) * 2019-10-17 2023-05-09 Rutgers, The State University Of New Jersey Systems and methods for joint reconstruction and segmentation of organs from magnetic resonance imaging data
US11586912B2 (en) 2019-10-18 2023-02-21 International Business Machines Corporation Integrated noise generation for adversarial training
TWI730467B (en) 2019-10-22 2021-06-11 財團法人工業技術研究院 Method of transforming image and network for transforming image
CN110751224B (en) * 2019-10-25 2022-08-05 Oppo广东移动通信有限公司 Training method of video classification model, video classification method, device and equipment
KR20210050684A (en) 2019-10-29 2021-05-10 에스케이하이닉스 주식회사 Image processing system
CN111127316B (en) * 2019-10-29 2022-10-25 山东大学 Single face image super-resolution method and system based on SNGAN network
CN110796080B (en) * 2019-10-29 2023-06-16 重庆大学 Multi-pose pedestrian image synthesis algorithm based on generation countermeasure network
CN110796622B (en) * 2019-10-30 2023-04-18 天津大学 Image bit enhancement method based on multi-layer characteristics of series neural network
CN110827258B (en) * 2019-10-31 2022-02-11 西安交通大学 Device and method for screening diabetic retinopathy based on counterstudy
CN110827201A (en) * 2019-11-05 2020-02-21 广东三维家信息科技有限公司 Generative confrontation network training method and device for high-dynamic-range image super-resolution reconstruction
CN111563841B (en) * 2019-11-13 2023-07-25 南京信息工程大学 High-resolution image generation method based on generation countermeasure network
CN112905132B (en) * 2019-11-19 2023-07-18 华为技术有限公司 Screen projection method and device
KR20210061597A (en) 2019-11-20 2021-05-28 삼성전자주식회사 Method and device to improve radar data using reference data
CN111008930B (en) * 2019-11-20 2024-03-19 武汉纺织大学 Fabric image super-resolution reconstruction method
WO2021101243A1 (en) 2019-11-20 2021-05-27 Samsung Electronics Co., Ltd. Apparatus and method for using ai metadata related to image quality
US11599974B2 (en) * 2019-11-22 2023-03-07 Nec Corporation Joint rolling shutter correction and image deblurring
CN111047512B (en) * 2019-11-25 2022-02-01 中国科学院深圳先进技术研究院 Image enhancement method and device and terminal equipment
CN111008938B (en) * 2019-11-25 2023-04-14 天津大学 Real-time multi-frame bit enhancement method based on content and continuity guidance
CN111091616B (en) * 2019-11-25 2024-01-05 艾瑞迈迪医疗科技(北京)有限公司 Reconstruction method and device of three-dimensional ultrasonic image
CN110992262B (en) * 2019-11-26 2023-04-07 南阳理工学院 Remote sensing image super-resolution reconstruction method based on generation countermeasure network
CN111161141B (en) * 2019-11-26 2023-02-28 西安电子科技大学 Hyperspectral simple graph super-resolution method for counterstudy based on inter-band attention mechanism
CN111145131B (en) * 2019-11-28 2023-05-26 中国矿业大学 Infrared and visible light image fusion method based on multiscale generation type countermeasure network
CN111179187B (en) * 2019-12-09 2022-09-27 南京理工大学 Single image rain removing method based on cyclic generation countermeasure network
CN111010493B (en) * 2019-12-12 2021-03-02 清华大学 Method and device for video processing by using convolutional neural network
CN111127587B (en) * 2019-12-16 2023-06-23 杭州电子科技大学 Reference-free image quality map generation method based on countermeasure generation network
CN111105352B (en) * 2019-12-16 2023-04-25 佛山科学技术学院 Super-resolution image reconstruction method, system, computer equipment and storage medium
CN111091151B (en) * 2019-12-17 2021-11-05 大连理工大学 Construction method of generation countermeasure network for target detection data enhancement
US20220383452A1 (en) * 2019-12-20 2022-12-01 Beijing Kingsoft Cloud Network Technology Co., Ltd. Method, apparatus, electronic device and medium for image super-resolution and model training
CN111080528B (en) * 2019-12-20 2023-11-07 北京金山云网络技术有限公司 Image super-resolution and model training method and device, electronic equipment and medium
CN111260594B (en) * 2019-12-22 2023-10-31 天津大学 Unsupervised multi-mode image fusion method
CN111047515B (en) * 2019-12-29 2024-01-09 兰州理工大学 Attention mechanism-based cavity convolutional neural network image super-resolution reconstruction method
CN111179177B (en) * 2019-12-31 2024-03-26 深圳市联合视觉创新科技有限公司 Image reconstruction model training method, image reconstruction method, device and medium
CN111161137B (en) * 2019-12-31 2023-04-11 四川大学 Multi-style Chinese painting flower generation method based on neural network
CN111080531B (en) * 2020-01-10 2024-02-23 北京农业信息技术研究中心 Super-resolution reconstruction method, system and device for underwater fish image
CN111311472B (en) * 2020-01-15 2023-03-28 中国科学技术大学 Property right protection method for image processing model and image processing algorithm
IT202000000664A1 (en) 2020-01-15 2021-07-15 Digital Design S R L GENERATIVE SYSTEM FOR THE CREATION OF DIGITAL IMAGES FOR PRINTING ON DESIGN SURFACES
CN111260584A (en) * 2020-01-17 2020-06-09 北京工业大学 Underwater degraded image enhancement method based on GAN network
CN111275713B (en) * 2020-02-03 2022-04-12 武汉大学 Cross-domain semantic segmentation method based on countermeasure self-integration network
US11157763B2 (en) 2020-02-07 2021-10-26 Wipro Limited System and method for identifying target sections within images
CN111402196A (en) * 2020-02-10 2020-07-10 浙江工业大学 Bearing roller image generation method based on countermeasure generation network
CN111414990B (en) * 2020-02-20 2024-03-19 北京迈格威科技有限公司 Convolutional neural network processing method and device, electronic equipment and storage medium
CN111429402B (en) * 2020-02-25 2023-05-30 西北大学 Image quality evaluation method for fusion of advanced visual perception features and depth features
CN111355965B (en) * 2020-02-28 2022-02-25 中国工商银行股份有限公司 Image compression and restoration method and device based on deep learning
US11638032B2 (en) * 2020-03-05 2023-04-25 The Hong Kong University Of Science And Technology VistGAN: unsupervised video super-resolution with temporal consistency using GAN
CN111507898A (en) * 2020-03-16 2020-08-07 徐州工程学院 Image super-resolution reconstruction method based on self-adaptive adjustment
CN111368790A (en) * 2020-03-18 2020-07-03 北京三快在线科技有限公司 Construction method, identification method and construction device of fine-grained face identification model
CN111402137B (en) * 2020-03-20 2023-04-18 南京信息工程大学 Depth attention coding and decoding single image super-resolution algorithm based on perception loss guidance
CN113496465A (en) * 2020-03-20 2021-10-12 微软技术许可有限责任公司 Image scaling
CN111461977B (en) * 2020-03-26 2022-07-26 华南理工大学 Power data super-resolution reconstruction method based on improved generation type countermeasure network
CN111383200B (en) * 2020-03-30 2023-05-23 西安理工大学 CFA image demosaicing method based on generated antagonistic neural network
CN111489305B (en) * 2020-03-31 2023-05-30 天津大学 Image enhancement method based on reinforcement learning
CN111414888A (en) * 2020-03-31 2020-07-14 杭州博雅鸿图视频技术有限公司 Low-resolution face recognition method, system, device and storage medium
US11900563B2 (en) * 2020-04-01 2024-02-13 Boe Technology Group Co., Ltd. Computer-implemented method, apparatus, and computer-program product
CN111539263B (en) * 2020-04-02 2023-08-11 江南大学 Video face recognition method based on aggregation countermeasure network
CN111476353B (en) * 2020-04-07 2022-07-15 中国科学院重庆绿色智能技术研究院 Super-resolution method of GAN image introducing significance
CN111614974B (en) * 2020-04-07 2021-11-30 上海推乐信息技术服务有限公司 Video image restoration method and system
CN111626927B (en) * 2020-04-09 2023-05-30 上海交通大学 Binocular image super-resolution method, system and device adopting parallax constraint
GB2600348A (en) * 2020-04-15 2022-04-27 Nvidia Corp Video compression and decompression using neural networks
US20210329306A1 (en) * 2020-04-15 2021-10-21 Nvidia Corporation Video compression using neural networks
CN111583109B (en) * 2020-04-23 2024-02-13 华南理工大学 Image super-resolution method based on generation of countermeasure network
CN112699844B (en) * 2020-04-23 2023-06-20 华南理工大学 Image super-resolution method based on multi-scale residual hierarchy close-coupled network
CN113556496B (en) * 2020-04-23 2022-08-09 京东方科技集团股份有限公司 Video resolution improving method and device, storage medium and electronic equipment
CN111539940B (en) * 2020-04-27 2023-06-09 上海鹰瞳医疗科技有限公司 Super wide angle fundus image generation method and equipment
CN111553861B (en) * 2020-04-29 2023-11-24 苏州大学 Image super-resolution reconstruction method, device, equipment and readable storage medium
CN111583113A (en) * 2020-04-30 2020-08-25 电子科技大学 Infrared image super-resolution reconstruction method based on generation countermeasure network
US11948281B2 (en) * 2020-05-01 2024-04-02 Adobe Inc. Guided up-sampling for image inpainting
CN111696026B (en) * 2020-05-06 2023-06-23 华南理工大学 Reversible gray scale graph algorithm and computing equipment based on L0 regular term
CN113628121B (en) * 2020-05-06 2023-11-14 阿里巴巴集团控股有限公司 Method and device for processing and training multimedia data
CN111539897A (en) * 2020-05-09 2020-08-14 北京百度网讯科技有限公司 Method and apparatus for generating image conversion model
CN111598808B (en) * 2020-05-18 2022-08-23 腾讯科技(深圳)有限公司 Image processing method, device and equipment and training method thereof
CN111695455B (en) * 2020-05-28 2023-11-10 广西申能达智能技术有限公司 Low-resolution face recognition method based on coupling discrimination manifold alignment
CN111738267B (en) * 2020-05-29 2023-04-18 南京邮电大学 Visual perception method and visual perception device based on linear multi-step residual learning
CN111753670A (en) * 2020-05-29 2020-10-09 清华大学 Human face overdividing method based on iterative cooperation of attention restoration and key point detection
TWI768364B (en) * 2020-06-01 2022-06-21 宏碁股份有限公司 Method and electronic device for processing images that can be played on a virtual device
CN111931553B (en) * 2020-06-03 2024-02-06 西安电子科技大学 Method, system, storage medium and application for enhancing generation of remote sensing data into countermeasure network
CN115699099A (en) * 2020-06-04 2023-02-03 谷歌有限责任公司 Visual asset development using generation of countermeasure networks
CN111667004B (en) * 2020-06-05 2024-05-31 孝感市思创信息科技有限公司 Data generation method, device, equipment and storage medium
US11640711B2 (en) 2020-06-05 2023-05-02 Advanced Micro Devices, Inc. Automated artifact detection
CN111667409B (en) * 2020-06-09 2024-03-22 云南电网有限责任公司电力科学研究院 Super-resolution algorithm-based insulator image resolution enhancement method
CN111833282B (en) * 2020-06-11 2023-08-04 毛雅淇 Image fusion method based on improved DDcGAN model
CN111652822B (en) * 2020-06-11 2023-03-31 西安理工大学 Single image shadow removing method and system based on generation countermeasure network
US12061862B2 (en) 2020-06-11 2024-08-13 Capital One Services, Llc Systems and methods for generating customized content based on user preferences
WO2021251614A1 (en) 2020-06-12 2021-12-16 Samsung Electronics Co., Ltd. Image processing apparatus and method of operating the same
WO2021248473A1 (en) * 2020-06-12 2021-12-16 Baidu.Com Times Technology (Beijing) Co., Ltd. Personalized speech-to-video with three-dimensional (3d) skeleton regularization and expressive body poses
CN111986079A (en) * 2020-06-16 2020-11-24 长安大学 Pavement crack image super-resolution reconstruction method and device based on generation countermeasure network
CN111738953A (en) * 2020-06-24 2020-10-02 北京航空航天大学 Atmospheric turbulence degraded image restoration method based on boundary perception counterstudy
CN111899168B (en) * 2020-07-02 2023-04-07 中国地质大学(武汉) Remote sensing image super-resolution reconstruction method and system based on feature enhancement
CN111951177B (en) * 2020-07-07 2022-10-11 浙江大学 Infrared image detail enhancement method based on image super-resolution loss function
EP3937120B1 (en) * 2020-07-08 2023-12-20 Sartorius Stedim Data Analytics AB Computer-implemented method, computer program product and system for processing images
CN111932454B (en) * 2020-07-22 2022-05-27 杭州电子科技大学 LOGO pattern reconstruction method based on improved binary closed-loop neural network
CN111861924B (en) * 2020-07-23 2023-09-22 成都信息工程大学 Cardiac magnetic resonance image data enhancement method based on evolutionary GAN
CN111861888A (en) * 2020-07-27 2020-10-30 上海商汤智能科技有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN111861930B (en) * 2020-07-27 2024-08-23 京东方科技集团股份有限公司 Image denoising method and device, electronic equipment and image super-resolution denoising method
CN112001868B (en) * 2020-07-30 2024-06-11 山东师范大学 Infrared and visible light image fusion method and system based on generation of antagonism network
CN111932456B (en) * 2020-07-31 2023-05-16 浙江师范大学 Single image super-resolution reconstruction method based on generation countermeasure network
CN112001427B (en) * 2020-08-04 2022-11-15 中国科学院信息工程研究所 Image conversion method and device based on analogy learning
CN111738230B (en) * 2020-08-05 2020-12-15 深圳市优必选科技股份有限公司 Face recognition method, face recognition device and electronic equipment
CN111915525B (en) * 2020-08-05 2024-03-01 湖北工业大学 Low-illumination image enhancement method capable of generating countermeasure network based on improved depth separation
CN111915491A (en) * 2020-08-14 2020-11-10 深圳清研智城科技有限公司 Weak supervision super-resolution reconstruction model and method based on distant and close scenes
CN112070667B (en) * 2020-08-14 2024-06-18 深圳市九分文化传媒有限公司 Multi-scale feature fusion video super-resolution reconstruction method
CN112102385B (en) * 2020-08-20 2023-02-10 复旦大学 Multi-modal liver magnetic resonance image registration system based on deep learning
CN114078089A (en) * 2020-08-21 2022-02-22 宏碁股份有限公司 Method for processing picture capable of being played on virtual device and electronic device
CN112102167B (en) * 2020-08-31 2024-04-26 深圳市航宇数字视觉科技有限公司 Image super-resolution method based on visual perception
CN112085677B (en) * 2020-09-01 2024-06-28 深圳先进技术研究院 Image processing method, system and computer storage medium
CN112184547B (en) * 2020-09-03 2023-05-05 红相股份有限公司 Super resolution method of infrared image and computer readable storage medium
US11366983B2 (en) 2020-09-09 2022-06-21 International Business Machines Corporation Study-level multi-view processing system
US12061672B2 (en) * 2020-09-10 2024-08-13 Canon Kabushiki Kaisha Image processing method, image processing apparatus, learning method, learning apparatus, and storage medium
CN112365398B (en) * 2020-09-11 2024-04-05 成都旷视金智科技有限公司 Super-resolution network training method, digital zooming method, device and electronic equipment
CN112070677B (en) * 2020-09-18 2024-04-02 中国科学技术大学 Video space-time super-resolution enhancement method based on time slicing
CN112132012B (en) * 2020-09-22 2022-04-26 中国科学院空天信息创新研究院 High-resolution SAR ship image generation method based on generation countermeasure network
CN112163998A (en) * 2020-09-24 2021-01-01 肇庆市博士芯电子科技有限公司 Single-image super-resolution analysis method matched with natural degradation conditions
CN112184582B (en) * 2020-09-28 2022-08-19 中科人工智能创新技术研究院(青岛)有限公司 Attention mechanism-based image completion method and device
CN112183727B (en) * 2020-09-29 2024-08-02 中科方寸知微(南京)科技有限公司 Countermeasure generation network model, and method and system for rendering scenery effect based on countermeasure generation network model
WO2022067653A1 (en) * 2020-09-30 2022-04-07 京东方科技集团股份有限公司 Image processing method and apparatus, device, video processing method, and storage medium
CN112215119B (en) * 2020-10-08 2022-04-12 华中科技大学 Small target identification method, device and medium based on super-resolution reconstruction
WO2022077417A1 (en) * 2020-10-16 2022-04-21 京东方科技集团股份有限公司 Image processing method, image processing device and readable storage medium
CN112232425B (en) * 2020-10-21 2023-11-28 腾讯科技(深圳)有限公司 Image processing method, device, storage medium and electronic equipment
CN112261415B (en) * 2020-10-23 2022-04-08 青海民族大学 Image compression coding method based on overfitting convolution self-coding network
US11538136B2 (en) * 2020-10-28 2022-12-27 Qualcomm Incorporated System and method to process images of a video stream
US20220138500A1 (en) * 2020-10-30 2022-05-05 Samsung Electronics Co., Ltd. Unsupervised super-resolution training data construction
US11908233B2 (en) 2020-11-02 2024-02-20 Pinscreen, Inc. Normalization of facial images using deep neural networks
CN112419177B (en) * 2020-11-10 2023-04-07 中国人民解放军陆军炮兵防空兵学院 Single image motion blur removing-oriented perception quality blind evaluation method
CN112435162B (en) * 2020-11-13 2024-03-05 中国科学院沈阳自动化研究所 Terahertz image super-resolution reconstruction method based on complex domain neural network
CN112419151B (en) * 2020-11-19 2023-07-21 北京有竹居网络技术有限公司 Image degradation processing method and device, storage medium and electronic equipment
CN112419192B (en) * 2020-11-24 2022-09-09 北京航空航天大学 Convolutional neural network-based ISMS image restoration and super-resolution reconstruction method and device
CN113191945B (en) * 2020-12-03 2023-10-27 陕西师范大学 Heterogeneous platform-oriented high-energy-efficiency image super-resolution system and method thereof
CN112488956A (en) * 2020-12-14 2021-03-12 南京信息工程大学 Method for image restoration based on WGAN network
KR102273377B1 (en) * 2020-12-14 2021-07-06 국방기술품질원 Method for synthesizing image
CN112541876B (en) * 2020-12-15 2023-08-04 北京百度网讯科技有限公司 Satellite image processing method, network training method, related device and electronic equipment
US11874899B2 (en) 2020-12-15 2024-01-16 International Business Machines Corporation Automated multimodal adaptation of multimedia content
CN112580502B (en) * 2020-12-17 2024-10-01 南京航空航天大学 SICNN-based low-quality video face recognition method
CN112381215B (en) * 2020-12-17 2023-08-11 之江实验室 Self-adaptive search space generation method and device oriented to automatic machine learning
EP4016445A1 (en) * 2020-12-21 2022-06-22 Dassault Systèmes Detection of loss of details in a denoised image
EP4016446A1 (en) * 2020-12-21 2022-06-22 Dassault Systèmes Intelligent denoising
CN112508792B (en) * 2020-12-22 2024-09-10 北京航空航天大学杭州创新研究院 Online knowledge migration-based deep neural network integrated model single image super-resolution method and system
CN112634135B (en) * 2020-12-23 2022-09-13 中国地质大学(武汉) Remote sensing image super-resolution reconstruction method based on super-resolution style migration network
CN112565819B (en) * 2020-12-24 2023-04-07 新奥特(北京)视频技术有限公司 Video data processing method and device, electronic equipment and storage medium
CN112731327B (en) * 2020-12-25 2023-05-23 南昌航空大学 HRRP radar target identification method based on CN-LSGAN, STFT and CNN
CN112529828B (en) * 2020-12-25 2023-01-31 西北大学 Reference data non-sensitive remote sensing image space-time fusion model construction method
CN112598598B (en) * 2020-12-25 2023-11-28 南京信息工程大学滨江学院 Image reflected light removing method based on two-stage reflected light eliminating network
CN112598579B (en) * 2020-12-28 2024-08-27 苏州科达特种视讯有限公司 Monitoring scene-oriented image super-resolution method, device and storage medium
CN112907441B (en) * 2020-12-29 2023-05-30 中央财经大学 Space downscaling method based on super-resolution of ground water satellite image
CN112669212B (en) * 2020-12-30 2024-03-26 杭州趣链科技有限公司 Face image super-resolution reconstruction method, device, computer equipment and medium
CN112785498B (en) * 2020-12-31 2023-06-02 达科为(深圳)医疗设备有限公司 Pathological image superscore modeling method based on deep learning
US20220215232A1 (en) * 2021-01-05 2022-07-07 Nvidia Corporation View generation using one or more neural networks
US11310464B1 (en) * 2021-01-24 2022-04-19 Dell Products, Lp System and method for seviceability during execution of a video conferencing application using intelligent contextual session management
CN112967185A (en) * 2021-02-18 2021-06-15 复旦大学 Image super-resolution algorithm based on frequency domain loss function
US12045315B2 (en) * 2021-02-24 2024-07-23 Sony Group Corporation Neural network-based image-to-image translation
US11341699B1 (en) 2021-03-09 2022-05-24 Carmax Enterprise Services, Llc Systems and methods for synthetic image generation
CN112884673A (en) * 2021-03-11 2021-06-01 西安建筑科技大学 Reconstruction method for missing information between coffin chamber mural blocks of improved loss function SinGAN
CN112819731B (en) * 2021-03-19 2021-11-05 广东众聚人工智能科技有限公司 Gray scale image enhancement method, device, computer equipment and storage medium
CN112991177B (en) * 2021-03-23 2024-08-09 数量级(上海)信息技术有限公司 Infrared image super-resolution method based on antagonistic neural network
JP2022150562A (en) * 2021-03-26 2022-10-07 キヤノン株式会社 Image processing apparatus, image processing method, and program
CN113191495A (en) * 2021-03-26 2021-07-30 网易(杭州)网络有限公司 Training method and device for hyper-resolution model and face recognition method and device, medium and electronic equipment
US11271984B1 (en) 2021-03-29 2022-03-08 International Business Machines Corporation Reduced bandwidth consumption via generative adversarial networks
WO2022204868A1 (en) * 2021-03-29 2022-10-06 深圳高性能医疗器械国家研究院有限公司 Method for correcting image artifacts on basis of multi-constraint convolutional neural network
CN112991220B (en) * 2021-03-29 2024-06-28 深圳高性能医疗器械国家研究院有限公司 Method for correcting image artifact by convolutional neural network based on multiple constraints
CN113160055A (en) * 2021-04-07 2021-07-23 哈尔滨理工大学 Image super-resolution reconstruction method based on deep learning
CN113129231B (en) * 2021-04-07 2023-05-30 中国科学院计算技术研究所 Method and system for generating high-definition image based on countermeasure generation network
CN113516585B (en) * 2021-04-12 2023-04-11 中国科学院西安光学精密机械研究所 Optical remote sensing image quality improvement method based on non-pairwise
CN112801881B (en) * 2021-04-13 2021-06-22 湖南大学 High-resolution hyperspectral calculation imaging method, system and medium
CN113160101B (en) * 2021-04-14 2023-08-01 中山大学 Method for synthesizing high-simulation image
CN113160056A (en) * 2021-04-19 2021-07-23 东南大学 Deep learning-based noisy image super-resolution reconstruction method
CN113160057B (en) * 2021-04-27 2023-09-05 沈阳工业大学 RPGAN image super-resolution reconstruction method based on generation countermeasure network
CN113191949B (en) * 2021-04-28 2023-06-20 中南大学 Multi-scale super-resolution pathology image digitizing method, system and storage medium
CN112884657B (en) * 2021-05-06 2021-07-16 中南大学 Face super-resolution reconstruction method and system
CN113379597A (en) * 2021-05-19 2021-09-10 宜宾电子科技大学研究院 Face super-resolution reconstruction method
CN113269256B (en) * 2021-05-26 2024-08-27 广州密码营地信息科技有限公司 Construction method and application of MiSrc-GAN medical image model
CN113269691B (en) * 2021-05-27 2022-10-21 北京卫星信息工程研究所 SAR image denoising method for noise affine fitting based on convolution sparsity
CN113379602B (en) * 2021-06-08 2024-02-27 中国科学技术大学 Light field super-resolution enhancement method using zero sample learning
CN113269818B (en) * 2021-06-09 2023-07-25 河北工业大学 Deep learning-based seismic data texture feature reconstruction method
CN113469959B (en) * 2021-06-16 2024-07-23 北京理工大学 Countermeasure training optimization method and device based on quality defect imaging model
CN113421188B (en) * 2021-06-18 2024-01-05 广东奥普特科技股份有限公司 Method, system, device and storage medium for image equalization enhancement
CN113538263A (en) * 2021-06-28 2021-10-22 江苏威尔曼科技有限公司 Motion blur removing method, medium, and device based on improved DeblurGAN model
US11610284B2 (en) * 2021-07-09 2023-03-21 X Development Llc Enhancing generative adversarial networks using combined inputs
US20230029188A1 (en) * 2021-07-26 2023-01-26 GE Precision Healthcare LLC Systems and methods to reduce unstructured and structured noise in image data
CN113781316B (en) * 2021-07-28 2024-05-17 杭州火烧云科技有限公司 High-resolution image restoration method and restoration system based on countermeasure generation network
CN113688694B (en) * 2021-08-03 2023-10-27 上海交通大学 Method and device for improving video definition based on unpaired learning
CN113610731B (en) * 2021-08-06 2023-08-08 北京百度网讯科技有限公司 Method, apparatus and computer program product for generating image quality improvement model
CN113538246B (en) * 2021-08-10 2023-04-07 西安电子科技大学 Remote sensing image super-resolution reconstruction method based on unsupervised multi-stage fusion network
CN113538247B (en) * 2021-08-12 2022-04-15 中国科学院空天信息创新研究院 Super-resolution generation and conditional countermeasure network remote sensing image sample generation method
CN113689337B (en) * 2021-08-27 2023-09-19 华东师范大学 Ultrasonic image super-resolution reconstruction method and system based on generation countermeasure network
CN113792723B (en) * 2021-09-08 2024-01-16 浙江力石科技股份有限公司 Optimization method and system for identifying stone carving characters
CN113762277B (en) * 2021-09-09 2024-05-24 东北大学 Multiband infrared image fusion method based on Cascade-GAN
CN115841522A (en) * 2021-09-18 2023-03-24 华为技术有限公司 Method, apparatus, storage medium, and program product for determining image loss value
CN113793267B (en) * 2021-09-18 2023-08-25 中国石油大学(华东) Self-supervision single remote sensing image super-resolution method based on cross-dimension attention mechanism
CN113902617B (en) * 2021-09-27 2024-06-14 中山大学·深圳 Super-resolution method, device, equipment and medium based on reference image
CN113837945B (en) * 2021-09-30 2023-08-04 福州大学 Display image quality optimization method and system based on super-resolution reconstruction
CN114063168B (en) * 2021-11-16 2023-04-21 电子科技大学 Artificial intelligent noise reduction method for seismic signals
CN114419630A (en) * 2021-11-26 2022-04-29 王希佳 Text recognition method based on neural network search in automatic machine learning
CN114202460B (en) * 2021-11-29 2024-09-06 上海艾麒信息科技股份有限公司 Super-resolution high-definition reconstruction method, system and equipment for different damage images
EP4395329A1 (en) 2021-11-30 2024-07-03 Samsung Electronics Co., Ltd. Method for allowing streaming of video content between server and electronic device, and server and electronic device for streaming video content
CN114331821B (en) * 2021-12-29 2023-09-22 中国人民解放军火箭军工程大学 Image conversion method and system
CN114331903B (en) * 2021-12-31 2023-05-12 电子科技大学 Image restoration method and storage medium
CN116563122A (en) * 2022-01-27 2023-08-08 安翰科技(武汉)股份有限公司 Image processing method, data set acquisition method and image processing device
CN114519679B (en) * 2022-02-21 2022-10-21 安徽大学 Intelligent SAR target image data enhancement method
US12121382B2 (en) 2022-03-09 2024-10-22 GE Precision Healthcare LLC X-ray tomosynthesis system providing neural-net guided resolution enhancement and thinner slice generation
US11785262B1 (en) 2022-03-16 2023-10-10 International Business Machines Corporation Dynamic compression of audio-visual data
CN114820303A (en) * 2022-03-24 2022-07-29 南京邮电大学 Method, system and storage medium for reconstructing super-resolution face image from low-definition image
CN114792287B (en) * 2022-03-25 2024-10-15 南京航空航天大学 Medical ultrasonic image super-resolution reconstruction method based on multi-image fusion
CN114757827B (en) * 2022-03-28 2024-10-22 浙江大学 Universal real-world single-frame image super-resolution enhancement method
CN114708146B (en) * 2022-04-05 2024-11-05 西南财经大学 2S times super-resolution recovery device for JPG image data entity
CN114782247A (en) * 2022-04-06 2022-07-22 温州理工学院 Image super-resolution reconstruction method
CN114677281B (en) * 2022-04-12 2024-05-31 西南石油大学 FIB-SEM super-resolution method based on generation of countermeasure network
CN114862699B (en) * 2022-04-14 2022-12-30 中国科学院自动化研究所 Face repairing method, device and storage medium based on generation countermeasure network
US11907186B2 (en) 2022-04-21 2024-02-20 Bank Of America Corporation System and method for electronic data archival in a distributed data network
CN114972073B (en) * 2022-04-24 2024-04-30 武汉大学 Image demosaicing method for generating countermeasure network SRGAN based on super resolution
WO2023206343A1 (en) * 2022-04-29 2023-11-02 中国科学院深圳先进技术研究院 Image super-resolution method based on image pre-training strategy
CN115063293B (en) * 2022-05-31 2024-05-31 北京航空航天大学 Rock microscopic image super-resolution reconstruction method adopting generation of countermeasure network
US11689601B1 (en) 2022-06-17 2023-06-27 International Business Machines Corporation Stream quality enhancement
DE102022116464A1 (en) 2022-07-01 2024-01-04 Bayerische Motoren Werke Aktiengesellschaft Device and method for dynamic adaptation and display of relevant infotainment areas via an output unit of a vehicle
CN115205117B (en) * 2022-07-04 2024-03-08 中国电信股份有限公司 Image reconstruction method and device, computer storage medium and electronic equipment
CN114972332B (en) * 2022-07-15 2023-04-07 南京林业大学 Bamboo laminated wood crack detection method based on image super-resolution reconstruction network
CN115439361B (en) * 2022-09-02 2024-02-20 江苏海洋大学 Underwater image enhancement method based on self-countermeasure generation countermeasure network
CN115170399A (en) * 2022-09-08 2022-10-11 中国人民解放军国防科技大学 Multi-target scene image resolution improving method, device, equipment and medium
CN115936983A (en) * 2022-11-01 2023-04-07 青岛哈尔滨工程大学创新发展中心 Method and device for super-resolution of nuclear magnetic image based on style migration and computer storage medium
CN115578265B (en) * 2022-12-06 2023-04-07 中汽智联技术有限公司 Point cloud enhancement method, system and storage medium
CN116132239B (en) * 2023-01-31 2024-07-26 齐鲁工业大学(山东省科学院) OFDM channel estimation method adopting pre-activation residual error unit and super-resolution network
CN116452435A (en) * 2023-03-10 2023-07-18 支付宝(杭州)信息技术有限公司 Image high-quality harmonious model training and device
CN118781376A (en) * 2023-03-30 2024-10-15 北京字跳网络技术有限公司 Model training method, picture generating device, medium and electronic equipment
CN116723305B (en) * 2023-04-24 2024-05-03 南通大学 Virtual viewpoint quality enhancement method based on generative adversarial network
CN117036161A (en) * 2023-06-13 2023-11-10 河海大学 Dam defect recovery method based on generative adversarial network
CN116862803B (en) * 2023-07-13 2024-05-24 北京中科闻歌科技股份有限公司 Reverse image reconstruction method, device, equipment and readable storage medium
CN116663619B (en) * 2023-07-31 2023-10-13 山东科技大学 Data enhancement method, device and medium based on GAN network
CN116721316A (en) * 2023-08-11 2023-09-08 之江实验室 Model training and geomagnetic chart optimizing method, device, medium and equipment
CN117391975B (en) * 2023-12-13 2024-02-13 中国海洋大学 Efficient real-time underwater image enhancement method and model building method thereof
CN117575916B (en) * 2024-01-19 2024-04-30 青岛漫斯特数字科技有限公司 Image quality optimization method, system, equipment and medium based on deep learning
CN118333864A (en) * 2024-04-09 2024-07-12 电子科技大学 Image processing method, computer device, and storage medium
CN118200573B (en) * 2024-05-17 2024-08-23 天津大学 Image compression method, training method and device of image compression model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5052043A (en) * 1990-05-07 1991-09-24 Eastman Kodak Company Neural network with back propagation controlled through an output confidence measure

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11748846B2 (en) 2018-07-03 2023-09-05 Nanotronics Imaging, Inc. Systems, devices, and methods for providing feedback on and improving the accuracy of super-resolution imaging
US11948270B2 (en) 2018-07-03 2024-04-02 Nanotronics Imaging, Inc. Systems, devices, and methods for providing feedback on and improving the accuracy of super-resolution imaging
US11416967B2 (en) * 2020-01-03 2022-08-16 Beijing Baidu Netcom Science And Technology Co., Ltd. Video processing method, apparatus, device and storage medium
US20220262106A1 (en) * 2021-02-18 2022-08-18 Robert Bosch Gmbh Device and method for training a machine learning system for generating images
WO2023229589A1 (en) * 2022-05-25 2023-11-30 Innopeak Technology, Inc. Real-time video super-resolution for mobile devices
US20240161365A1 (en) * 2022-11-10 2024-05-16 International Business Machines Corporation Enhancing images in text documents
US12079912B2 (en) * 2022-11-10 2024-09-03 International Business Machines Corporation Enhancing images in text documents

Also Published As

Publication number Publication date
WO2018053340A1 (en) 2018-03-22
US11024009B2 (en) 2021-06-01
US20180075581A1 (en) 2018-03-15

Similar Documents

Publication Publication Date Title
US20210264568A1 (en) Super resolution using a generative adversarial network
Ledig et al. Photo-realistic single image super-resolution using a generative adversarial network
Lei et al. Coupled adversarial training for remote sensing image super-resolution
Menon et al. Pulse: Self-supervised photo upsampling via latent space exploration of generative models
Qayyum et al. Untrained neural network priors for inverse imaging problems: A survey
Li et al. Diffusion Models for Image Restoration and Enhancement--A Comprehensive Survey
Li et al. Survey of single image super‐resolution reconstruction
EP3298576B1 (en) Training a neural network
Dosovitskiy et al. Generating images with perceptual similarity metrics based on deep networks
Prakash et al. Fully unsupervised diversity denoising with convolutional variational autoencoders
Li et al. FilterNet: Adaptive information filtering network for accurate and fast image super-resolution
Zhou et al. High-frequency details enhancing DenseNet for super-resolution
Liu et al. Learning cascaded convolutional networks for blind single image super-resolution
Wang et al. Dclnet: Dual closed-loop networks for face super-resolution
Dastmalchi et al. Super-resolution of very low-resolution face images with a wavelet integrated, identity preserving, adversarial network
Krishnan et al. SwiftSRGAN-Rethinking super-resolution for efficient and real-time inference
Cherian et al. A Novel AlphaSRGAN for Underwater Image Super Resolution
Ates et al. Deep learning-based blind image super-resolution with iterative kernel reconstruction and noise estimation
Liu et al. Component semantic prior guided generative adversarial network for face super-resolution
Sarah et al. Evaluating the effect of super-resolution for automatic plant disease detection: application to potato late blight detection
Dixit et al. A Review of Single Image Super Resolution Techniques using Convolutional Neural Networks
Sharma et al. Multilevel progressive recursive dilated networks with correlation filter (MPRDNCF) for image super-resolution
Chilukuri et al. Analysing Of Image Quality Computation Models Through Convolutional Neural Network
Viriyavisuthisakul et al. Parametric regularization loss in super-resolution reconstruction
Dhawan et al. Improving resolution of images using Generative Adversarial Networks

Legal Events

Date Code Title Description
AS Assignment

Owner name: TWITTER, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SHI, WENZHE;LEDIG, CHRISTIAN;WANG, ZEHAN;AND OTHERS;SIGNING DATES FROM 20171002 TO 20171015;REEL/FRAME:056182/0221

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: MORGAN STANLEY SENIOR FUNDING, INC., MARYLAND

Free format text: SECURITY INTEREST;ASSIGNOR:TWITTER, INC.;REEL/FRAME:062079/0677

Effective date: 20221027

Owner name: MORGAN STANLEY SENIOR FUNDING, INC., MARYLAND

Free format text: SECURITY INTEREST;ASSIGNOR:TWITTER, INC.;REEL/FRAME:061804/0086

Effective date: 20221027

Owner name: MORGAN STANLEY SENIOR FUNDING, INC., MARYLAND

Free format text: SECURITY INTEREST;ASSIGNOR:TWITTER, INC.;REEL/FRAME:061804/0001

Effective date: 20221027

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION