
CN110956582B - Image processing method, device and equipment - Google Patents

Info

Publication number
CN110956582B
CN110956582B CN201811123662.2A CN201811123662A CN110956582B CN 110956582 B CN110956582 B CN 110956582B CN 201811123662 A CN201811123662 A CN 201811123662A CN 110956582 B CN110956582 B CN 110956582B
Authority
CN
China
Prior art keywords
image
neural network
resolution
value
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811123662.2A
Other languages
Chinese (zh)
Other versions
CN110956582A (en)
Inventor
关婧玮
冯万良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TCL Technology Group Co Ltd
Original Assignee
TCL Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TCL Technology Group Co Ltd
Priority to CN201811123662.2A
Publication of CN110956582A
Application granted
Publication of CN110956582B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 - Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of image processing and provides an image processing method, apparatus and device. In the embodiments of the invention, the attention focus of an attention model, expressed as a weight value, is introduced into the construction of the loss function. Improving the loss function improves the super-resolution algorithm in the neural network model, so that more detail is supplemented at positions with larger weight values in the image. This improves image quality, yields a super-resolution image that better matches the visual law of the human eye, and makes the quality of the generated super-resolution images more stable.

Description

Image processing method, device and equipment
Technical Field
The present invention belongs to the field of image processing technology, and in particular, relates to an image processing method, apparatus and device.
Background
Existing super-resolution image processing technology is widely used. In image transmission, for example, an image can be reduced in dimension (downscaled) at the input end and then raised in dimension (upscaled) at the output end by a super-resolution technique. This greatly reduces the amount of data transmitted, which increases transmission speed and relieves pressure on the network. In image storage, an image can be stored after dimension reduction, which reduces its size and relieves storage pressure. When the image is later viewed or used, it is raised in dimension by a super-resolution technique, which supplements image detail and yields a high-dimensional image.
The core of a super-resolution method is fitting a mapping from low-resolution images to high-resolution images. With the continuous development of deep learning, some super-resolution methods use neural networks to solve for the fitting function, which greatly improves the super-resolution effect. The parameters of the neural network are trained by minimizing the difference between the generated super-resolution image and the original high-resolution image, and the trained network can then generate high-quality super-resolution images. How the difference between the two images is measured is therefore very important, because it determines the quality of the generated super-resolution image.
Existing super-resolution methods mainly compare at the pixel level when measuring the difference between the super-resolution image and the original high-resolution image. For example, let the original high-resolution image be A, the low-resolution image after dimension reduction be B, and the super-resolution image generated by the super-resolution technique be C. When training a neural network, the loss function is defined in the form of the mean square error (MSE) and is calculated as follows:
$$\mathrm{MSE}(A, C) = \frac{1}{ab}\sum_{x=1}^{a}\sum_{y=1}^{b}\bigl(A(x, y) - C(x, y)\bigr)^{2}$$
where a and b are the height and width of the image, and (x, y) are the position coordinates in the image. Defining the loss function as the MSE helps to obtain a good PSNR (Peak Signal-to-Noise Ratio) value when evaluating the algorithm. However, the following problem also exists:
Comparing images at the pixel level does not fully conform to the visual law of the human eye. Different parts of an image affect human perception very differently, owing to the combined effect of factors such as position, content and color. Consider an image of a bird against a green background: the two regions certainly do not influence the eye to the same degree. For the green background, even simple bilinear or bicubic interpolation gives a result in which the eye perceives little difference. The bird is different: during super-resolution, how well its details are preserved and restored greatly affects how the eye perceives the whole picture. Conventional training of a neural network with the MSE as the loss function does not consider the differing importance of each position and instead processes all positions uniformly, so the quality of the super-resolution image generated by a method based on the MSE loss function cannot be guaranteed.
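The pixel-level comparison described above can be illustrated with a minimal sketch. It assumes grayscale images stored as NumPy arrays and a peak intensity of 255; the function names and the 4x4 test images are hypothetical and are not taken from the patent. Two reconstructions carry the same error, one in a background corner and one on the subject, and the MSE cannot tell them apart.

```python
import numpy as np

def mse(a_img, c_img):
    """Mean square error between two equally sized images."""
    a_img = np.asarray(a_img, dtype=np.float64)
    c_img = np.asarray(c_img, dtype=np.float64)
    return float(np.mean((a_img - c_img) ** 2))

def psnr(a_img, c_img, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 is an assumption (8-bit images)."""
    err = mse(a_img, c_img)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / err))

# Hypothetical original image A and two reconstructions with the same error
# magnitude placed at different positions.
A = np.full((4, 4), 100.0)
C_corner = A.copy()
C_corner[0, 0] += 10.0   # error in a "background" corner
C_center = A.copy()
C_center[2, 2] += 10.0   # error on the "subject"

mse_corner = mse(A, C_corner)
mse_center = mse(A, C_center)
print(mse_corner, mse_center)  # both values are equal: MSE ignores position
```

Because every position contributes equally, the two reconstructions also receive the same PSNR, which is the limitation the remainder of the description sets out to address.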
Disclosure of Invention
In view of the above, the embodiments of the present invention provide an image processing method, apparatus and device, so as to solve the problem of low image processing efficiency and low image quality in the existing super-resolution technology.
A first aspect of an embodiment of the present invention provides an image processing method, including:
establishing a loss function based on weight values obtained by the attention model for attention focuses at different positions in the image;
training a neural network model by using the loss function;
inputting an image to be processed into the trained neural network model, and generating a super-resolution image of the image to be processed.
A second aspect of an embodiment of the present invention provides an image processing apparatus including:
the loss function building unit is used for building a loss function based on weight values obtained by the attention model on attention focuses at different positions in the image;
the neural network model training unit is used for training the neural network model by utilizing the loss function;
the super-resolution image generation unit is used for inputting the image to be processed into the trained neural network model and generating the super-resolution image of the image to be processed.
A third aspect of an embodiment of the present invention provides an image processing apparatus including:
The image processing device comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the steps of the image processing method provided in the first aspect of the embodiment of the invention when executing the computer program.
Wherein the computer program comprises:
the loss function building unit is used for building a loss function based on weight values obtained by the attention model on attention focuses at different positions in the image;
the neural network model training unit is used for training the neural network model by utilizing the loss function;
the super-resolution image generation unit is used for inputting the image to be processed into the trained neural network model and generating the super-resolution image of the image to be processed.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of the image processing method provided in the first aspect of the embodiments of the present invention.
Wherein the computer program comprises:
the loss function building unit is used for building a loss function based on weight values obtained by the attention model on attention focuses at different positions in the image;
the neural network model training unit is used for training the neural network model by utilizing the loss function;
the super-resolution image generation unit is used for inputting the image to be processed into the trained neural network model and generating the super-resolution image of the image to be processed.
Compared with the prior art, the embodiment of the invention has the beneficial effects that: the super-resolution processing of the image to be processed can be realized by establishing a loss function based on weight values obtained by the attention model on attention focuses at different positions in the image, training a neural network model by utilizing the loss function, inputting the image to be processed into the trained neural network model, and generating the super-resolution image of the image to be processed. According to the embodiment of the invention, the attention focus of the attention model, namely the weight value, is introduced into the process of constructing the loss function, the super-resolution algorithm in the neural network model is improved by improving the loss function, and more details are supplemented for the position with larger weight value in the image, so that the image quality can be improved, the super-resolution image which accords with the human eye vision rule better is obtained, and the quality stability of the generated super-resolution image is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of an implementation of an image processing method according to an embodiment of the present invention;
FIG. 2 is a flowchart of a specific implementation of a method for calibrating an attention model according to an embodiment of the present invention;
FIG. 3 is a flowchart of a specific implementation of a method for calculating a weight value of a sample image based on an attention model according to an embodiment of the present invention;
FIG. 4 is a flowchart of a specific implementation of a method for adjusting parameters of a neural network model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to illustrate the technical scheme of the invention, the following description is made by specific examples. Referring to fig. 1, fig. 1 shows an implementation flow of an image processing method according to an embodiment of the present invention, which is described in detail below:
In step S101, a loss function is established based on the weight values obtained by the attention model for the focus of attention at different positions in the image.
In the embodiment of the invention, the attention model simulates how people attend to things. Specifically, it simulates the human visual system's prediction of attention to different objects in an image; this attention prediction gives the degree to which the human eye attends to different positions in the image.
Here, when an attention model processes an image, the computational cost of a convolutional neural network grows linearly with the number of pixels in the picture. If attention is allocated selectively, as in human vision, a series of regions, namely the regions we want to attend to, can be extracted selectively from a picture or video, and different image regions receive different weight values because the focus of attention differs. For example, when we look at a picture we can see all of it, but when we look closely and carefully our eyes focus on only a small part, and the brain concentrates mainly on the pattern in that part. In other words, the brain's attention to the whole picture is not uniform but is differentiated by certain weights. The degree of attention at different positions in the image is the visual law of the human eye; that is, the attention model simulates how the human eye attends to different positions of the image.
Here, when a sample image is input into the attention model, the weight value M_b of the sample image is obtained. Reconstructing the loss function based on the weight value M_b gives the final super-resolution image better quality, brings it closer to the visual law of the human eye, and improves the user experience.
In the embodiment of the invention, the loss function (loss function) is used for measuring the inconsistency degree of the predicted value and the true value of the model, and is a non-negative real value function, and the smaller the loss function, the higher the accuracy of the model. The loss function established in the embodiment of the invention is specifically as follows:
where A represents the first image and C represents the second image; A(x, y) represents the intensity value of the first image at position (x, y), and C(x, y) represents the intensity value of the second image at position (x, y); a and b represent the height and width of the image; m represents the super-resolution coefficient: for example, if m = 2 the image size is enlarged by 4 times, and if m = 3 the image size is enlarged by 9 times; M_b(x, y) represents the weight value of the image at position (x, y), that is, the importance of position (x, y), calculated based on the attention model, and the weight values of all position coordinates (x, y) in the image form the weight value M_b of the whole image. The first image is the original image; the second image is the image output by the neural network model during training after the low-resolution image corresponding to the original image is input into it.
It can be understood that the image output by the attention model is a weight map, and each position in the weight map has a corresponding weight value to represent the importance degree of the position, and the weight values corresponding to all positions form the weight value of the weight map. By means of the weight graph, the existing Loss function is improved, the Loss function Loss (A, C) is built, different processing is conducted on different positions in an image in an adaptive mode, and more accurate guidance can be given to a neural network model based on different weight values of different positions in the image, so that the quality of the finally obtained image is better, the visual law of human eyes is more met, and user experience is better.
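A minimal sketch of such a weighted loss follows. It assumes that the loss is the per-position squared difference between A and C, multiplied by the weight M_b(x, y) and averaged over all positions; the exact normalization used in the patent is not reproduced here, so this form, the function name and the test arrays are assumptions for illustration only.

```python
import numpy as np

def attention_weighted_loss(a_img, c_img, weight_map):
    """Weighted squared difference, averaged over all positions.

    Assumed form: mean over (x, y) of M_b(x, y) * (A(x, y) - C(x, y))**2.
    """
    a_img = np.asarray(a_img, dtype=np.float64)
    c_img = np.asarray(c_img, dtype=np.float64)
    weight_map = np.asarray(weight_map, dtype=np.float64)
    if not (a_img.shape == c_img.shape == weight_map.shape):
        raise ValueError("A, C and the weight map must have the same shape")
    return float(np.mean(weight_map * (a_img - c_img) ** 2))

# Hypothetical weight map: high attention in the centre, low at the border.
A = np.full((4, 4), 100.0)
M_b = np.full((4, 4), 0.1)
M_b[1:3, 1:3] = 1.0

C_background = A.copy()
C_background[0, 0] += 10.0   # error where attention is low
C_subject = A.copy()
C_subject[2, 2] += 10.0      # same error where attention is high

loss_background = attention_weighted_loss(A, C_background, M_b)
loss_subject = attention_weighted_loss(A, C_subject, M_b)
print(loss_background, loss_subject)
```

In this sketch the same pixel error costs more when it falls on a highly weighted position, which is how the weight map steers the network toward the regions the eye attends to.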
It can be understood that the first image and the second image in the embodiment of the invention have the same size, so that the two can be compared directly and the neural network model can be adjusted according to the difference between them. Through continued training of the neural network model, the second image keeps approaching the first image, which improves the quality of the super-resolution image finally generated. That is, the effect of the super-resolution algorithm in the neural network model is evaluated by comparing the difference between the first image and the second image, so that the super-resolution image finally generated by the neural network model is of better quality.
In step S102, the neural network model is trained using the loss function.
In an embodiment of the present invention, the neural network model includes, but is not limited to, a convolutional neural network (CNN) and a recurrent neural network (RNN).
Optionally, step S102 specifically includes:
based on the established loss function, each parameter in the neural network model is corrected and trained.
In the embodiment of the invention, the established loss function acts on each parameter in the neural network model through gradient descent, so that the parameters are corrected and trained. Specifically, based on the error back-propagation algorithm, the correction that the current error implies for each parameter is computed by gradient descent, and every parameter of the neural network model is adjusted accordingly, so that the error of the whole model keeps decreasing and the second image keeps approaching the first image. For example, if image A and image C are very similar, the loss value obtained from the loss function is small, and the correction applied to the deep neural network during error back-propagation is small. If image A and image C differ greatly, the loss value is large, and during error back-propagation the parameters in the neural network model are corrected substantially in the direction that reduces the loss. Through continued training of the neural network model, image C keeps approaching image A.
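The correction process can be sketched with a deliberately tiny stand-in for the neural network: a single trainable gain applied to an already upscaled input, so that C = gain * U. The model, the random data, the learning rate and the assumed loss form (mean of M_b * (A - C)^2) are all hypothetical; a real network would compute the same kind of gradient for every parameter by back-propagation.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.uniform(0.1, 1.0, size=(8, 8))     # hypothetical upscaled low-resolution input
M_b = rng.uniform(0.0, 1.0, size=(8, 8))   # hypothetical attention weight map
A = 2.0 * U                                # "original" image for this toy problem

def loss_and_grad(gain):
    """Weighted loss for the toy model C = gain * U and its derivative w.r.t. gain."""
    residual = A - gain * U
    loss = np.mean(M_b * residual ** 2)
    grad = np.mean(-2.0 * M_b * U * residual)
    return float(loss), float(grad)

gain = 0.0            # untrained parameter
learning_rate = 0.5   # assumed step size
history = []
for _ in range(300):
    loss, grad = loss_and_grad(gain)
    history.append(loss)
    gain -= learning_rate * grad   # step against the gradient to reduce the loss

print(gain, history[0], history[-1])
```

When A and C differ greatly the gradient is large and the parameter moves quickly; as C approaches A the loss and the corrections both shrink, matching the behaviour described above.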
Training the neural network model with the established loss function lets the model supplement more detail at image positions with larger weight values, so that image quality at those positions is better, while positions with smaller weight values receive less attention or need no added detail. The trained neural network model can therefore perform the super-resolution operation quickly and produce an image of better quality. The quality of the generated super-resolution images is more stable, and the images better match the visual law of the human eye, which improves the user experience.
In step S103, the image to be processed is input into the trained neural network model, and a super-resolution image of the image to be processed is generated.
In the embodiment of the invention, the image to be processed is an image whose pixel quality is to be improved by super-resolution technology, including but not limited to low-resolution images, old movies and surveillance images.
The image to be processed is an image which needs super-resolution processing. The super-resolution image of the image to be processed can be generated by inputting the image to be processed into the trained neural network model, the quality of the image is improved, and the neural network model is subjected to correction training based on a loss function established by the attention model, so that the quality of the super-resolution image obtained by the neural network model is very high, the obtained super-resolution image is more in line with the visual law of human eyes, and the application scene of the super-resolution technology is widened.
It can be understood that the sizes of the low-resolution image input in the neural network model and the super-resolution image output finally are inconsistent, and the super-resolution image with the same size as the original image is output after the low-resolution image is subjected to super-resolution processing by the neural network model.
In the embodiment of the invention, the super-resolution processing of the image to be processed can be realized by establishing the loss function based on the weight values obtained by the attention model for the attention focuses at different positions in the image, inputting the image to be processed into the trained neural network model after training the neural network model by utilizing the loss function, and generating the super-resolution image of the image to be processed. According to the embodiment of the invention, the attention focus of the attention model, namely the weight value, is introduced into the process of constructing the loss function, the super-resolution algorithm in the neural network model is improved by improving the loss function, and more details are supplemented for the position with larger weight value in the image, so that the image quality can be improved, the super-resolution image which accords with the human eye vision rule better is obtained, and the quality stability of the generated super-resolution image is improved.
Optionally, before step S101, the embodiment of the present invention further provides a specific implementation flow of a method for calibrating an attention model, as shown in FIG. 2, which is described in detail below:
In step S201, an attention model is constructed.
In the embodiment of the invention, attention degree of a human eye vision system on different things in an image is simulated by using an attention model, and the attention model is constructed first. The attention model is an attention model capable of predicting the attention intensity of different positions in the image, i.e. the attention model is constructed based on the difference of the attraction of the different positions of the image to the human eye.
In step S202, a sample image is acquired and input into an eye tracker to obtain the hotspot map corresponding to the sample image.
In an embodiment of the invention, the sample image is an image selected for training the attention model. Before the sample image is input into the attention model, a weight reference value of the sample image is obtained: after the hotspot map is obtained through the eye tracker, it is quantized to obtain a weight value for each position, and these values together form the weight reference value of the whole sample image.
Here, at least one set of images may be included in the sample image, and each set of images includes an original image, which is a high resolution image, for comparison with the image output by the neural network model, thereby evaluating the effect of the neural network model.
The eye tracker records the movement track of the viewer's gaze and thereby helps identify which points in an image people attend to most readily, so as to determine where the eye focuses on the image. For example, the eye dwells longer on a location that a person is interested in, so the data collected by the eye tracker contains many sampling points at or near that location, and these form a corresponding hot spot of the image. Using the eye tracker, several hot spot regions of the sample image are obtained, and the hotspot map corresponding to the sample image is formed from them.
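As a rough illustration of how gaze samples could become a hotspot map, the sketch below accumulates dwell time per pixel. The data format (row, column, dwell time in seconds), the function name and the sample values are hypothetical; the patent does not specify the eye tracker's output, and real recordings are often smoothed spatially, which is omitted here.

```python
import numpy as np

def fixations_to_hotspot_map(fixations, height, width):
    """Accumulate (row, col, dwell_seconds) gaze samples into a 2-D map."""
    hm = np.zeros((height, width), dtype=np.float64)
    for row, col, dwell in fixations:
        if 0 <= row < height and 0 <= col < width:   # ignore samples off the image
            hm[row, col] += dwell
    return hm

# Hypothetical gaze samples: the viewer returns to position (2, 3) twice.
fixations = [(2, 3, 0.40), (2, 3, 0.25), (0, 0, 0.05), (4, 1, 0.10)]
HM = fixations_to_hotspot_map(fixations, height=5, width=5)

hottest = np.unravel_index(np.argmax(HM), HM.shape)
print(hottest, HM[hottest])
```

Positions where the gaze dwells longer or returns more often accumulate larger values, which is the sense in which they form hot spots.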
In step S203, the hotspot map is quantized according to a preset quantization formula to obtain a weight reference value M_label.
In the embodiment of the invention, the hotspot map is quantized according to the following quantization formula:
$$M_{label}(x, y) = \sum\sum \frac{HM(x, y) - \min(HM)}{\max(HM) - \min(HM)}$$
where M_label(x, y) represents the weight reference value of the hotspot map HM at position (x, y), M_label represents the weight reference value of the entire hotspot map, and HM(x, y) represents the value of the hotspot map HM at position (x, y).
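The quantization step can be sketched as a min-max normalization applied at every position of the hotspot map. This reads the double summation in the formula as ranging over all positions (x, y) rather than as adding the values together, which is an interpretation; the function name and the handling of a completely flat map are likewise assumptions.

```python
import numpy as np

def quantize_hotspot_map(hm):
    """Min-max normalize a hotspot map so that every weight lies in [0, 1]."""
    hm = np.asarray(hm, dtype=np.float64)
    lo, hi = hm.min(), hm.max()
    if hi == lo:
        # Assumption: a flat map carries no attention information.
        return np.zeros_like(hm)
    return (hm - lo) / (hi - lo)

# Hypothetical 2x2 hotspot map.
HM = np.array([[0.0, 2.0],
               [4.0, 8.0]])
M_label = quantize_hotspot_map(HM)
print(M_label)
```

After normalization the least attended position has weight 0 and the most attended position has weight 1, so maps recorded with different viewing durations become comparable.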
Here, the weight reference value M_label serves mainly as the reference for calibrating the attention model. It is compared with the weight value M_b obtained by inputting the sample image directly into the attention model, and the attention model is trained according to the difference between the two to achieve calibration, so that the attention model better simulates the human visual system's prediction of attention and the simulation becomes more accurate.
In step S204, the sample image is input into the attention model to obtain the weight value M_b of the sample image.
In the embodiment of the invention, before the attention model is calibrated, it produces a weight value M_b for the input sample image. For various reasons, however, the weight value M_b may not correspond to the points that the human visual system actually attends to, so the parameters in the attention model need to be corrected based on the weight reference value M_label. The attention model can then better simulate the human visual system's prediction of attention, and the simulation becomes more accurate.
In step S205, with the weight reference value M_label as the reference, the parameters in the attention model are fine-tuned and calibrated according to the difference between the weight reference value M_label and the weight value M_b, to obtain a calibrated attention model.
In the embodiment of the invention, with the weight reference value M_label obtained from the eye tracker as the reference, the difference between M_label and the weight value M_b that the attention model produces for the current input sample image is determined, and the parameters in the attention model are fine-tuned and calibrated accordingly. This yields a calibrated attention model, so that a loss function built on the attention model can better correct the parameters of the neural network model and improve the super-resolution effect.
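The calibration idea can be sketched with a two-parameter stand-in for the attention model, M_b = w * S + c, where S is a raw saliency map. The stand-in model, the synthetic reference map, the mean-squared difference used as the calibration criterion and the step size are all assumptions; the patent does not state how the difference between M_b and M_label is measured.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.uniform(0.0, 1.0, size=(16, 16))   # hypothetical raw saliency map
M_label = 0.8 * S + 0.1                    # hypothetical eye-tracker reference

w, c = 1.0, 0.0        # parameters of the uncalibrated stand-in model
learning_rate = 0.5    # assumed step size
for _ in range(2000):
    M_b = w * S + c                        # current prediction of the model
    diff = M_b - M_label                   # difference from the reference
    grad_w = float(np.mean(2.0 * diff * S))
    grad_c = float(np.mean(2.0 * diff))
    w -= learning_rate * grad_w            # fine-tune against the difference
    c -= learning_rate * grad_c

calibration_error = float(np.mean((w * S + c - M_label) ** 2))
print(w, c, calibration_error)
```

After fine-tuning, the stand-in model reproduces the reference map, which mirrors the role of M_label as the target that the attention model is calibrated toward.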
Optionally, for a specific implementation of step S204, an embodiment of the present invention provides the flow shown in FIG. 3 for calculating a weight value of a sample image based on an attention model, which is described in detail below:
In step S301, dimension reduction processing is performed on the sample image to obtain a corresponding low-resolution image.
In the embodiment of the invention, the purpose of performing dimension reduction processing on the sample image is to obtain a low-resolution image, so that the attention model can process it quickly and obtain the corresponding weight value M_s.
In step S302, the low-resolution image is input into the attention model to obtain the weight value M_s of the low-resolution image.
In the embodiment of the invention, because the attention model does not place high demands on the pixels of the image, it can quickly obtain the weight value M_s corresponding to the low-resolution image once that image is input. That is, the weight value M_s corresponds to an estimate of the focus of attention at each position of the low-resolution image.
It will be appreciated that because the low-resolution image has fewer pixels and its size is small after compression, the number of pixel positions to process is greatly reduced, so processing is very fast. Despite this speed, the focus of attention at different positions in the image is not lost; that is, the weight value at a position (x, y) in the low-resolution image is the same as the weight value at the corresponding position of the sample image or the super-resolution image.
In step S303, based on the weight value M_s, dimension-raising processing is performed on the low-resolution image through a bicubic interpolation algorithm to obtain the weight value M_b of the corresponding high-resolution image.
In the embodiment of the invention, the bicubic interpolation algorithm is a more complex interpolation method that can produce smoother image edges than bilinear interpolation. Bicubic interpolation is commonly used in some image processing software, printer drivers and digital cameras to magnify an original image or a region of it.
Here, although the weight value at a position (x, y) in the low-resolution image is the same as that at the corresponding position of the sample image or the super-resolution image, the low-resolution image does not have the same size as the sample image or the super-resolution image; that is, it contains a different number of pixels. If the weight value M_s of the low-resolution image were used directly as the weight value M_b of the sample image, its difference from the weight reference value M_label would be very large and the final result would be inaccurate. The weight value M_s therefore needs to be raised in dimension through a bicubic interpolation algorithm to obtain the weight value M_b of the corresponding high-resolution image. The attention model is then calibrated by comparing the difference between the weight value M_b and the weight reference value M_label, so that the resulting attention model better simulates the human visual system's prediction of attention and the simulation becomes more accurate.
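The dimension-raising of the weight map can be sketched with a separable cubic convolution (the Keys kernel with a = -0.5, a common choice for bicubic interpolation). The kernel parameter, the border handling by replication, the final clipping to [0, 1] and all function names are assumptions; the patent only states that a bicubic interpolation algorithm is used.

```python
import numpy as np

def cubic_kernel(t, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is assumed here."""
    t = np.abs(t)
    near = (a + 2.0) * t**3 - (a + 3.0) * t**2 + 1.0
    far = a * t**3 - 5.0 * a * t**2 + 8.0 * a * t - 4.0 * a
    return np.where(t <= 1.0, near, np.where(t < 2.0, far, 0.0))

def interpolation_matrix(n_in, n_out):
    """Rows hold the four cubic weights that map n_in samples to n_out samples."""
    centers = (np.arange(n_out) + 0.5) * (n_in / n_out) - 0.5
    base = np.floor(centers).astype(int)
    rows = np.arange(n_out)
    weights = np.zeros((n_out, n_in))
    for offset in range(-1, 3):
        idx = base + offset
        w = cubic_kernel(centers - idx)
        # Out-of-range taps are folded onto the border sample (replication).
        np.add.at(weights, (rows, np.clip(idx, 0, n_in - 1)), w)
    return weights

def bicubic_upscale(weight_map, m):
    """Upscale a 2-D weight map by an integer factor m along both axes."""
    h, w = weight_map.shape
    return interpolation_matrix(h, h * m) @ weight_map @ interpolation_matrix(w, w * m).T

# Hypothetical low-resolution weight map M_s with attention in one corner.
M_s = np.array([[0.0, 0.0],
                [0.0, 1.0]])
M_b = np.clip(bicubic_upscale(M_s, 2), 0.0, 1.0)   # cubic kernels can overshoot
flat = bicubic_upscale(np.full((3, 3), 0.5), 2)
print(M_b.shape)
```

The upscaled map has one weight per pixel of the high-resolution image, so it can be compared position by position with M_label.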
It can be appreciated that the size of the sample image is the same as the size of the super-resolution image, so that the sample image and the super-resolution image are comparable, and the effect of the super-resolution algorithm in the neural network model can be better evaluated by comparing the difference between the sample image and the super-resolution image. That is, in the process of obtaining the weight value of the sample image through the attention model, the input and output images are two images with the same size, namely the two images have the same dimension and have a one-to-one correspondence relationship in position, and the effect of the super-resolution algorithm in the neural network model can be better evaluated by comparing the difference of the two images.
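As an illustration only, the dimension-increasing processing of step S303 might be sketched in Python as follows. The function names `cubic_kernel` and `bicubic_upsample`, the kernel coefficient a = -0.5, the pixel-centre alignment, the border clamping and the integer scale factor are all assumptions made for this sketch; the embodiment does not specify them.

```python
import math


def cubic_kernel(t, a=-0.5):
    """Cubic convolution kernel; a = -0.5 is an assumed, commonly used value."""
    t = abs(t)
    if t <= 1.0:
        return (a + 2.0) * t**3 - (a + 3.0) * t**2 + 1.0
    if t < 2.0:
        return a * t**3 - 5.0 * a * t**2 + 8.0 * a * t - 4.0 * a
    return 0.0


def bicubic_upsample(m_s, scale):
    """Upsample a 2-D weight map M_s (list of rows) by an integer factor."""
    h, w = len(m_s), len(m_s[0])
    out_h, out_w = h * scale, w * scale
    m_b = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        # Map the output pixel centre back into low-resolution coordinates.
        src_y = (y + 0.5) / scale - 0.5
        y0 = math.floor(src_y)
        for x in range(out_w):
            src_x = (x + 0.5) / scale - 0.5
            x0 = math.floor(src_x)
            value = 0.0
            # Weighted sum over the 4 x 4 neighbourhood around (src_x, src_y).
            for j in range(-1, 3):
                yy = min(max(y0 + j, 0), h - 1)  # clamp rows at the border
                wy = cubic_kernel(src_y - (y0 + j))
                for i in range(-1, 3):
                    xx = min(max(x0 + i, 0), w - 1)  # clamp columns at the border
                    wx = cubic_kernel(src_x - (x0 + i))
                    value += m_s[yy][xx] * wy * wx
            m_b[y][x] = value
    return m_b
```

With this sketch, a weight map M_s of size h x w becomes a map M_b of size (h * scale) x (w * scale), so that M_b has one weight per pixel of the sample image. Near sharp transitions the interpolated values may slightly overshoot the range of M_s, so an implementation may choose to clip them.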
Optionally, after step S303, the embodiment of the present invention further provides a specific implementation flow of a method for adjusting parameters of a neural network model, as shown in fig. 4, which is described in detail below:
In step S401, the low-resolution image is input into the neural network model, and a third image output by the neural network model is acquired.
In this embodiment, the third image is an image output after the neural network model performs super-resolution processing on the low-resolution image. For example, a high-quality high-resolution image (image a) may be subjected to a dimensionality reduction method to obtain a low-resolution image (image B), and the image B is used as an input of the neural network model. Thus, after a high-quality and high-resolution image (image C) is obtained through the neural network model, the effect of the neural network model can be evaluated by comparing the difference between the image A and the image C. Wherein image a corresponds to the original image, image B corresponds to the low resolution image corresponding to the original image, and image C corresponds to the third image.
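The preparation of a training pair (image A, image B) might be sketched as follows. The embodiment does not fix the dimensionality reduction method; block averaging and the function name `downsample` are assumptions chosen for this illustration.

```python
def downsample(image, factor):
    """Reduce image A to image B by averaging non-overlapping factor x factor blocks.

    Block averaging is an assumed reduction method, used here for illustration only.
    """
    out_h = len(image) // factor
    out_w = len(image[0]) // factor
    return [
        [
            sum(
                image[y * factor + j][x * factor + i]
                for j in range(factor)
                for i in range(factor)
            ) / (factor * factor)
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]
```

Image B produced this way would be fed to the neural network model, and the resulting image C would then be compared against image A.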
In step S402, the original image corresponding to the low-resolution image is input into the calibrated attention model to obtain the weight value M_b of the original image.
In step S403, a first value is calculated through the loss function according to the image information of the original image corresponding to the low-resolution image, the image information of the third image, and the weight value M_b, wherein the first value is the loss function value corresponding to the current neural network model.
In the embodiment of the present invention, the image information may include the number of pixels of the image and the position information of each pixel. Substituting the image information of the original image and the third image, together with the weight value M_b of the original image, into the loss function yields the first value. The first value represents the degree of similarity between the original image and the third image, and thus reflects the super-resolution effect of the current neural network model.
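One plausible form of such an attention-weighted loss is a weighted mean squared error, sketched below. This is an assumption for illustration: the function name `weighted_loss` and the normalisation by the pixel count are hypothetical, and the loss function of the embodiment may be normalised differently (for example, using the super-resolution coefficient).

```python
def weighted_loss(a_img, c_img, m_b):
    """Attention-weighted mean squared error between image A and image C.

    a_img, c_img and m_b are 2-D lists of the same size; m_b[y][x] is the
    weight at position (x, y). The exact form is assumed, not taken from
    the embodiment.
    """
    height, width = len(a_img), len(a_img[0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            diff = a_img[y][x] - c_img[y][x]
            total += m_b[y][x] * diff * diff
    return total / (height * width)
```

Under this assumed form, the same pixel error contributes more to the first value when it falls on a position with a larger weight, which is what pushes training toward restoring detail where attention is focused.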
In step S404, parameters of the neural network model are adjusted according to the first value.
In this embodiment, the parameters of the neural network model may be adjusted according to the first value. For example, the value of each parameter in the network may be corrected by a gradient descent method based on the loss function. During back propagation, the correction that the current error applies to each parameter is computed by gradient descent. Following this correction, the parameters of the whole network model are adjusted so that the error of the network model is continuously reduced.
For example, if the original image and the third image are very similar, the first value is smaller and the correction of the network parameters during the error back propagation is smaller; if the original image and the third image differ significantly, the first value is relatively large and the network parameters are substantially corrected in the direction of loss reduction during the error back propagation. Thus, by training the neural network model, the original image and the third image can be made closer.
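The behaviour described above can be illustrated with a deliberately tiny stand-in for the neural network model: a single trainable gain applied to a fixed upscaled image. The one-parameter model, the names `loss_and_grad` and `train`, the learning rate and the weighted squared-error loss are all assumptions for this sketch, not the network or loss of the embodiment.

```python
def loss_and_grad(gain, a_img, up_img, m_b):
    """Return the weighted loss and its derivative with respect to the gain.

    The toy model outputs C(x, y) = gain * up_img(x, y).
    """
    n = len(a_img) * len(a_img[0])
    loss = 0.0
    grad = 0.0
    for row_a, row_u, row_m in zip(a_img, up_img, m_b):
        for a, u, m in zip(row_a, row_u, row_m):
            diff = a - gain * u
            loss += m * diff * diff
            grad += -2.0 * m * u * diff  # d/d(gain) of m * (a - gain * u)^2
    return loss / n, grad / n


def train(a_img, up_img, m_b, gain=0.0, lr=0.05, steps=200):
    """Plain gradient descent on the single parameter; returns gain and loss history."""
    history = []
    for _ in range(steps):
        loss, grad = loss_and_grad(gain, a_img, up_img, m_b)
        history.append(loss)
        gain -= lr * grad  # large errors give large corrections, small errors small ones
    return gain, history
```

In this toy setting the gradient shrinks together with the loss, so the corrections become smaller as the output approaches the original image.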
In this embodiment, the value of the loss function is calculated from the third image, obtained by inputting the low-resolution image into the neural network model, the weight value M_b, obtained by inputting the original image corresponding to the low-resolution image into the attention model, and the original image itself. The parameters of the neural network model are then adjusted accordingly, so that the network parameters become better suited to super-resolution processing of images. In this way the neural network model is trained, and the quality of the super-resolution images it generates is improved.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation process of the embodiments of the present invention.
Corresponding to an image processing method described in the above embodiments, fig. 5 shows a schematic diagram of an image processing apparatus provided in an embodiment of the present invention, and for convenience of explanation, only a portion related to the embodiment of the present invention is shown.
Referring to fig. 5, the apparatus includes:
a loss function establishing unit 51, configured to establish a loss function based on weight values obtained by the attention model for attention focuses at different positions in the image;
a neural network model training unit 52 for training the neural network model using the loss function;
the super-resolution image generating unit 53 is configured to input an image to be processed into the trained neural network model, and generate a super-resolution image of the image to be processed.
Optionally, the loss function is:
wherein A represents the first image, C represents the second image, A(x, y) represents the intensity value of the first image at position (x, y), and C(x, y) represents the intensity value of the second image at position (x, y); a and b represent the height and width of the image; m represents the super-resolution coefficient; M_b(x, y) represents the weight value of the image at position (x, y), and the weight values at all position coordinates (x, y) in the image form the weight value M_b of the whole image. The first image is the original image; the second image is the image output by the neural network model after the low-resolution image corresponding to the original image is input into the neural network model during training.
Optionally, the apparatus further includes:
an attention model construction unit for constructing an attention model;
a heat map acquisition unit, configured to acquire a sample image and input the sample image into an eye tracker to obtain a heat map corresponding to the sample image;
a quantization processing unit, configured to quantize the heat map according to a preset quantization formula to obtain a weight reference value M_label;
a weight value calculation unit, configured to input the sample image into the attention model to obtain the weight value M_b of the sample image;
an attention model calibration unit, configured to take the weight reference value M_label as a reference value and, according to the weight reference value M_label and the weight value M_b, fine-tune and calibrate the parameters in the attention model to obtain a calibrated attention model.
Optionally, the weight value calculating unit includes:
an image dimension-reduction processing subunit, configured to perform dimension-reduction processing on the sample image to obtain a corresponding low-resolution image;
a weight value M_s calculation subunit, configured to input the low-resolution image into the attention model to obtain the weight value M_s of the low-resolution image;
a first weight value M_b calculation subunit, configured to perform, according to the weight value M_s, dimension-increasing processing on the low-resolution image through a bicubic interpolation algorithm to obtain the weight value M_b of the corresponding high-resolution image.
Optionally, the neural network model training unit 52 includes:
a third image output subunit, configured to input the low-resolution image into the neural network model, and obtain a third image output by the neural network model;
a second weight value M_b calculation subunit, configured to input the original image corresponding to the low-resolution image into the calibrated attention model to obtain the weight value M_b of the original image;
a first value calculation subunit, configured to calculate a first value through the loss function according to the image information of the original image corresponding to the low-resolution image, the image information of the third image, and the weight value M_b, wherein the first value is the loss function value corresponding to the current neural network model;
a parameter adjustment subunit, configured to adjust the parameters of the neural network model according to the first value.
Optionally, the neural network model training unit 52 is specifically configured to:
the values of each parameter in the neural network model are corrected by a gradient descent method based on the loss function.
In the embodiment of the invention, the super-resolution processing of the image to be processed can be realized by establishing the loss function based on the weight values obtained by the attention model for the attention focuses at different positions in the image, inputting the image to be processed into the trained neural network model after training the neural network model by utilizing the loss function, and generating the super-resolution image of the image to be processed. According to the embodiment of the invention, the attention focus of the attention model, namely the weight value, is introduced into the process of constructing the loss function, the super-resolution algorithm in the neural network model is improved by improving the loss function, and more details are supplemented for the position with larger weight value in the image, so that the image quality can be improved, the super-resolution image which accords with the human eye vision rule better is obtained, and the quality stability of the generated super-resolution image is improved.
Fig. 6 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention. As shown in fig. 6, the image processing apparatus 6 of this embodiment includes: a processor 60, a memory 61 and a computer program 62 stored in said memory 61 and executable on said processor 60. The processor 60, when executing the computer program 62, implements the steps of the various image processing method embodiments described above, such as steps 101 through 106 shown in fig. 1. Alternatively, the processor 60, when executing the computer program 62, implements the functions of the units in the system embodiments described above, such as the functions of the modules 51 to 53 shown in fig. 5.
Illustratively, the computer program 62 may be partitioned into one or more units that are stored in the memory 61 and executed by the processor 60 to complete the present invention. The one or more units may be a series of computer program instruction segments capable of performing a specific function for describing the execution of the computer program 62 in the image processing device 6. For example, the computer program 62 may be divided into a loss function creation unit 51, a neural network model training unit 52, and a super-resolution image generation unit 53, each of which functions specifically as follows:
a loss function establishing unit 51, configured to establish a loss function based on weight values obtained by the attention model for attention focuses at different positions in the image;
a neural network model training unit 52 for training the neural network model using the loss function;
the super-resolution image generating unit 53 is configured to input an image to be processed into the trained neural network model, and generate a super-resolution image of the image to be processed.
Optionally, the loss function is:
wherein A represents the first image, C represents the second image, A(x, y) represents the intensity value of the first image at position (x, y), and C(x, y) represents the intensity value of the second image at position (x, y); a and b represent the height and width of the image; m represents the super-resolution coefficient; M_b(x, y) represents the weight value of the image at position (x, y), and the weight values at all position coordinates (x, y) in the image form the weight value M_b of the whole image. The first image is the original image; the second image is the image output by the neural network model after the low-resolution image corresponding to the original image is input into the neural network model during training.
Optionally, the apparatus further includes:
an attention model construction unit for constructing an attention model;
a heat map acquisition unit, configured to acquire a sample image and input the sample image into an eye tracker to obtain a heat map corresponding to the sample image;
a quantization processing unit, configured to quantize the heat map according to a preset quantization formula to obtain a weight reference value M_label;
a weight value calculation unit, configured to input the sample image into the attention model to obtain the weight value M_b of the sample image;
an attention model calibration unit, configured to take the weight reference value M_label as a reference value and, according to the weight reference value M_label and the weight value M_b, fine-tune and calibrate the parameters in the attention model to obtain a calibrated attention model.
Optionally, the weight value calculating unit includes:
an image dimension-reduction processing subunit, configured to perform dimension-reduction processing on the sample image to obtain a corresponding low-resolution image;
a weight value M_s calculation subunit, configured to input the low-resolution image into the attention model to obtain the weight value M_s of the low-resolution image;
a first weight value M_b calculation subunit, configured to perform, according to the weight value M_s, dimension-increasing processing on the low-resolution image through a bicubic interpolation algorithm to obtain the weight value M_b of the corresponding high-resolution image.
Optionally, the neural network model training unit 52 includes:
a third image output subunit, configured to input the low-resolution image into the neural network model, and obtain a third image output by the neural network model;
a second weight value M_b calculation subunit, configured to input the original image corresponding to the low-resolution image into the calibrated attention model to obtain the weight value M_b of the original image;
a first value calculation subunit, configured to calculate a first value through the loss function according to the image information of the original image corresponding to the low-resolution image, the image information of the third image, and the weight value M_b, wherein the first value is the loss function value corresponding to the current neural network model;
a parameter adjustment subunit, configured to adjust the parameters of the neural network model according to the first value.
Optionally, the neural network model training unit 52 is specifically configured to:
the values of each parameter in the neural network model are corrected by a gradient descent method based on the loss function.
The image processing device 6 may include, but is not limited to, the processor 60 and the memory 61. It will be appreciated by those skilled in the art that fig. 6 is merely an example of the image processing device 6 and does not constitute a limitation of it; the device may include more or fewer components than illustrated, may combine certain components, or may use different components. For example, the terminal may further include an input-output device, a network access device, a bus, etc.
The processor 60 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 61 may be an internal storage unit of the image processing apparatus 6, such as a hard disk or a memory of the image processing apparatus 6. The memory 61 may also be an external storage device of the image processing apparatus 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the image processing apparatus 6. Further, the memory 61 may also include both an internal storage unit and an external storage device of the image processing apparatus 6. The memory 61 is used for storing the computer program and other programs and data required by the terminal. The memory 61 may also be used for temporarily storing data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the system is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not detailed or described in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed system/image processing apparatus and method may be implemented in other manners. For example, the system/image processing device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, systems or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments may be implemented by a computer program instructing related hardware, where the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or system capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (8)

1. An image processing method, the method comprising:
establishing a loss function based on weight values obtained by the attention model for attention focuses at different positions in the image;
training a neural network model by using the loss function;
inputting an image to be processed into a trained neural network model to generate a super-resolution image of the image to be processed;
before the step of establishing a loss function based on the weight values obtained by the attention model for the attention focuses at different positions in the image, the method comprises the following steps:
constructing an attention model;
acquiring a sample image, and inputting the sample image into an eye tracker to obtain a heat map corresponding to the sample image;
performing quantization processing on the heat map according to a preset quantization formula to obtain a weight reference value M_label;
inputting the sample image into the attention model to obtain a weight value M_b of the sample image;
taking the weight reference value M_label as a reference value, and fine-tuning and calibrating parameters in the attention model according to the weight reference value M_label and the weight value M_b to obtain a calibrated attention model;
wherein the inputting of the sample image into the attention model to obtain the weight value M_b of the sample image comprises:
performing dimension reduction processing on the sample image to obtain a corresponding low-resolution image;
inputting the low-resolution image into the attention model to obtain a weight value M_s of the low-resolution image;
according to the weight value M_s, performing dimension-increasing processing on the low-resolution image through a bicubic interpolation algorithm to obtain the weight value M_b of the corresponding high-resolution image.
2. The method of claim 1, wherein the loss function is:
wherein A represents the first image, C represents the second image, A(x, y) represents the intensity value of the first image at position (x, y), and C(x, y) represents the intensity value of the second image at position (x, y); a and b represent the height and width of the image; m represents the super-resolution coefficient; M_b(x, y) represents the weight value of the image at position (x, y), and the weight values at all position coordinates (x, y) in the image form the weight value M_b of the whole image; wherein the first image is the original image, and the second image is the image output by the neural network model after the low-resolution image corresponding to the original image is input into the neural network model during training.
3. The method of claim 1, wherein the training the neural network model with the loss function comprises:
inputting the low-resolution image into the neural network model to obtain a third image output by the neural network model;
inputting an original image corresponding to the low-resolution image into the calibrated attention model to obtain a weight value M_b of the original image;
calculating a first value through the loss function according to the image information of the original image corresponding to the low-resolution image, the image information of the third image and the weight value M_b, wherein the first value is a loss function value corresponding to the current neural network model;
and adjusting parameters of the neural network model according to the first numerical value.
4. The method of claim 1, wherein the training the neural network model with the loss function comprises:
the values of each parameter in the neural network model are corrected by a gradient descent method based on the loss function.
5. An image processing apparatus, characterized in that the apparatus comprises:
the loss function building unit is used for building a loss function based on weight values obtained by the attention model on attention focuses at different positions in the image;
the neural network model training unit is used for training the neural network model by utilizing the loss function;
the super-resolution image generation unit is used for inputting an image to be processed into the trained neural network model to generate a super-resolution image of the image to be processed;
the apparatus further comprises:
an attention model construction unit for constructing an attention model;
a heat map acquisition unit, configured to acquire a sample image and input the sample image into an eye tracker to obtain a heat map corresponding to the sample image;
a quantization processing unit, configured to quantize the heat map according to a preset quantization formula to obtain a weight reference value M_label;
a weight value calculation unit, configured to input the sample image into the attention model to obtain a weight value M_b of the sample image;
an attention model calibration unit, configured to take the weight reference value M_label as a reference value and, according to the weight reference value M_label and the weight value M_b, fine-tune and calibrate parameters in the attention model to obtain a calibrated attention model;
the weight value calculation unit includes:
an image dimension-reduction processing subunit, configured to perform dimension-reduction processing on the sample image to obtain a corresponding low-resolution image;
a weight value M_s calculation subunit, configured to input the low-resolution image into the attention model to obtain a weight value M_s of the low-resolution image;
a first weight value M_b calculation subunit, configured to perform, according to the weight value M_s, dimension-increasing processing on the low-resolution image through a bicubic interpolation algorithm to obtain the weight value M_b of the corresponding high-resolution image.
6. The apparatus of claim 5, wherein the loss function is:
wherein A represents the first image, C represents the second image, A(x, y) represents the intensity value of the first image at position (x, y), and C(x, y) represents the intensity value of the second image at position (x, y); a and b represent the height and width of the image; m represents the super-resolution coefficient; M_b(x, y) represents the weight value of the image at position (x, y), and the weight values at all position coordinates (x, y) in the image form the weight value M_b of the whole image; wherein the first image is the original image, and the second image is the image output by the neural network model after the low-resolution image corresponding to the original image is input into the neural network model during training.
7. An image processing device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the image processing method according to any one of claims 1 to 4 when the computer program is executed by the processor.
8. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the image processing method according to any one of claims 1 to 4.
CN201811123662.2A 2018-09-26 2018-09-26 Image processing method, device and equipment Active CN110956582B (en)


Publications (2)

Publication Number Publication Date
CN110956582A CN110956582A (en) 2020-04-03
CN110956582B CN110956582B (en) 2024-02-02

Family

ID=69964720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811123662.2A Active CN110956582B (en) 2018-09-26 2018-09-26 Image processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN110956582B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102036073B (en) * 2010-12-21 2012-11-28 西安交通大学 Method for encoding and decoding JPEG2000 image based on vision potential attention target area
CN106067161A (en) * 2016-05-24 2016-11-02 深圳市未来媒体技术研究院 Method for performing super-resolution on an image
CN106204449B (en) * 2016-07-06 2019-09-10 安徽工业大学 Single-image super-resolution reconstruction method based on a symmetric deep network
US11024009B2 (en) * 2016-09-15 2021-06-01 Twitter, Inc. Super resolution using a generative adversarial network

Also Published As

Publication number Publication date
CN110956582A (en) 2020-04-03

Similar Documents

Publication Publication Date Title
US10205896B2 (en) Automatic lens flare detection and correction for light-field images
CN110163080B (en) Face key point detection method and device, storage medium and electronic equipment
CN107172418B (en) A kind of tone scale map image quality evaluating method based on exposure status analysis
WO2008038748A1 (en) Prediction coefficient operation device and method, image data operation device and method, program, and recording medium
CN114511605B (en) Light field depth estimation method and device, electronic equipment and storage medium
CN105357519B (en) Quality objective evaluation method for three-dimensional image without reference based on self-similarity characteristic
JP2018156640A (en) Learning method and program
CN110335330A (en) Image simulation generation method and its system, deep learning algorithm training method and electronic equipment
CN112581392A (en) Image exposure correction method, system and storage medium based on bidirectional illumination estimation and fusion restoration
CN110766153A (en) Neural network model training method and device and terminal equipment
CN105282543A (en) Total blindness three-dimensional image quality objective evaluation method based on three-dimensional visual perception
CN114549383A (en) Image enhancement method, device, equipment and medium based on deep learning
CN114494347A (en) Single-camera multi-mode sight tracking method and device and electronic equipment
CN111046893A (en) Image similarity determining method and device, and image processing method and device
CN110458754B (en) Image generation method and terminal equipment
CN117726542B (en) Controllable noise removing method and system based on diffusion model
CN110956582B (en) Image processing method, device and equipment
CN113989165A (en) Image processing method, image processing device, electronic equipment and storage medium
JP7446797B2 (en) Image processing device, imaging device, image processing method and program
CN113706400A (en) Image correction method, image correction device, microscope image correction method, and electronic apparatus
CN114022529B (en) Depth perception method and device based on self-adaptive binocular structured light
CN110766631A (en) Face image modification method and device, electronic equipment and computer readable medium
CN103841327A (en) Four-dimensional light field decoding preprocessing method based on original image
CN116977190A (en) Image processing method, apparatus, device, storage medium, and program product
CN116385626A (en) Training method and device for image reconstruction model, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 516006 TCL technology building, No.17, Huifeng Third Road, Zhongkai high tech Zone, Huizhou City, Guangdong Province

Applicant after: TCL Technology Group Co.,Ltd.

Address before: 516006 Guangdong province Huizhou Zhongkai hi tech Development Zone No. nineteen District

Applicant before: TCL Corp.

REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40018686

Country of ref document: HK

GR01 Patent grant