CN113222813A - Image super-resolution reconstruction method and device, electronic equipment and storage medium
- Publication number: CN113222813A (published 2021-08-06); granted as CN113222813B (2024-02-09)
- Application number: CN202110420575.9A (filed 2021-04-19)
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06T3/4053: Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06N3/045: Combinations of networks
- G06N3/08: Learning methods
- Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses an image super-resolution reconstruction method and device, electronic equipment and a storage medium. The method comprises: acquiring an image to be processed; and inputting the image to be processed into a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed and obtains a super-resolution reconstructed image of the image to be processed. The super-resolution reconstruction model is a binary neural network model obtained in advance by training with a plurality of training samples, each training sample comprising a first-resolution image block and a corresponding second-resolution image block, where the second resolution is greater than the first resolution. Because each double-flow binary inference layer in the super-resolution reconstruction model improves binary quantization precision through a learnable quantization threshold and increases the information-carrying capacity of the model through its double-flow network structure, the performance of the super-resolution reconstruction model is significantly improved, and the reconstruction speed is increased while image reconstruction precision is maintained.
Description
Technical Field
The invention belongs to the field of machine learning, and particularly relates to an image super-resolution reconstruction method and device, electronic equipment and a storage medium.
Background
As a low-cost and easy-to-operate means of image quality enhancement, image super-resolution reconstruction is widely applied in fields such as digital media, medical imaging, satellite remote sensing, and video surveillance.
To minimize model storage and computing resources, the related art generally uses a binary neural network to implement image super-resolution reconstruction on mobile devices with limited computing performance, such as mobile phones; converting the floating-point numbers in the neural network into one-bit binary numbers reduces the storage resources consumed when the model is deployed.
However, the binary neural network in the related art contains a full-precision upsampling layer. The complex full-precision computation of this layer not only imposes a heavy computational burden on the model, but the layer's limited information-carrying capacity also degrades the quality of the reconstructed image.
Disclosure of Invention
To solve the above problems in the prior art, the invention provides an image super-resolution reconstruction method and device, an electronic device and a storage medium. The technical problem to be solved by the invention is addressed by the following technical solutions:
In a first aspect, the present invention provides an image super-resolution reconstruction method, comprising:
acquiring an image to be processed;
inputting the image to be processed into a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed and obtains a super-resolution reconstructed image of the image to be processed; the super-resolution reconstruction model is a binary neural network model obtained in advance by training with a plurality of training samples, wherein each training sample comprises a first-resolution image block and a second-resolution image block corresponding to the first-resolution image block;
the super-resolution reconstruction model comprises an input layer, a plurality of serially connected double-flow binary inference layers, an up-sampling layer and an output layer, wherein the double-flow binary inference layers comprise a first branch network and a second branch network, the first branch network comprises a first preset threshold value and a first preset weight, the second branch network comprises a second preset threshold value and a second preset weight, and the first preset weight and the second preset weight are both binary representations.
In an embodiment of the present invention, the step of inputting the image to be processed to a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed, and obtains a super-resolution reconstructed image of the image to be processed includes:
inputting the image to be processed into a super-resolution reconstruction model so that the input layer performs full-precision convolution on the image to be processed and outputs a first feature map;
inputting the first feature map into the multiple serial double-flow binary inference layers, wherein the output of each level of double-flow binary inference layer is used as the input of the next level of double-flow binary inference layer, and the output of the last level of double-flow binary inference layer is used as a second feature map;
inputting the second feature map into the up-sampling layer, so that the up-sampling layer performs up-sampling on the second feature map to obtain a third feature map;
and inputting the third feature map into the output layer, so that the output layer performs full-precision convolution on the third feature map, and a super-resolution reconstruction image of the image to be processed is obtained.
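To make this data flow concrete, the following is a minimal PyTorch sketch of the four stages just listed; the channel width (64), the magnification s = 2, and the nearest-neighbour stand-ins used for the binary inference layers and the binary upsampling layer are illustrative assumptions, not details taken from this disclosure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy 48x48 RGB input walked through the four stages listed above.
x = torch.randn(1, 3, 48, 48)            # image to be processed
head = nn.Conv2d(3, 64, 3, padding=1)    # input layer (full-precision convolution)
first = head(x)                          # first feature map: 1 x 64 x 48 x 48

# The serially connected double-flow binary inference layers would refine `first`
# here; the second feature map keeps the same spatial size.
second = first

s = 2                                    # assumed magnification
# Stand-in for the binary upsampling layer: any 2x upsampler yields the third map.
third = F.interpolate(second, scale_factor=s, mode="nearest")

tail = nn.Conv2d(64, 3, 3, padding=1)    # output layer (full-precision convolution)
sr = tail(third)                         # super-resolution image: 1 x 3 x 96 x 96
print(first.shape, second.shape, third.shape, sr.shape)
```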
In an embodiment of the present invention, the step of inputting the first feature map into the multiple serially connected double-flow binary inference layers, in which the output of each double-flow binary inference layer serves as the input of the next double-flow binary inference layer and the output of the last double-flow binary inference layer serves as the second feature map, includes:
judging whether a first sub-output and a second sub-output of a previous-stage double-flow binary inference layer are received;
if so, determining a first sub-input of a first branch network and a second sub-input of a second branch network in the current double-flow binary inference layer according to the first feature map, the first sub-output and the second sub-output of the previous-stage double-flow binary inference layer, and two preset current weights;
if not, determining a first sub-input of a first branch network and a second sub-input of a second branch network in the current double-flow binary inference layer according to the first feature map and two preset current weights;
after the first sub-input and the second sub-input are obtained, causing the first branch network to binarize the first sub-input according to the first preset threshold and convolve it with the first preset weight to obtain a first sub-output of the current double-flow binary inference layer, and causing the second branch network to binarize the second sub-input according to the second preset threshold and convolve it with the second preset weight to obtain a second sub-output of the current double-flow binary inference layer;
determining an updated first feature map according to the first sub-input and the second sub-input;
judging whether the current double-flow binary inference layer is the last-stage double-flow binary inference layer; if not, inputting the updated first feature map, the first sub-output and the second sub-output into the next-stage double-flow binary inference layer;
and if so, carrying out weighted summation of the first sub-output, the second sub-output and the updated first feature map to obtain the second feature map output by the last-stage double-flow binary inference layer.
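The sketch below mirrors this per-layer data flow under stated assumptions: each branch mixes the incoming feature map with the corresponding previous sub-output using a current weight, binarizes its sub-input around its own threshold, and convolves it. The exact mixing rule, the plain sum used for the updated feature map, and the omission of weight binarization are simplifying assumptions, and all class and parameter names are illustrative.

```python
import torch
import torch.nn as nn

def binarize(x, threshold):
    # Hard sign around a learnable threshold: +1 at or above it, -1 below.
    # (Gradient handling for this step is sketched separately below.)
    return torch.where(x >= threshold, torch.ones_like(x), -torch.ones_like(x))

class DoubleFlowBinaryLayer(nn.Module):
    """Sketch of one double-flow binary inference layer following the steps above."""
    def __init__(self, channels=64):
        super().__init__()
        self.t1 = nn.Parameter(torch.zeros(1))        # first preset threshold
        self.t2 = nn.Parameter(torch.zeros(1))        # second preset threshold
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.alpha = nn.Parameter(torch.tensor(0.5))  # current weight for branch 1
        self.beta = nn.Parameter(torch.tensor(0.5))   # current weight for branch 2

    def forward(self, feat, prev_out1=None, prev_out2=None):
        if prev_out1 is None:                     # first-stage layer: feature map only
            in1, in2 = self.alpha * feat, self.beta * feat
        else:                                     # later stages: mix in previous sub-outputs
            in1 = self.alpha * prev_out1 + (1 - self.alpha) * feat
            in2 = self.beta * prev_out2 + (1 - self.beta) * feat
        out1 = self.conv1(binarize(in1, self.t1)) # branch 1: binarize, then convolve
        out2 = self.conv2(binarize(in2, self.t2)) # branch 2: binarize, then convolve
        new_feat = in1 + in2                      # updated first feature map (assumed plain sum)
        return new_feat, out1, out2
```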
In an embodiment of the present invention, the step of inputting the second feature map into the upsampling layer so that the upsampling layer upsamples the second feature map includes:
inputting the third feature map into the upsampling layer, so that the upsampling layer performs neighborhood offset prediction according to the following formula:
$\hat{o}_r = W_r \otimes Q(a^l) + b_r, \quad r = 1, 2, \ldots, s^2$
where W_r and b_r respectively denote the weight and bias of the r-th of the s^2 parallel preset convolutions, a^l denotes said third feature map, Q(·) denotes a quantization operation, ô_r denotes the neighborhood offset prediction, and s denotes the magnification applied by the super-resolution reconstruction model to the image to be processed, with s > 1;
And summing the neighborhood offset prediction result and the third feature map respectively to obtain a fourth feature map, and performing sub-pixel reconstruction on the fourth feature map to obtain a super-resolution reconstructed image of the image to be processed.
In an embodiment of the present invention, before the step of summing the neighborhood offset prediction results and the third feature map, the method further includes:
and copying each pixel value in the third feature map into s^2 copies, so that the size of the copied third feature map is s times that of the image to be processed.
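As a small illustration of this copying step, nearest-neighbour expansion replicates every pixel value s^2 times so that the copied map is s times as large in height and width:

```python
import torch
import torch.nn.functional as F

# Replicating each pixel value s^2 times via nearest-neighbour expansion.
s = 2
third = torch.arange(4.0).reshape(1, 1, 2, 2)       # toy 2x2 feature map
copied = F.interpolate(third, scale_factor=s, mode="nearest")
print(copied.shape)   # torch.Size([1, 1, 4, 4]); each value appears s*s times
```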
In an embodiment of the present invention, the super-resolution reconstruction model is obtained by training using the following steps:
obtaining a plurality of training samples;
inputting a preset number of first-resolution image blocks into a neural network model to be trained, wherein the neural network model to be trained is a preset initial binary neural network model;
determining a loss value according to a super-resolution reconstruction result output by the neural network to be trained, a second resolution image block corresponding to each first resolution image block and a preset loss function;
judging whether the neural network model to be trained converges according to the loss value; if the model is converged, the neural network model to be trained is a trained super-resolution reconstruction model;
and if not, adjusting the network parameters of the neural network to be trained, and returning to the step of inputting the preset number of first resolution image blocks into the neural network model to be trained.
In one embodiment of the present invention, the predetermined loss function is:
wherein M is the number of first-resolution image blocks, m denotes the m-th input first-resolution image block, α denotes the learning rate of the neural network to be trained, y_m denotes the second-resolution image block corresponding to the m-th input first-resolution image block, ŷ_m denotes the super-resolution reconstruction result of the m-th first-resolution image block output by the neural network model to be trained, and θ is a network parameter of the neural network model to be trained.
In a second aspect, the present invention provides an image super-resolution reconstruction apparatus, comprising:
the acquisition module is used for acquiring an image to be processed;
the reconstruction module is used for inputting the image to be processed into a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed and obtains a super-resolution reconstructed image of the image to be processed; the super-resolution reconstruction model is a binary neural network model obtained in advance by training with a plurality of training samples, wherein each training sample comprises a first-resolution image block and a second-resolution image block corresponding to the first-resolution image block;
the super-resolution reconstruction model comprises an input layer, a plurality of serially connected double-flow binary inference layers, an up-sampling layer and an output layer, wherein the double-flow binary inference layers comprise a first branch network and a second branch network, the first branch network comprises a first preset threshold value and a first preset weight, the second branch network comprises a second preset threshold value and a second preset weight, and the first preset weight and the second preset weight are both binary representations.
In a third aspect, the present invention provides an electronic device, including a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus;
a memory for storing a computer program;
a processor for executing a program stored in the memory to perform the method steps of any of the first aspect.
In a fourth aspect, the invention further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method steps of any one of the first aspects.
The invention has the beneficial effects that:
the invention provides an image super-resolution reconstruction method, an image super-resolution reconstruction device, an electronic device and a storage medium, wherein each double-current binary inference layer in a super-resolution reconstruction model can improve binary quantization precision through a quantization threshold value, and the information bearing capacity of the super-resolution reconstruction model is improved through a double-current network structure, so that the performance of the super-resolution reconstruction model can be obviously improved, meanwhile, the reconstruction speed can be improved on the basis of ensuring the image reconstruction precision, and the quality of a super-resolution reconstruction image is further ensured.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
Fig. 1 is a schematic flow chart of a super-resolution image reconstruction method according to an embodiment of the present invention;
FIG. 2 is another schematic flow chart of a super-resolution image reconstruction method according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a double-flow binary inference layer according to an embodiment of the present invention;
fig. 4 is another schematic structural diagram of a double-flow binary inference layer according to an embodiment of the present invention;
fig. 5 is another schematic structural diagram of a double-flow binary inference layer according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an upsampling layer provided in an embodiment of the present invention;
FIG. 7 is a schematic diagram of a training process of a neural network model to be trained according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an image super-resolution reconstruction apparatus according to an embodiment of the present invention;
fig. 9 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
Fig. 1 is a schematic flow chart of an image super-resolution reconstruction method according to an embodiment of the present invention. As shown in fig. 1, the image super-resolution reconstruction method provided by the embodiment of the present invention includes:
S101, acquiring an image to be processed;
S102, inputting the image to be processed into a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed and obtains a super-resolution reconstructed image of the image to be processed; the super-resolution reconstruction model is a binary neural network model obtained in advance by training with a plurality of training samples, wherein each training sample comprises a first-resolution image block and a second-resolution image block corresponding to the first-resolution image block;
the super-resolution reconstruction model comprises an input layer, a plurality of serially connected double-flow binary inference layers, an upsampling layer and an output layer, wherein each double-flow binary inference layer comprises a first branch network and a second branch network, the first branch network comprises a first preset threshold and a first preset weight, the second branch network comprises a second preset threshold and a second preset weight, and the first preset weight and the second preset weight are both binary representations.
It should be noted that, in this embodiment, the image to be processed is provided by a user and may be a captured photograph or a frame extracted from a video; this is not limited in this application.
In this embodiment, the super-resolution reconstruction model is a binary neural network model obtained in advance by training with a plurality of training samples, each of which comprises a first-resolution image block and a second-resolution image block. Since the second resolution is greater than the first resolution, the first-resolution image block can be obtained by downsampling the second-resolution image block. For example, if the height and width of the second-resolution image block are H and W, the height and width of the first-resolution image block obtained after downsampling are H/s and W/s, meaning that the height and width of the second-resolution image block are s times those of the corresponding first-resolution image block.
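A minimal sketch of how such a training pair could be built is shown below; bicubic interpolation is assumed for the downsampling, which the disclosure does not specify, and the function name is illustrative.

```python
import torch
import torch.nn.functional as F

def make_training_pair(hr_block, s=2):
    """Build one (first-resolution, second-resolution) training pair.

    hr_block: second-resolution block of shape (1, C, H, W); the returned
    first-resolution block has shape (1, C, H//s, W//s).  Bicubic downsampling
    is an assumption; the patent only states that the low-resolution block is
    obtained by downsampling the high-resolution block by a factor of s."""
    lr_block = F.interpolate(hr_block, scale_factor=1.0 / s,
                             mode="bicubic", align_corners=False)
    return lr_block, hr_block

lr, hr = make_training_pair(torch.randn(1, 3, 96, 96), s=2)
print(lr.shape, hr.shape)   # (1, 3, 48, 48) and (1, 3, 96, 96)
```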
Illustratively, the super-resolution reconstruction model provided by this embodiment comprises an input layer, a plurality of serially connected double-flow binary inference layers, an upsampling layer, and an output layer. Each double-flow binary inference layer comprises a first branch network with a first preset threshold and a first preset weight, and a second branch network with a second preset threshold and a second preset weight, where both preset weights are binary representations. The double-flow binary inference layer can therefore improve binary quantization precision through its quantization thresholds and increase the information-carrying capacity of the super-resolution reconstruction model through its double-flow network structure, so that the performance of the model is significantly improved, the reconstruction speed is increased while image reconstruction precision is maintained, and the quality of the super-resolution reconstructed image is ensured.
Fig. 2 is another schematic flow chart of the image super-resolution reconstruction method according to the embodiment of the present invention. As shown in fig. 2, in step S102, the step of inputting the image to be processed into the super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed, and obtains a super-resolution reconstructed image of the image to be processed includes:
S201, inputting the image to be processed into the super-resolution reconstruction model, so that the input layer performs full-precision convolution on the image to be processed and outputs a first feature map;
S202, inputting the first feature map into the plurality of serially connected double-flow binary inference layers, wherein the output of each double-flow binary inference layer is used as the input of the next double-flow binary inference layer, and the output of the last double-flow binary inference layer is used as a second feature map;
S203, inputting the second feature map into the upsampling layer, so that the upsampling layer upsamples the second feature map to obtain a third feature map;
S204, inputting the third feature map into the output layer, so that the output layer performs full-precision convolution on the third feature map to obtain a super-resolution reconstructed image of the image to be processed.
Fig. 3 to fig. 5 are schematic structural diagrams of the double-flow binary inference layer according to the embodiment of the present invention. In step S202, after the first feature map is input into a double-flow binary inference layer, it is determined whether the first sub-output and the second sub-output of a previous-stage double-flow binary inference layer have been received; if not, the layer is the first-stage double-flow binary inference layer. Specifically, referring to fig. 3, in this case the first sub-input of the first branch network and the second sub-input of the second branch network in the first-stage double-flow binary inference layer are calculated from the first feature map and the two preset current weights α1 and β1.
Conversely, if the first sub-output and the second sub-output of a previous-stage double-flow binary inference layer are received, the layer is not the first-stage double-flow binary inference layer. Taking the second-stage double-flow binary inference layer as an example, referring to fig. 4, it has three inputs: the first feature map updated by the first-stage double-flow binary inference layer, the first sub-input, and the second sub-input. The first sub-input is obtained by weighting the first sub-output of the first-stage double-flow binary inference layer and the first feature map updated by the first stage according to the preset current weight α2, and the second sub-input is obtained by weighting the second sub-output of the first-stage double-flow binary inference layer and the first feature map updated by the first stage according to the preset current weight β2.
Meanwhile, as shown in fig. 4, the second-stage double-flow binary inference layer has three outputs: the first feature map updated by the second-stage double-flow binary inference layer, the first sub-output of the first branch network, and the second sub-output of the second branch network. Since the second sub-output of the second branch network is calculated in the same way as the first sub-output, the calculation is not repeated here.
Referring to fig. 5, if the current double-flow binary inference layer is the last stage, it has three inputs and one output. The three inputs are the first feature map updated by the previous stage, the first sub-input, and the second sub-input, where the first sub-input is obtained by weighting the first sub-output of the previous-stage double-flow binary inference layer and the first feature map updated by the previous stage according to the preset current weight αn, and the second sub-input is obtained by weighting the second sub-output of the previous-stage double-flow binary inference layer and the first feature map updated by the previous stage according to the preset current weight βn, with n denoting the number of serially connected double-flow binary inference layers. The single output of the last-stage double-flow binary inference layer is the second feature map, obtained by weighted summation of the first sub-output, the second sub-output, and the first feature map.
It should be noted that, in this embodiment, the first branch network and the second branch network of each double-flow binary inference layer binarize the first sub-input and the second sub-input, respectively, through a quantization function with a learnable threshold, and then perform binary convolution with the first preset weight and the second preset weight, respectively.
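A possible form of such a learnable-threshold quantization function is sketched below; the straight-through gradient used here is one common choice and is not necessarily the estimator adopted in this disclosure, and the class name is illustrative.

```python
import torch
import torch.nn as nn

class LearnableThresholdSign(nn.Module):
    """Binarization with a learnable quantization threshold (sketch).

    Values at or above the threshold map to +1, the rest to -1."""
    def __init__(self):
        super().__init__()
        self.threshold = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        shifted = x - self.threshold
        hard = torch.where(shifted >= 0,
                           torch.ones_like(shifted), -torch.ones_like(shifted))
        soft = torch.clamp(shifted, -1.0, 1.0)
        # Forward pass uses the hard sign; backward pass follows the clipped input.
        return soft + (hard - soft).detach()
```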
Optionally, in this embodiment, the original high-resolution neighborhood value prediction task is converted into a task of predicting a high-resolution offset through residual learning, then the prediction process of each offset in the high-resolution neighborhood is allocated to an independent convolution module (a first preset convolution or a second preset convolution), and the original complex upsampling mapping process is implemented through binary convolution. In step S203, the second feature map is input to the upsampling layer, so that the upsampling layer upsamples the second feature map, which includes a neighborhood offset prediction stage and a sub-pixel reconstruction stage.
Fig. 6 is a schematic structural diagram of the upsampling layer provided in an embodiment of the present invention. Specifically, as shown in fig. 6, in the neighborhood offset prediction stage, the upsampling layer performs neighborhood offset prediction according to the following formula:
in the formula, WrAnd brRespectively representing r parallel preset convolution weights and preset offsets, alr representing the third feature map, Q () representing a quantization operation,represents r sets of neighborhood offset predictors, r being 1, 2, …, s2Wherein s represents the magnification of the super-resolution reconstruction model to the image to be processed, and s>1。
In the sub-pixel reconstruction stage, the upsampling layer sums each neighborhood offset prediction with the third feature map according to the following formula to obtain a fourth feature map, and performs sub-pixel reconstruction on the fourth feature map to obtain the super-resolution reconstructed image of the image to be processed.
$y = PS\left(C(\hat{o}_1 + a^l, \hat{o}_2 + a^l, \ldots, \hat{o}_{s^2} + a^l)\right)$
where ô_r denotes the r-th set of neighborhood offset predictions, C(·) denotes stitching by a concatenation function, and PS(·) denotes the upsampling process performed by a sub-pixel reconstruction operation.
Optionally, each pixel value in the third feature map is copied s^2 times, so that the size of the copied third feature map is s times that of the image to be processed. It should be understood that fig. 6 only exemplifies the case in which one pixel is copied into 2^2 pixels, i.e. the height and width of the copied third feature map are each 2 times those of the image to be processed; in other embodiments of the present application, the magnification of the image to be processed may be arbitrary.
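The following sketch puts the neighborhood offset prediction and the sub-pixel reconstruction together; the use of full-precision convolutions for the offset branches and the channel ordering fed to the pixel-shuffle step are simplifications, and the class name OffsetUpsampler is illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OffsetUpsampler(nn.Module):
    """Sketch of the upsampling layer described above.

    s*s parallel convolutions predict the neighborhood offsets, each offset is
    added back to the input feature map (residual learning), and a sub-pixel
    (pixel-shuffle) step assembles the upscaled output.  Binarization of these
    convolutions is omitted for brevity."""
    def __init__(self, channels=64, s=2):
        super().__init__()
        self.s = s
        # r = 1..s^2 parallel preset convolutions (W_r, b_r)
        self.offsets = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(s * s)]
        )

    def forward(self, feat):
        # Each branch predicts one offset map and sums it with the feature map.
        branches = [conv(feat) + feat for conv in self.offsets]
        stacked = torch.cat(branches, dim=1)        # concatenation C(...)
        return F.pixel_shuffle(stacked, self.s)     # sub-pixel reconstruction PS(...)

up = OffsetUpsampler(channels=64, s=2)
print(up(torch.randn(1, 64, 48, 48)).shape)         # torch.Size([1, 64, 96, 96])
```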
Fig. 7 is a schematic diagram of a training process of a neural network model to be trained according to an embodiment of the present invention. Referring to fig. 7, the super-resolution reconstruction model in the present embodiment can be obtained by training through the following steps:
S701, obtaining a plurality of training samples;
S702, inputting a preset number of first-resolution image blocks into a neural network model to be trained, wherein the neural network model to be trained is a preset initial binary neural network model;
S703, determining a loss value according to the super-resolution reconstruction results output by the neural network to be trained, the second-resolution image block corresponding to each first-resolution image block, and a preset loss function;
S704, judging whether the neural network model to be trained has converged according to the loss value; if it has converged, executing step S705, in which the neural network model to be trained is taken as the trained super-resolution reconstruction model;
if not, executing step S706, adjusting the network parameters of the neural network to be trained, and returning to the step of inputting the preset number of first resolution image blocks into the neural network model to be trained.
Specifically, there are two realizable ways to judge whether the neural network model to be trained has converged. In the first, the neural network model to be trained is considered to have converged, and training ends, when the number of training iterations of the neural network to be trained reaches a preset number of iterations.
In the second, a loss value is calculated according to the preset loss function; when the loss value is less than or equal to a preset error value, the neural network model to be trained is considered to have converged and training ends. Optionally, the preset loss function is:
wherein M is the number of input first-resolution image blocks, m denotes the m-th input first-resolution image block, α denotes the learning rate of the neural network to be trained, y_m denotes the second-resolution image block corresponding to the m-th input first-resolution image block, and ŷ_m denotes the super-resolution reconstruction result of the m-th first-resolution image block output by the neural network model to be trained.
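A training iteration consistent with this description might look as follows; since the exact loss formula is not reproduced here, a mean L1 distance between the reconstructions and the second-resolution blocks is assumed as a typical super-resolution objective, and the function name is illustrative.

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, lr_blocks, hr_blocks):
    """One training iteration over a mini-batch of M first-resolution blocks.

    A mean L1 loss is assumed; the learning rate is carried by the optimizer."""
    optimizer.zero_grad()
    sr_blocks = model(lr_blocks)              # reconstruction for every block in the batch
    loss = F.l1_loss(sr_blocks, hr_blocks)    # averaged over the M blocks
    loss.backward()                           # gradients flow through the estimator sketched below
    optimizer.step()                          # update the network parameters
    return loss.item()
```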
Illustratively, if the width and height of the image to be processed are both 48, the width and height of the super-resolution reconstructed image output by the trained super-resolution reconstruction model are both 48 × s.
It should be noted that, because the quantization operation in the binary neural network model is not differentiable, this embodiment may use a higher-order piecewise estimation function when calculating the gradient of the neural network model to be trained:
where x is the first sub-input or the second sub-input of a double-flow binary inference layer, or a network parameter of the neural network to be trained.
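One plausible reading of such a higher-order piecewise estimator is the piecewise-quadratic approximation of the sign function popularized by Bi-Real Net, sketched below as an assumption rather than the exact function used in this disclosure.

```python
import torch

class PiecewiseSign(torch.autograd.Function):
    """Sign quantization with a piecewise-polynomial gradient estimator (sketch)."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Derivative of: 2x + x^2 on [-1, 0), 2x - x^2 on [0, 1), 0 elsewhere.
        grad = torch.zeros_like(x)
        grad = torch.where((x >= -1) & (x < 0), 2 + 2 * x, grad)
        grad = torch.where((x >= 0) & (x < 1), 2 - 2 * x, grad)
        return grad_output * grad

x = torch.randn(4, requires_grad=True)
y = PiecewiseSign.apply(x)
y.sum().backward()          # gradients follow the piecewise estimate rather than being zero
```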
Fig. 8 is a schematic structural diagram of an image super-resolution reconstruction apparatus according to an embodiment of the present invention. As shown in fig. 8, based on the same inventive concept, an image super-resolution reconstruction apparatus according to an embodiment of the present invention includes:
an obtaining module 810, configured to obtain an image to be processed;
the reconstruction module 820 is used for inputting the image to be processed into the super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed and obtains a super-resolution reconstructed image of the image to be processed; the super-resolution reconstruction model comprises the following steps: the method comprises the steps that a binary neural network model obtained by training a plurality of training samples is used in advance, wherein each training sample comprises a first-resolution image block and a second-resolution image block corresponding to the first-resolution image block;
the super-resolution reconstruction model comprises an input layer, a plurality of serially connected double-flow binary inference layers, an upper sampling layer and an output layer, wherein the double-flow binary inference layers comprise a first branch network and a second branch network, the first branch network comprises a first preset threshold value and a first preset weight, the second branch network comprises a second preset threshold value and a second preset weight, and the first preset weight and the second preset weight are both binary representations.
In the image super-resolution reconstruction device provided by the invention, the double-flow binary inference layer in the super-resolution reconstruction model can improve the binary quantization precision through the quantization threshold value, and the information bearing capacity of the super-resolution reconstruction model is improved by using the double-flow network structure, so that the performance of the super-resolution reconstruction model can be obviously improved, the reconstruction speed can be improved on the basis of ensuring the image reconstruction precision, and the quality of the super-resolution reconstruction image is further ensured.
Fig. 9 is a schematic diagram of an electronic device according to an embodiment of the present invention. An embodiment of the present invention further provides an electronic device, as shown in fig. 9, which includes a processor 901, a communication interface 902, a memory 903, and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 complete mutual communication through the communication bus 904,
a memory 903 for storing computer programs;
the processor 901 is configured to implement the following steps when executing the program stored in the memory 903:
acquiring an image to be processed;
inputting the image to be processed into a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed and obtains a super-resolution reconstructed image of the image to be processed; the super-resolution reconstruction model is a binary neural network model obtained in advance by training with a plurality of training samples, wherein each training sample comprises a first-resolution image block and a second-resolution image block corresponding to the first-resolution image block;
the super-resolution reconstruction model comprises an input layer, a plurality of serially connected double-flow binary inference layers, an up-sampling layer and an output layer, wherein the double-flow binary inference layers comprise a first branch network and a second branch network, the first branch network comprises a first preset threshold value and a first preset weight, the second branch network comprises a second preset threshold value and a second preset weight, and the first preset weight and the second preset weight are both binary representations.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
The method provided by the embodiment of the invention can be applied to electronic equipment. Specifically, the electronic device may be: desktop computers, laptop computers, intelligent mobile terminals, servers, and the like. Without limitation, any electronic device that can implement the present invention is within the scope of the present invention.
For the apparatus/electronic device/storage medium embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment.
It should be noted that the apparatus, the electronic device and the storage medium according to the embodiments of the present invention are an apparatus, an electronic device and a storage medium to which the above image super-resolution reconstruction method is applied, and all the embodiments of the image super-resolution reconstruction method are applicable to the apparatus, the electronic device and the storage medium, and can achieve the same or similar beneficial effects.
By applying the terminal equipment provided by the embodiment of the invention, proper nouns and/or fixed phrases can be displayed for a user to select, so that the input time of the user is reduced, and the user experience is improved.
The terminal device exists in various forms including but not limited to:
(1) a mobile communication device: such devices are characterized by mobile communications capabilities and are primarily targeted at providing voice, data communications. Such terminals include: smart phones (e.g., iphones), multimedia phones, functional phones, and low-end phones, among others.
(2) Ultra mobile personal computer device: the equipment belongs to the category of personal computers, has calculation and processing functions and generally has the characteristic of mobile internet access. Such terminals include: PDA, MID, and UMPC devices, etc., such as ipads.
(3) A portable entertainment device: such devices can display and play multimedia content. This type of device comprises: audio, video players (e.g., ipods), handheld game consoles, electronic books, and smart toys and portable car navigation devices.
(4) And other electronic devices with data interaction functions.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples described in this specification can be combined and combined by those skilled in the art.
While the present application has been described in connection with various embodiments, other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed application, from a review of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the word "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus (device), or computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "module" or "system. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. A computer program stored/distributed on a suitable medium supplied together with or as part of other hardware, may also take other distributed forms, such as via the Internet or other wired or wireless telecommunication systems.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.
Claims (10)
1. An image super-resolution reconstruction method is characterized by comprising the following steps:
acquiring an image to be processed;
inputting the image to be processed into a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed and obtains a super-resolution reconstructed image of the image to be processed; the super-resolution reconstruction model is a binary neural network model obtained in advance by training with a plurality of training samples, wherein each training sample comprises a first-resolution image block and a second-resolution image block corresponding to the first-resolution image block;
the super-resolution reconstruction model comprises an input layer, a plurality of serially connected double-flow binary inference layers, an up-sampling layer and an output layer, wherein the double-flow binary inference layers comprise a first branch network and a second branch network, the first branch network comprises a first preset threshold value and a first preset weight, the second branch network comprises a second preset threshold value and a second preset weight, and the first preset weight and the second preset weight are both binary representations.
2. The super-resolution reconstruction method according to claim 1, wherein the step of inputting the image to be processed into a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed, and obtains a super-resolution reconstructed image of the image to be processed comprises:
inputting the image to be processed into a super-resolution reconstruction model so that the input layer performs full-precision convolution on the image to be processed and outputs a first feature map;
inputting the first feature map into the multiple serial double-flow binary inference layers, wherein the output of each level of double-flow binary inference layer is used as the input of the next level of double-flow binary inference layer, and the output of the last level of double-flow binary inference layer is used as a second feature map;
inputting the second feature map into the up-sampling layer, so that the up-sampling layer performs up-sampling on the second feature map to obtain a third feature map;
and inputting the third feature map into the output layer, so that the output layer performs full-precision convolution on the third feature map, and a super-resolution reconstruction image of the image to be processed is obtained.
3. The image super-resolution reconstruction method according to claim 2, wherein the step of inputting the first feature map into the plurality of serially connected double-flow binary inference layers, in which the output of each double-flow binary inference layer serves as the input of the next double-flow binary inference layer and the output of the last double-flow binary inference layer serves as the second feature map, comprises:
judging whether a first sub-output and a second sub-output of a previous-stage double-flow binary inference layer are received;
if so, determining a first sub-input of a first branch network and a second sub-input of a second branch network in the current double-flow binary inference layer according to the first feature map, the first sub-output and the second sub-output of the previous-stage double-flow binary inference layer, and two preset current weights;
if not, determining a first sub-input of a first branch network and a second sub-input of a second branch network in the current double-flow binary inference layer according to the first feature map and two preset current weights;
after the first sub-input and the second sub-input are obtained, causing the first branch network to binarize the first sub-input according to the first preset threshold and convolve it with the first preset weight to obtain a first sub-output of the current double-flow binary inference layer, and causing the second branch network to binarize the second sub-input according to the second preset threshold and convolve it with the second preset weight to obtain a second sub-output of the current double-flow binary inference layer;
determining an updated first feature map according to the first sub-input and the second sub-input;
judging whether the current double-flow binary inference layer is the last-stage double-flow binary inference layer; if not, inputting the updated first feature map, the first sub-output and the second sub-output into the next-stage double-flow binary inference layer;
and if so, carrying out weighted summation of the first sub-output, the second sub-output and the updated first feature map to obtain the second feature map output by the last-stage double-flow binary inference layer.
4. The image super-resolution reconstruction method according to claim 3, wherein the step of inputting the second feature map into the upsampling layer so that the upsampling layer upsamples the second feature map comprises:
inputting the third feature map into the upsampling layer, so that the upsampling layer performs neighborhood offset prediction according to the following formula:
$\hat{o}_r = W_r \otimes Q(a^l) + b_r, \quad r = 1, 2, \ldots, s^2$
where W_r and b_r respectively denote the weight and bias of the r-th of the s^2 parallel preset convolutions, a^l denotes the third feature map, Q(·) denotes the quantization operation, ô_r denotes the neighborhood offset prediction, and s denotes the magnification applied by the super-resolution reconstruction model to the image to be processed, with s > 1;
And summing the neighborhood offset prediction result and the third feature map respectively to obtain a fourth feature map, and performing sub-pixel reconstruction on the fourth feature map to obtain a super-resolution reconstructed image of the image to be processed.
5. The image super-resolution reconstruction method according to claim 4, further comprising, before the step of summing the neighborhood offset predictors and the third feature map:
copying each pixel value in the third feature map into s^2 copies, so that the size of the copied third feature map is s times that of the image to be processed.
6. The image super-resolution reconstruction method according to claim 1, wherein the super-resolution reconstruction model is trained by the following steps:
obtaining a plurality of training samples;
inputting a preset number of first-resolution image blocks into a neural network model to be trained, wherein the neural network model to be trained is a preset initial binary neural network model;
determining a loss value according to a super-resolution reconstruction result output by the neural network to be trained, a second resolution image block corresponding to each first resolution image block and a preset loss function;
judging whether the neural network model to be trained converges according to the loss value; if the model is converged, the neural network model to be trained is a trained super-resolution reconstruction model;
and if not, adjusting the network parameters of the neural network to be trained, and returning to the step of inputting the preset number of first resolution image blocks into the neural network model to be trained.
7. The image super-resolution reconstruction method according to claim 6, wherein the preset loss function is:
wherein M is the number of first-resolution image blocks, m denotes the m-th input first-resolution image block, α denotes the learning rate of the neural network to be trained, y_m denotes the second-resolution image block corresponding to the m-th input first-resolution image block, ŷ_m denotes the super-resolution reconstruction result of the m-th first-resolution image block output by the neural network model to be trained, and θ is a network parameter of the neural network model to be trained.
8. An image super-resolution reconstruction apparatus, comprising:
the acquisition module is used for acquiring an image to be processed;
the reconstruction module is used for inputting the image to be processed into a super-resolution reconstruction model, so that the super-resolution reconstruction model predicts each pixel point in the image to be processed and obtains a super-resolution reconstructed image of the image to be processed; the super-resolution reconstruction model is a binary neural network model obtained in advance by training with a plurality of training samples, wherein each training sample comprises a first-resolution image block and a second-resolution image block corresponding to the first-resolution image block;
the super-resolution reconstruction model comprises an input layer, a plurality of serially connected double-flow binary inference layers, an up-sampling layer and an output layer, wherein the double-flow binary inference layers comprise a first branch network and a second branch network, the first branch network comprises a first preset threshold value and a first preset weight, the second branch network comprises a second preset threshold value and a second preset weight, and the first preset weight and the second preset weight are both binary representations.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1 to 7 when executing a program stored in the memory.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110420575.9A CN113222813B (en) | 2021-04-19 | 2021-04-19 | Image super-resolution reconstruction method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110420575.9A CN113222813B (en) | 2021-04-19 | 2021-04-19 | Image super-resolution reconstruction method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113222813A true CN113222813A (en) | 2021-08-06 |
CN113222813B CN113222813B (en) | 2024-02-09 |
Family
ID=77087864
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110420575.9A Active CN113222813B (en) | 2021-04-19 | 2021-04-19 | Image super-resolution reconstruction method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113222813B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113762412A (en) * | 2021-09-26 | 2021-12-07 | 国网四川省电力公司电力科学研究院 | Power distribution network single-phase earth fault identification method, system, terminal and medium |
CN113992618A (en) * | 2021-10-21 | 2022-01-28 | 北京明略软件系统有限公司 | Super-resolution image processing method, system, electronic device, and storage medium |
CN115982418A (en) * | 2023-03-17 | 2023-04-18 | 亿铸科技(杭州)有限责任公司 | Method for improving super-division operation performance of AI (Artificial Intelligence) computing chip |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063565A (en) * | 2018-06-29 | 2018-12-21 | 中国科学院信息工程研究所 | A kind of low resolution face identification method and device |
CN111080528A (en) * | 2019-12-20 | 2020-04-28 | 北京金山云网络技术有限公司 | Image super-resolution and model training method, device, electronic equipment and medium |
EP3770847A1 (en) * | 2019-07-26 | 2021-01-27 | Beijing Xiaomi Mobile Software Co., Ltd. | Method and device for processing image, and storage medium |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063565A (en) * | 2018-06-29 | 2018-12-21 | 中国科学院信息工程研究所 | A kind of low resolution face identification method and device |
EP3770847A1 (en) * | 2019-07-26 | 2021-01-27 | Beijing Xiaomi Mobile Software Co., Ltd. | Method and device for processing image, and storage medium |
CN111080528A (en) * | 2019-12-20 | 2020-04-28 | 北京金山云网络技术有限公司 | Image super-resolution and model training method, device, electronic equipment and medium |
Non-Patent Citations (3)
Title |
---|
李现国;孙叶美;杨彦利;苗长云;: "基于中间层监督卷积神经网络的图像超分辨率重建", 中国图象图形学报, no. 07 * |
王万良;李卓蓉;: "生成式对抗网络研究进展", 通信学报, no. 02 * |
赵艳芹;童朝娣;张恒;: "基于LeNet-5卷积神经网络的车牌字符识别", 黑龙江科技大学学报, no. 03 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113762412A (en) * | 2021-09-26 | 2021-12-07 | 国网四川省电力公司电力科学研究院 | Power distribution network single-phase earth fault identification method, system, terminal and medium |
CN113992618A (en) * | 2021-10-21 | 2022-01-28 | 北京明略软件系统有限公司 | Super-resolution image processing method, system, electronic device, and storage medium |
CN113992618B (en) * | 2021-10-21 | 2023-09-15 | 北京明略软件系统有限公司 | Super-resolution image processing method, system, electronic device and storage medium |
CN115982418A (en) * | 2023-03-17 | 2023-04-18 | 亿铸科技(杭州)有限责任公司 | Method for improving super-division operation performance of AI (Artificial Intelligence) computing chip |
Also Published As
Publication number | Publication date |
---|---|
CN113222813B (en) | 2024-02-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113222813B (en) | Image super-resolution reconstruction method and device, electronic equipment and storage medium | |
CN110413812B (en) | Neural network model training method and device, electronic equipment and storage medium | |
CN110377740B (en) | Emotion polarity analysis method and device, electronic equipment and storage medium | |
CN110929865B (en) | Network quantification method, service processing method and related product | |
CN106855952B (en) | Neural network-based computing method and device | |
CN110298851B (en) | Training method and device for human body segmentation neural network | |
CN113327599B (en) | Voice recognition method, device, medium and electronic equipment | |
CN112270200B (en) | Text information translation method and device, electronic equipment and storage medium | |
US20190378001A1 (en) | Neural network hardware acceleration with stochastic adaptive resource allocation | |
CN110009101B (en) | Method and apparatus for generating a quantized neural network | |
CN113190872B (en) | Data protection method, network structure training method, device, medium and equipment | |
CN112966754B (en) | Sample screening method, sample screening device and terminal equipment | |
CN117894038A (en) | Method and device for generating object gesture in image | |
CN115456167B (en) | Lightweight model training method, image processing device and electronic equipment | |
CN112561050B (en) | Neural network model training method and device | |
CN111429388B (en) | Image processing method and device and terminal equipment | |
CN112800276B (en) | Video cover determining method, device, medium and equipment | |
CN110555861A (en) | optical flow calculation method and device and electronic equipment | |
CN113780534B (en) | Compression method, image generation method, device, equipment and medium of network model | |
KR102722476B1 (en) | Neural processing elements with increased precision | |
CN110222777B (en) | Image feature processing method and device, electronic equipment and storage medium | |
CN113593527A (en) | Acoustic feature generation, voice model training and voice recognition method and device | |
CN118331716B (en) | Intelligent migration method for calculation force under heterogeneous calculation force integrated system | |
CN112561778B (en) | Image stylization processing method, device, equipment and storage medium | |
CN114021010A (en) | Training method, device and equipment of information recommendation model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |