WO2023174416A1 - Video super-resolution method and device
- Publication number
- WO2023174416A1 (application PCT/CN2023/082228)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- super
- image block
- resolution
- image frame
- Prior art date
Classifications
- G06T3/4053: Scaling of whole images or parts thereof, e.g. expanding or contracting, based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/40: Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4038: Image mosaicing, e.g. composing plane images from plane sub-images
- G06T5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- H04N7/01: Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0102: Conversion of standards involving the resampling of the incoming video signal
- H04N7/0127: Conversion of standards by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter
- G06T2207/20216: Image averaging
- G06T2207/20221: Image fusion; Image merging
Definitions
- The present disclosure relates to the field of image processing technology, and in particular to a video super-resolution method and device.
- Video super-resolution technology restores high-resolution video from low-resolution video. Because video super-resolution has become a key service in video quality enhancement, it is one of the current research hotspots in the field of image processing.
- Embodiments of the present disclosure provide a video super-resolution method, including:
- generating the super-resolution image frame of the t-th image frame according to each super-resolution image block of the t-th image frame.
- Embodiments of the present disclosure provide a video super-resolution device, including:
- an image decomposition module, used to decompose the t-th image frame of the video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks respectively, where t and N are both positive integers;
- a sequence generation module, used to generate N image block sequences based on the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames, where the image blocks in each image block sequence are located at the same position in different image frames;
- a parameter calculation module, used to calculate the motion parameters of each image block sequence, where the motion parameters of any image block sequence characterize the optical flow between the image blocks of adjacent image frames in that image block sequence;
- a model determination module, used to determine the super-resolution network model corresponding to each image block sequence based on the motion parameters of that image block sequence;
- an image super-resolution module, used to super-resolve the image block of the t-th image frame in each image block sequence with the super-resolution network model corresponding to that image block sequence, obtaining each super-resolution image block of the t-th image frame;
- an image generation module, used to generate a super-resolution image frame of the t-th image frame based on each super-resolution image block of the t-th image frame.
- Embodiments of the present disclosure provide an electronic device, including a memory and a processor. The memory is used to store a computer program; the processor is used to cause the electronic device, when calling the computer program, to implement the video super-resolution method described in the first aspect or any optional implementation of the first aspect.
- Embodiments of the present disclosure provide a computer-readable storage medium storing a computer program which, when executed by a computing device, causes the computing device to implement the video super-resolution method described in the first aspect or any optional implementation of the first aspect.
- Embodiments of the present disclosure provide a computer program product which, when run on a computer, causes the computer to implement the video super-resolution method described in the first aspect or any optional implementation of the first aspect.
- Figure 1 is a flow chart of the steps of a video super-resolution method provided by an embodiment of the present disclosure.
- Figure 2 is a schematic diagram of image blocks obtained by decomposing image frames according to an embodiment of the present disclosure.
- Figure 3 is a schematic diagram of an image block sequence provided by an embodiment of the present disclosure.
- Figure 4 is a schematic diagram of splicing super-resolution image blocks provided by an embodiment of the present disclosure.
- Figure 5 is a schematic diagram of a model for implementing a super-resolution method provided by an embodiment of the present disclosure.
- Figure 6 is a schematic diagram of an adaptive super-resolution module provided by an embodiment of the present disclosure.
- Figure 7 is a schematic diagram of a first super-resolution network model provided by an embodiment of the present disclosure.
- Figure 8 is a schematic diagram of a second super-resolution network model provided by an embodiment of the present disclosure.
- Figure 9 is a schematic diagram of a third super-resolution network model provided by an embodiment of the present disclosure.
- Figure 10 is a schematic diagram of a video super-resolution device provided by an embodiment of the present disclosure.
- Figure 11 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present disclosure.
- Words such as “first” and “second” are used to distinguish identical or similar items that have basically the same functions and effects; they do not limit quantity or execution order.
- For example, the first feature image set and the second feature image set are named only to distinguish different feature image sets, not to limit the order of the feature image sets.
- Words such as “exemplary” or “such as” are used to represent examples, illustrations or explanations. Any embodiment or design described as “exemplary” or “such as” in the present disclosure is not intended to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the words “exemplary” or “such as” is intended to present the concept in a concrete manner. Furthermore, in the description of the embodiments of the present disclosure, unless otherwise specified, “plurality” means two or more.
- Sliding window video super-resolution network models take advantage of the fact that most image frames of a video are in motion.
- For each image frame in the video, its neighboring image frames can provide a large amount of temporal information, which the video super-resolution network model uses to super-resolve the current image frame.
- However, in some videos, some areas are always stationary objects or backgrounds.
- For such videos, using neighborhood image frames as input often fails to achieve an ideal video super-resolution effect, and the result may even be worse than super-resolution based on a single image frame.
- How to improve the super-resolution effect when the video contains temporally redundant information is therefore an urgent problem to be solved.
- The present disclosure provides a video super-resolution method and device for improving the super-resolution effect of videos.
- An embodiment of the present disclosure provides a video super-resolution method.
- The video super-resolution method provided by the embodiment of the disclosure includes the following steps S11 to S16:
- Here, t and N are both positive integers.
- Decomposing any image frame into N image blocks can be implemented as follows: a sampling window whose size equals the size of one image block slides with a preset step size, starting from the first pixel of the image frame, and samples each position of the image frame; each sampling area of the sampling window is taken as one image block, so that the image frame is decomposed into N image blocks.
- For example, if the t-th image frame of the video to be super-resolved includes 1024*512 pixels, the size of the sampling window is 72*72 and the step size is 64, then the t-th image frame can be decomposed into 16*8 image blocks. Each image block includes 72*72 pixels, and adjacent image blocks share an overlapping area 8 pixels wide.
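The sliding-window decomposition in the example above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names are hypothetical, the frame is a plain list of pixel rows, and the handling of the frame border (replicating edge pixels so that the last window fits) is an assumption, since the text does not say how 72*72 windows with step 64 cover a 1024*512 frame exactly.

```python
import math


def window_origins(length, window, step):
    """Offsets of a sliding window along one axis.

    Assumption: the last window may extend past the frame edge,
    in which case the frame is padded (see `decompose`).
    """
    if length <= window:
        return [0]
    count = math.ceil((length - window) / step) + 1
    return [i * step for i in range(count)]


def decompose(frame, window=72, step=64):
    """Split `frame` (a list of pixel rows) into overlapping blocks.

    Returns (blocks, origins); origins[i] is the (row, column) of the
    top-left pixel of blocks[i] in the original frame.
    """
    height, width = len(frame), len(frame[0])

    def pixel(y, x):
        # Assumed border handling: replicate the nearest edge pixel.
        return frame[min(y, height - 1)][min(x, width - 1)]

    blocks, origins = [], []
    for y0 in window_origins(height, window, step):
        for x0 in window_origins(width, window, step):
            blocks.append([[pixel(y, x) for x in range(x0, x0 + window)]
                           for y in range(y0, y0 + window)])
            origins.append((y0, x0))
    return blocks, origins


# The example from the text: 1024*512 pixels, window 72, step 64.
columns = window_origins(1024, 72, 64)   # 16 window positions across
rows = window_origins(512, 72, 64)       # 8 window positions down
```

With these settings the overlap between neighbouring blocks is the window size minus the step, i.e. 8 pixels, which matches the width stated in the example.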
- Each image block in an image block sequence is located at the same position in a different image frame.
- Take as an example the case where the neighborhood image frames of the t-th image frame 33 include the (t-2)-th image frame 31, the (t-1)-th image frame 32, the (t+1)-th image frame 34 and the (t+2)-th image frame 35. Each image block sequence then includes 5 image blocks, which are respectively image blocks of the (t-2)-th image frame 31, the (t-1)-th image frame 32, the t-th image frame 33, the (t+1)-th image frame 34 and the (t+2)-th image frame 35, and every image block in the same image block sequence has the same position in the image frame to which it belongs.
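Grouping co-located blocks into sequences amounts to a transpose of the per-frame block lists. The sketch below assumes each frame was decomposed in the same positional order, so that index k in every list refers to the same location; the function name is hypothetical.

```python
def build_block_sequences(blocks_per_frame):
    """Group co-located blocks across frames.

    blocks_per_frame: one list of N blocks per frame, ordered in time
    (e.g. frames t-2, t-1, t, t+1, t+2), all in the same positional order.
    Returns N sequences; sequence k holds block k of every frame.
    """
    n = len(blocks_per_frame[0])
    if any(len(blocks) != n for blocks in blocks_per_frame):
        raise ValueError("every frame must be decomposed into the same N blocks")
    return [list(sequence) for sequence in zip(*blocks_per_frame)]


# Five frames with three labelled blocks each.
frames = [[f"frame{f}_block{b}" for b in range(3)] for f in range(5)]
sequences = build_block_sequences(frames)
```

Each resulting sequence has as many entries as there are frames (5 here), and the block of the t-th image frame sits in the middle of the sequence.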
- The motion parameters of any image block sequence characterize the optical flow between the image blocks of adjacent image frames in that image block sequence.
- For example, if an image block sequence includes the image blocks of the (t-2)-th, (t-1)-th, t-th, (t+1)-th and (t+2)-th image frames, then the motion parameters of that image block sequence characterize the optical flow between the image blocks of the (t-2)-th and (t-1)-th image frames, of the (t-1)-th and t-th image frames, of the t-th and (t+1)-th image frames, and of the (t+1)-th and (t+2)-th image frames.
- Calculating the motion parameters of each image block sequence includes performing the following steps a to c for each image block sequence:
- Step a: Calculate the optical flow between the image blocks of each pair of adjacent image frames in the image block sequence.
- For example, if the image block sequence includes the image blocks of the (t-2)-th, (t-1)-th, t-th, (t+1)-th and (t+2)-th image frames, then calculate the optical flow between the image blocks of the (t-2)-th and (t-1)-th image frames, of the (t-1)-th and t-th image frames, of the t-th and (t+1)-th image frames, and of the (t+1)-th and (t+2)-th image frames.
- The optical flow between the image blocks of each pair of adjacent image frames in the image block sequence can be calculated with the Dense Inverse Search (DIS) optical flow algorithm.
- Step b: For the optical flow between the image blocks of each pair of adjacent image frames, calculate the average, over all pixels, of the absolute value of the optical flow at each pixel; this average is the motion parameter between the image blocks of those adjacent image frames.
- The motion parameter between the image blocks of every pair of adjacent image frames in the sequence is obtained in this way.
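Step b can be sketched as a mean over per-pixel flow vectors. The code below assumes the optical flow is already available as a grid of (dx, dy) displacements (for instance from a DIS implementation) and interprets "absolute value of the optical flow" as the Euclidean length of each vector; that interpretation, like the function names, is an assumption rather than something the text specifies.

```python
import math


def motion_parameter(flow):
    """Mean absolute optical flow of one block pair.

    flow: grid (list of rows) of (dx, dy) displacements, one per pixel.
    Assumption: "absolute value" is taken as the vector magnitude.
    """
    total, count = 0.0, 0
    for row in flow:
        for dx, dy in row:
            total += math.hypot(dx, dy)
            count += 1
    return total / count if count else 0.0


def sequence_motion_parameters(flows):
    """One motion parameter per pair of adjacent blocks in a sequence."""
    return [motion_parameter(flow) for flow in flows]


# A 2*2 block in which a single pixel moved by (3, 4).
flow = [[(3.0, 4.0), (0.0, 0.0)],
        [(0.0, 0.0), (0.0, 0.0)]]
parameter = motion_parameter(flow)  # 5 / 4 = 1.25
```

A block pair that does not move at all yields a motion parameter of 0, which is what lets the later threshold test identify temporally redundant regions.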
- When the neighborhood image frames of the t-th image frame include the (t-2)-th image frame, the (t-1)-th image frame, the (t+1)-th image frame and the (t+2)-th image frame, the above step S14 (determining the super-resolution network model corresponding to each image block sequence according to the motion parameters of each image block sequence) includes performing the following steps 1 to 5 for each image block sequence:
- Step 1: Determine whether the first motion parameter and the second motion parameter of the image block sequence are both smaller than a preset threshold.
- The first motion parameter is the motion parameter between the image block of the t-th image frame and the image block of the (t-1)-th image frame; the second motion parameter is the motion parameter between the image block of the (t+1)-th image frame and the image block of the t-th image frame.
- That is, step 1 judges separately whether each of these two motion parameters is less than the preset threshold.
- In step 1, if both the first motion parameter and the second motion parameter are less than the preset threshold, the following step 2 is performed.
- Step 2: Determine the super-resolution network model corresponding to the image block sequence as the first super-resolution network model.
- The first super-resolution network model is a single-frame super-resolution network model.
- In step 1, if the first motion parameter and/or the second motion parameter is greater than or equal to the preset threshold, the following step 3 is performed.
- Step 3: Determine whether the third motion parameter and the fourth motion parameter of the image block sequence are both less than the preset threshold.
- The third motion parameter is the motion parameter between the image block of the (t-2)-th image frame and the image block of the (t-1)-th image frame; the fourth motion parameter is the motion parameter between the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame.
- That is, step 3 judges separately whether each of these two motion parameters is less than the preset threshold.
- In step 3, if the third motion parameter and the fourth motion parameter are both smaller than the preset threshold, the following step 4 is performed.
- Step 4: Determine the super-resolution network model corresponding to the image block sequence as the second super-resolution network model.
- The second super-resolution network model super-resolves the image block of the t-th image frame based on the image block of the (t-1)-th image frame, the image block of the t-th image frame and the image block of the (t+1)-th image frame.
- In step 3, if the third motion parameter and/or the fourth motion parameter is greater than or equal to the preset threshold, the following step 5 is performed.
- Step 5: Determine the super-resolution network model corresponding to the image block sequence as the third super-resolution network model.
- The third super-resolution network model super-resolves the image block of the t-th image frame based on all image blocks in the image block sequence.
- In summary, the first to fourth motion parameters are, respectively, the motion parameters between the image blocks of the t-th and (t-1)-th image frames, of the (t+1)-th and t-th image frames, of the (t-2)-th and (t-1)-th image frames, and of the (t+1)-th and (t+2)-th image frames. Steps 1 to 5 above can then be expressed as follows: the first super-resolution network model is selected if the first and second motion parameters are both less than the preset threshold; otherwise, the second super-resolution network model is selected if the third and fourth motion parameters are both less than the preset threshold; otherwise, the third super-resolution network model is selected.
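Steps 1 to 5 reduce to a two-level threshold test. The sketch below restates that decision as a function; the function name, the argument names and the string labels for the three models are hypothetical, and the threshold value in the usage lines is purely illustrative.

```python
def select_model(first, second, third, fourth, threshold):
    """Choose a super-resolution model for one image block sequence.

    first:  motion parameter between the blocks of frames t and t-1
    second: motion parameter between the blocks of frames t+1 and t
    third:  motion parameter between the blocks of frames t-2 and t-1
    fourth: motion parameter between the blocks of frames t+1 and t+2
    """
    # Steps 1-2: both inner parameters are below the threshold,
    # so the single-frame model is selected.
    if first < threshold and second < threshold:
        return "first_model_single_frame"
    # Steps 3-4: both outer parameters are below the threshold,
    # so the three-frame model (t-1, t, t+1) is selected.
    if third < threshold and fourth < threshold:
        return "second_model_three_frames"
    # Step 5: otherwise use every block in the sequence.
    return "third_model_all_frames"


static_block = select_model(0.1, 0.2, 3.0, 3.0, threshold=1.0)
moving_block = select_model(2.5, 0.2, 1.7, 0.3, threshold=1.0)
```

Note that a parameter exactly equal to the threshold counts as motion, matching the "greater than or equal to" wording of steps 1 and 3.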
- Since one super-resolution image block of the t-th image frame can be obtained from each image block sequence, and there are N image block sequences in total, N super-resolution image blocks of the t-th image frame can be obtained.
- If adjacent image blocks among the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames do not have overlapping areas, the above step S16 (generating a super-resolution image frame of the t-th image frame based on each super-resolution image block of the t-th image frame) includes:
- splicing the super-resolution image blocks of the t-th image frame into the super-resolution image frame of the t-th image frame.
- If adjacent image blocks have overlapping areas, the above step S16 (generating a super-resolution image frame of the t-th image frame based on each super-resolution image block of the t-th image frame) includes:
- splicing the super-resolution image blocks of the t-th image frame, and setting the pixel value of each pixel in the overlapping area of the super-resolution image blocks in the spliced image to the average of the pixel values of the corresponding pixels in those super-resolution image blocks, thereby generating the super-resolution image frame of the t-th image frame.
- For example, in FIG. 4 the starting pixel column of the super-resolution image block 41 is the P-th column and its ending pixel column is the (P+m)-th column, while the starting pixel column of the super-resolution image block 42 is the (P+n)-th column and its ending pixel column is the (P+m+n)-th column.
- When the super-resolution image block 41 and the super-resolution image block 42 are spliced, the area 411 of the super-resolution image block 41 overlaps the area 421 of the super-resolution image block 42, so the pixel value of any pixel in the overlapping area 400 is the average of the pixel values of the corresponding pixels in the area 411 and the area 421.
- For example, the pixel value of the pixel (x1, y1) in the overlapping area 400 is the average of the pixel value of the pixel (x1, y1) in the area 411 and the pixel value of the pixel (x1, y1) in the area 421.
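Splicing with averaged overlaps can be sketched by accumulating a running sum and a coverage count per output pixel. This is an illustrative implementation under assumptions: the function name is hypothetical, blocks are lists of pixel rows, and each block's origin is given in the coordinates of the super-resolved frame (that is, the original origin multiplied by the upscaling factor). The same code also covers the non-overlapping case, where every count is 1.

```python
def stitch_blocks(blocks, origins, height, width):
    """Splice super-resolved blocks into one frame, averaging overlaps.

    blocks:  list of blocks, each a list of pixel rows
    origins: (row, column) of each block's top-left pixel in the output
    Pixels falling outside height*width (e.g. padding) are dropped.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for block, (y0, x0) in zip(blocks, origins):
        for dy, row in enumerate(block):
            y = y0 + dy
            if y >= height:
                break
            for dx, value in enumerate(row):
                x = x0 + dx
                if x >= width:
                    break
                total[y][x] += value
                count[y][x] += 1
    return [[total[y][x] / count[y][x] if count[y][x] else 0.0
             for x in range(width)] for y in range(height)]


# Two 2*3 blocks that overlap in one column (column index 2).
left = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
right = [[4.0, 4.0, 4.0], [4.0, 4.0, 4.0]]
frame = stitch_blocks([left, right], [(0, 0), (0, 2)], height=2, width=5)
# frame[0] is [2.0, 2.0, 3.0, 4.0, 4.0]: the shared column is averaged.
```

Dividing by the coverage count also handles corners where four blocks meet, since each pixel is averaged over however many blocks cover it.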
- FIG. 5 is a schematic structural diagram of a video super-resolution network used to implement the above video super-resolution method.
- The video super-resolution network includes an image decomposition module 51, a sequence generation module 52, a redundant information detection module 53, an adaptive super-resolution module 54 and an image splicing module 55.
- The image decomposition module 51 is used to decompose each of the (t-2)-th image frame I_{t-2}, the (t-1)-th image frame I_{t-1}, the t-th image frame I_t, the (t+1)-th image frame I_{t+1} and the (t+2)-th image frame I_{t+2} into N image blocks.
- The sequence generation module 52 is used to generate N image block sequences from the image blocks of these five image frames.
- The redundant information detection module 53 is used to calculate the motion parameters of each image block sequence and to determine the super-resolution network model of each image block sequence according to those motion parameters.
- The adaptive super-resolution module 54 includes the super-resolution network models corresponding to the image block sequences, and is used to super-resolve the image block of the t-th image frame in each image block sequence with the corresponding super-resolution network model, obtaining each super-resolution image block of the t-th image frame.
- The image splicing module 55 is used to generate the super-resolution image frame O_t of the t-th image frame from the super-resolution image blocks of the t-th image frame.
- When the video super-resolution method provided by the embodiment of the present disclosure super-resolves the t-th image frame, it first decomposes the t-th image frame of the video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each, and generates N image block sequences from the resulting image blocks. It then calculates the motion parameters of each image block sequence and determines the super-resolution network model corresponding to each image block sequence according to those motion parameters. Finally, it uses the super-resolution network model corresponding to each image block sequence to super-resolve the image block of the t-th image frame in that sequence, obtains each super-resolution image block of the t-th image frame, and generates the super-resolution image frame of the t-th image frame from those super-resolution image blocks.
- Because the super-resolution network model is determined for each image block sequence according to its motion parameters, different super-resolution network models are applied to image block sequences in different situations. Therefore, the video super-resolution method provided by the embodiments of the present disclosure can improve the super-resolution effect of the video.
- The adaptive super-resolution module 54 shown in FIG. 5 includes a first super-resolution network model 541, a second super-resolution network model 542 and a third super-resolution network model 543.
- The first super-resolution network model 541 uses only the image block of the t-th image frame when super-resolving the image block of the t-th image frame.
- The second super-resolution network model 542 uses the image block of the (t-1)-th image frame, the image block of the t-th image frame and the image block of the (t+1)-th image frame when super-resolving the image block of the t-th image frame.
- The third super-resolution network model 543 uses all image blocks in the image block sequence when super-resolving the image block of the t-th image frame.
- Super-resolving the image block of the t-th image frame through the first super-resolution network model includes the following steps I to IV:
- Step I: Process the image block of the t-th image frame through the pyramid, cascading and deformable convolutions (PCD) alignment module 71 to obtain the first feature T_1.
- The input of the PCD alignment module 71 is two image blocks, whereas the input in step I includes only one image block (the image block of the t-th image frame). The image block of the t-th image frame can therefore be copied, and the copy is used together with the original image block as the input of the PCD alignment module.
- Step II: Process the first feature T_1 through the feature fusion module 72 to obtain the second feature T_2.
- The second feature is a feature obtained by splicing five of the first features in the channel dimension.
- More generally, the second feature may be a feature obtained by splicing a plurality of first features in the channel dimension, and the number of first features used for splicing is not limited.
- For example, if the tensor of the first feature is C*H*W, the tensor of the second feature is 5*C*H*W.
- Here, C is the number of channels of the first feature, H is the height of the first feature, and W is the width of the first feature.
- The feature fusion module 72 may include a temporal attention unit 721, a feature copy unit 722, a feature fusion unit 723 and a spatial attention unit 724.
- The feature copy unit 722 is used to copy the first feature 4 times and splice the copies with the original first feature.
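The shape bookkeeping of the feature copy unit can be illustrated with plain lists. In this sketch a feature is represented as a list of C channel maps (each H*W), and the function name is hypothetical; a real network would use a tensor library's concatenation along the channel axis instead.

```python
def copy_and_splice(feature, copies=4):
    """Splice `copies` duplicates of a feature with the original.

    feature: list of C channel maps. The result has (copies + 1) * C
    channel maps, i.e. 5*C channels for the default of 4 copies.
    """
    return [channel for _ in range(copies + 1) for channel in feature]


# A first feature with C = 2 channels of size 1*1.
first_feature = [[[0.5]], [[1.5]]]
second_feature = copy_and_splice(first_feature)
channels = len(second_feature)  # 5 * 2 = 10
```

One plausible reading is that this replication gives the single-frame model's features the same channel count as the five-block case, but the text does not state the motivation.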
- Step III: Reconstruct the second feature T_2 through the reconstruction module 73 to obtain the first image block B_1.
- Step IV: Upsample the first image block B_1 through the upsampling module 74 to obtain the super-resolution image block corresponding to the image block of the t-th image frame.
- Super-resolving the image block of the t-th image frame through the second super-resolution network model includes the following steps i to iv:
- Step i: Process the image block of the (t-1)-th image frame, the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module 81 to obtain the third feature T_3.
- The third feature T_3 is a feature obtained by splicing the fourth feature T_4, the fifth feature T_5 and the sixth feature T_6 in the channel dimension. The fourth feature T_4 is obtained by processing the image block of the (t-1)-th image frame and the image block of the t-th image frame through the PCD alignment module; the fifth feature T_5 is obtained by processing the image block of the t-th image frame through the PCD alignment module; the sixth feature T_6 is obtained by processing the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module.
- The PCD alignment module 81 includes a first PCD alignment unit 811, a second PCD alignment unit 812, a third PCD alignment unit 813 and a splicing unit 814.
- The first PCD alignment unit 811 is used to process the image block of the (t-1)-th image frame and the image block of the t-th image frame to obtain the fourth feature T_4.
- The second PCD alignment unit 812 is used to process the image block of the t-th image frame to obtain the fifth feature T_5.
- The third PCD alignment unit 813 is used to process the image block of the t-th image frame and the image block of the (t+1)-th image frame to obtain the sixth feature T_6.
- The splicing unit 814 is used to splice the fourth feature T_4, the fifth feature T_5 and the sixth feature T_6 to obtain the third feature T_3.
- Step ii: Process the third feature T_3 through the feature fusion module 82 to obtain the seventh feature T_7.
- The seventh feature T_7 is a feature obtained by splicing the fourth feature T_4, the third feature T_3 and the fifth feature T_5 in the channel dimension.
- The feature fusion module 82 may include a temporal attention unit 821, a feature copy unit 822, a feature fusion unit 823 and a spatial attention unit 824.
- The feature copy unit 822 is used to copy the fourth feature T_4 and the fifth feature T_5 in the third feature T_3 once each, and splice the copies with the third feature T_3.
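The splicing order described for the second model can be traced with labelled placeholders. The sketch below represents each feature as a list of channel labels and follows the order given in the text (T_4, then T_3, then T_5); the helper names are hypothetical.

```python
def splice(*features):
    """Concatenate features along the channel dimension (lists of channels)."""
    return [channel for feature in features for channel in feature]


def build_seventh_feature(t4, t5, t6):
    """T_3 = [T_4, T_5, T_6]; T_7 = [T_4, T_3, T_5], as described in the text."""
    t3 = splice(t4, t5, t6)
    return splice(t4, t3, t5)


# One labelled channel per feature, to make the ordering visible.
t7 = build_seventh_feature(["T4"], ["T5"], ["T6"])
# t7 is ["T4", "T4", "T5", "T6", "T5"]
```

With equally sized T_4, T_5 and T_6, the seventh feature therefore has five feature-sized parts, the same count as the second feature of the first model.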
- Step iii: Reconstruct the seventh feature T_7 through the reconstruction module 83 to obtain the second image block B_2.
- Step iv: Upsample the second image block B_2 through the upsampling module 84 to obtain the super-resolution image block corresponding to the image block of the t-th image frame.
- Super-resolving the image block of the t-th image frame in the image block sequence through the third super-resolution network model includes the following steps 1 to 4:
- Step 1: Process all image blocks in the image block sequence through the PCD alignment module 91 to obtain the eighth feature T_8.
- The eighth feature T_8 is obtained by splicing the ninth feature T_9, the tenth feature T_10, the eleventh feature T_11, the twelfth feature T_12 and the thirteenth feature T_13 in the channel dimension.
- The ninth feature T_9 is obtained by processing the image block of the (t-2)-th image frame and the image block of the (t-1)-th image frame through the PCD alignment module.
- The tenth feature T_10 is obtained by processing the image block of the (t-1)-th image frame and the image block of the t-th image frame through the PCD alignment module.
- The eleventh feature T_11 is obtained by processing the image block of the t-th image frame through the PCD alignment module.
- The twelfth feature T_12 is obtained by processing the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module.
- The thirteenth feature T_13 is obtained by processing the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame through the PCD alignment module.
- the PCD alignment module 91 includes a first PCD alignment unit 911 , a second PCD alignment unit 912 , a third PCD alignment unit 913 , a fourth PCD alignment unit 914 , a fifth PCD alignment unit 915 and Splicing unit 916.
- the first PCD alignment unit 911 is used to process the image block of the t-2th image frame and the image block of the t-1th image frame to obtain the ninth feature T 9 ;
- the second PCD alignment unit 912 is used to process the image block of the t-1th image frame and the image block of the t-th image frame to obtain the tenth feature T 10 ;
- the third PCD alignment unit 913 is used to process the image block of the t-th image frame to obtain the eleventh feature T 11 ;
- the fourth PCD alignment unit 914 is used to process the image block of the t-th image frame and the image block of the t+1th image frame to obtain the twelfth feature T 12 ;
- the fifth PCD alignment unit 915 is used to process the image block of the t+1th image frame and the image block of the t+2th image frame to obtain the thirteenth feature T 13 ;
- the splicing unit 916 is used to splice the ninth feature T 9 , the tenth feature T 10 , the eleventh feature T 11 , the twelfth feature T 12 and the thirteenth feature T 13 to obtain the eighth feature T 8 .
- Step 2 Process the eighth feature T 8 through the feature fusion module 92 to obtain the fourteenth feature T 14 .
- the feature fusion module 92 may include a temporal attention unit 921 , a feature fusion unit 922 and a spatial attention unit 923 .
- Step 3 Reconstruct the fourteenth feature T 14 through the reconstruction module 93 to obtain the third image block B 3 .
- Step 4 Upsample the third image block B 3 through the upsampling module 94 to obtain the super-resolution image block corresponding to the image block of the t-th image frame in the image block sequence.
- an embodiment of the present disclosure also provides a video super-resolution device.
- the device embodiment corresponds to the foregoing method embodiment.
- for ease of reading, the device embodiment does not repeat the details of the foregoing method embodiments one by one, but it should be clear that the video super-resolution device in this embodiment can correspondingly implement all the contents of the foregoing method embodiments.
- FIG. 10 is a schematic structural diagram of the video super-resolution device. As shown in Figure 10, the video super-resolution device 100 includes:
- the image decomposition module 101 is used to decompose the t-th image frame of the video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each; t and N are both positive integers;
- the sequence generation module 102 is configured to generate N image block sequences based on the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames; the image blocks in each image block sequence are located at the same position in different image frames;
- the parameter calculation module 103 is used to calculate the motion parameters of each image block sequence; the motion parameters of any image block sequence are used to characterize the optical flow between the image blocks of each adjacent image frame in the image block sequence;
- the model determination module 104 is used to determine the super-resolution network model corresponding to each image block sequence according to the motion parameters of each image block sequence;
- the image super-resolution module 105 is used to super-resolve the image block of the t-th image frame in each image block sequence with the super-resolution network model corresponding to that image block sequence, and obtain each super-resolution image block of the t-th image frame;
- the image generation module 106 is used to generate a super-resolution image frame of the t-th image frame based on each super-resolution image block of the t-th image frame.
- the parameter calculation module 103 is specifically configured to calculate, for each image block sequence, the optical flow between the image blocks of each pair of adjacent image frames in the image block sequence; for the optical flow between the image blocks of each pair of adjacent image frames, calculate the average of the absolute values of the optical flow at each pixel to obtain the motion parameter between the image blocks of those adjacent image frames; and obtain the motion parameter of the image block sequence from the motion parameters between the image blocks of adjacent image frames in the image block sequence.
- the neighbor image frames of the t-th image frame include:
- the t-2th image frame, the t-1th image frame, the t+1th image frame and the t+2th image frame of the video to be super-resolved.
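- the model determination module 104 is specifically configured to determine, for each image block sequence, whether the first motion parameter and the second motion parameter of the image block sequence are both smaller than a preset threshold
- the first motion parameter is the motion parameter between the image block of the t-th image frame and the image block of the t-1th image frame
- the second motion parameter is the motion parameter between the image block of the t-th image frame and the image block of the t+1th image frame
- if the first motion parameter and the second motion parameter are both smaller than the preset threshold, the super-resolution network model corresponding to the image block sequence is determined to be the first super-resolution network model; if the first motion parameter and/or the second motion parameter is greater than or equal to the preset threshold, it is determined whether the third motion parameter and the fourth motion parameter of the image block sequence are both smaller than the preset threshold
- the third motion parameter is the motion parameter between the image block of the t-2th image frame and the image block of the t-1th image frame
- the fourth motion parameter is the motion parameter between the image block of the t+1th image frame and the image block of the t+2th image frame; if the third motion parameter and the fourth motion parameter are both smaller than the preset threshold, the super-resolution network model corresponding to the image block sequence is determined to be the second super-resolution network model; if the third motion parameter and/or the fourth motion parameter is greater than or equal to the preset threshold, the super-resolution network model corresponding to the image block sequence is determined to be the third super-resolution network model.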
- the first super-resolution network model is a single-frame super-resolution network model
- the second super-resolution network model is used to super-resolve the image block of the t-th image frame based on the image block of the t-1th image frame, the image block of the t-th image frame, and the image block of the t+1th image frame;
- the third super-resolution network model is used to perform super-resolution on the image block of the t-th image frame based on all image blocks in the image block sequence.
- the image super-resolution module 105 is specifically configured to process the image block of the t-th image frame through a pyramid, cascading and deformable convolution (PCD) alignment module to obtain the first feature;
- the first feature is processed through the feature fusion module to obtain the second feature, which is a feature obtained by splicing five first features in the channel dimension;
- the second feature is reconstructed through the reconstruction module to obtain the first image block; the first image block is upsampled through the upsampling module to obtain the super-resolution image block corresponding to the image block of the t-th image frame.
- the image super-resolution module 105 is specifically configured to process the image block of the t-1th image frame, the image block of the t-th image frame and the image block of the t+1th image frame through the PCD alignment module to obtain the third feature; the third feature is a feature obtained by splicing the fourth feature, the fifth feature and the sixth feature in the channel dimension.
- the fourth feature is a feature obtained by processing the image block of the t-1th image frame and the image block of the t-th image frame through the PCD alignment module
- the fifth feature is a feature obtained by processing the image block of the t-th image frame through the PCD alignment module
- the sixth feature is a feature obtained by processing the image block of the t-th image frame and the image block of the t+1th image frame through the PCD alignment module.
- the third feature is processed through the feature fusion module to obtain a seventh feature, which is a feature obtained by splicing the fourth feature, the third feature and the fifth feature in the channel dimension; the seventh feature is reconstructed through the reconstruction module to obtain the second image block; the second image block is upsampled through the upsampling module to obtain the super-resolution image block corresponding to the image block of the t-th image frame.
- the image super-resolution module 105 is specifically used to process all image blocks in the image block sequence through the PCD alignment module to obtain the eighth feature; the eighth feature is a feature obtained by splicing the ninth feature, the tenth feature, the eleventh feature, the twelfth feature and the thirteenth feature in the channel dimension.
- the ninth feature is a feature obtained by processing the image block of the t-2th image frame and the image block of the t-1th image frame through the PCD alignment module
- the tenth feature is a feature obtained by processing the image block of the t-1th image frame and the image block of the t-th image frame through the PCD alignment module
- the eleventh feature is a feature obtained by processing the image block of the t-th image frame through the PCD alignment module
- the twelfth feature is a feature obtained by processing the image block of the t-th image frame and the image block of the t+1th image frame through the PCD alignment module
- the thirteenth feature is a feature obtained by processing the image block of the t+1th image frame and the image block of the t+2th image frame through the PCD alignment module; the eighth feature is processed through the feature fusion module to obtain the fourteenth feature; the fourteenth feature is reconstructed through the reconstruction module to obtain the third image block; the third image block is upsampled through the upsampling module to obtain the super-resolution image block corresponding to the image block of the t-th image frame in the image block sequence.
- adjacent image blocks among the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames have overlapping areas
- the image generation module 106 is specifically used to splice the super-resolution image blocks of the t-th image frame to generate a spliced image, and to set the pixel value of each pixel in the overlapping areas of the super-resolution image blocks in the spliced image to the average of the pixel values of the corresponding pixels in those super-resolution image blocks, thereby generating the super-resolution image frame of the t-th image frame.
- the above modules may be implemented as software components executing on one or more general-purpose processors, or as hardware that performs certain functions or combinations thereof, such as programmable logic devices and/or application-specific integrated circuits.
- these modules may be embodied in the form of software products, which may be stored in non-volatile storage media that include instructions enabling a computer device (e.g., a personal computer, server, network device, mobile terminal, etc.) to implement the methods described in the embodiments of the present disclosure.
- the above module can also be implemented on a single device or distributed on multiple devices. The functionality of these modules can be combined with each other or further split into sub-modules.
- the video super-resolution device provided in this embodiment can perform the video super-resolution method provided in the above method embodiment. Its implementation principles and technical effects are similar and will not be described again here.
- FIG 11 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
- the electronic device provided by this embodiment includes: a memory 111 and a processor 112.
- the memory 111 is used to store computer programs; the processor 112 is configured to execute the video super-resolution method provided by the above embodiments when calling a computer program.
- embodiments of the present disclosure also provide a computer-readable storage medium.
- the computer-readable storage medium stores a computer program.
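- when the computer program is executed by a processor, the computing device implements the video super-resolution method provided by the above embodiments.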
- embodiments of the present disclosure also provide a computer program product.
- when the computer program product runs on a computer, the computing device implements the video super-resolution method provided by the above embodiments.
- embodiments of the present disclosure also provide a computer program, which includes instructions that, when executed by a processor, cause the processor to execute the video super-resolution method provided by the above embodiments.
- embodiments of the present disclosure may be provided as methods, systems, or computer program products. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein.
- the processor can be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
- a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
- Memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
- Computer-readable media includes permanent and non-permanent, removable and non-removable storage media.
- Storage media can be implemented by any method or technology to store information, and the information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
- computer-readable media does not include transitory media, such as modulated data signals and carrier waves.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Television Systems (AREA)
Abstract
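Embodiments of the present disclosure provide a video super-resolution method and device, relating to the field of image processing technology. The method includes: decomposing the t-th image frame of a video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each; generating N image block sequences, where the image blocks in each image block sequence are located at the same position in different image frames; calculating the motion parameter of each image block sequence, where the motion parameter of an image block sequence includes the motion parameters between the image blocks of adjacent image frames in that sequence; determining, according to the motion parameter of each image block sequence, the super-resolution network model corresponding to that sequence; super-resolving the image block of the t-th image frame in each image block sequence with the corresponding super-resolution network model to obtain the super-resolution image blocks of the t-th image frame; and generating the super-resolution image frame of the t-th image frame from the super-resolution image blocks of the t-th image frame.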
Description
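Cross-Reference to Related Applications
The present disclosure is based on and claims priority to Chinese application No. 202210265124.7, filed on March 17, 2022, the disclosure of which is incorporated herein in its entirety.
The present disclosure relates to the field of image processing technology, and in particular to a video super-resolution method and device.
Video super-resolution technology, also called video SR technology, recovers a high-resolution video from a low-resolution video. Because video super-resolution has become a key service in video quality enhancement, it is currently one of the research hotspots in the field of image processing.
In recent years, with the development of deep learning, video super-resolution network models based on deep neural networks have achieved many breakthroughs, including better super-resolution results and better real-time performance. At present, mainstream sliding-window video super-resolution network models all rely on the fact that most image frames of a video are in motion, so that when each image frame of the video is super-resolved, its neighborhood image frames can provide a large amount of temporal information for the model to use in super-resolving the current image frame.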
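Summary
In a first aspect, embodiments of the present disclosure provide a video super-resolution method, including:
decomposing the t-th image frame of a video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each, where t and N are both positive integers;
generating N image block sequences from the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames, where the image blocks in each image block sequence are located at the same position in different image frames;
calculating the motion parameter of each image block sequence, where the motion parameter of any image block sequence characterizes the optical flow between the image blocks of adjacent image frames in that image block sequence;
determining, according to the motion parameter of each image block sequence, the super-resolution network model corresponding to each image block sequence;
super-resolving the image block of the t-th image frame in each image block sequence with the super-resolution network model corresponding to that image block sequence, to obtain the super-resolution image blocks of the t-th image frame;
generating the super-resolution image frame of the t-th image frame from the super-resolution image blocks of the t-th image frame.
In a second aspect, embodiments of the present disclosure provide a video super-resolution device, including:
an image decomposition module, configured to decompose the t-th image frame of a video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each, where t and N are both positive integers;
a sequence generation module, configured to generate N image block sequences from the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames, where the image blocks in each image block sequence are located at the same position in different image frames;
a parameter calculation module, configured to calculate the motion parameter of each image block sequence, where the motion parameter of any image block sequence characterizes the optical flow between the image blocks of adjacent image frames in that image block sequence;
a model determination module, configured to determine, according to the motion parameter of each image block sequence, the super-resolution network model corresponding to each image block sequence;
an image super-resolution module, configured to super-resolve the image block of the t-th image frame in each image block sequence with the super-resolution network model corresponding to that image block sequence, to obtain the super-resolution image blocks of the t-th image frame;
an image generation module, configured to generate the super-resolution image frame of the t-th image frame from the super-resolution image blocks of the t-th image frame.
In a third aspect, embodiments of the present disclosure provide an electronic device, including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to, when calling the computer program, cause the electronic device to implement the video super-resolution method of the first aspect or any optional implementation of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium storing a computer program which, when executed by a computing device, causes the computing device to implement the video super-resolution method of the first aspect or any optional implementation of the first aspect.
In a fifth aspect, embodiments of the present disclosure provide a computer program product which, when run on a computer, causes the computer to implement the video super-resolution method of the first aspect or any optional implementation of the first aspect.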
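The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
To describe the technical solutions in the embodiments of the present disclosure or in the related art more clearly, the drawings needed for describing the embodiments or the related art are briefly introduced below. Obviously, those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of the steps of the video super-resolution method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the image blocks obtained by decomposing an image frame according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an image block sequence provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the manner of splicing image blocks provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a model for implementing the super-resolution method provided by an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of the adaptive super-resolution module provided by an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of the first super-resolution network model provided by an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of the second super-resolution network model provided by an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of the third super-resolution network model provided by an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of the video super-resolution device provided by an embodiment of the present disclosure;
FIG. 11 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present disclosure.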
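To make the above objects, features and advantages of the present disclosure clearer, the solutions of the present disclosure are further described below. It should be noted that, where no conflict arises, the embodiments of the present disclosure and the features in the embodiments may be combined with one another.
Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure may also be implemented in ways other than those described here. Obviously, the embodiments in the specification are only some, not all, of the embodiments of the present disclosure.
It should be noted that, to describe the technical solutions of the embodiments of the present disclosure clearly, words such as "first" and "second" are used in the embodiments to distinguish identical or similar items whose functions and effects are basically the same. Those skilled in the art will understand that words such as "first" and "second" do not limit quantity or execution order. For example, a first feature image set and a second feature image set are named so only to distinguish different feature image sets, not to limit their order.
In the embodiments of the present disclosure, words such as "exemplary" or "for example" are used to indicate an example, illustration or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present disclosure should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, such words are intended to present the related concept in a concrete manner. In addition, in the description of the embodiments of the present disclosure, unless otherwise stated, "multiple" means two or more.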
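In the related art, sliding-window video super-resolution network models all rely on the fact that most image frames of a video are in motion, so that when each image frame of the video is super-resolved, its neighborhood image frames can provide a large amount of temporal information for the model to use in super-resolving the current image frame. However, in some videos, part of the picture is always a stationary object or background. When such videos are super-resolved, the temporally redundant information introduced by the stationary object or background means that using neighborhood image frames as input often fails to achieve a satisfactory super-resolution result, and the result may even be worse than that of super-resolution based on a single image frame. In summary, how to improve the super-resolution result when a video contains temporally redundant information is a problem to be solved urgently.
In view of this, the present disclosure provides a video super-resolution method and device for improving the super-resolution result of a video.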
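An embodiment of the present disclosure provides a video super-resolution method. Referring to the flowchart shown in FIG. 1, the video super-resolution method provided by the embodiment of the present disclosure includes the following steps S11 to S16:
S11. Decompose the t-th image frame of the video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each.
Here, t and N are both positive integers.
In some embodiments, decomposing any image frame into N image blocks is implemented as follows: a sampling window whose size equals the size of one image block is slid with a preset stride, starting from the first pixel of the image frame, to sample each position of the image frame, and each sampling region of the sampling window is taken as one image block, so that the image frame is decomposed into N image blocks.
Illustratively, referring to FIG. 2, the t-th image frame of the video to be super-resolved includes 1024*512 pixels. When the sampling window size is 72*72 and the stride is 64, the t-th image frame can be decomposed into 16*8 image blocks, each including 72*72 pixels; adjacent image blocks have an overlapping area, and the overlapping area is 8 pixels wide.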
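The arithmetic of this example can be checked with a small sketch. The function names `block_starts` and `block_grid` are hypothetical, and the border handling is an assumption: 16 windows of width 72 at stride 64 cover 1032 pixels, so the sketch assumes the last window runs 8 pixels past the frame border and the frame is padded there. The source does not say how the border is treated.

```python
import math


def block_starts(length: int, window: int, stride: int) -> list[int]:
    """Return the start offset of every sampling window along one axis.

    Assumption: when the windows do not tile the axis exactly, the last
    window runs past the border and the frame is padded there.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    steps = math.ceil(max(length - window, 0) / stride)
    return [i * stride for i in range(steps + 1)]


def block_grid(width: int, height: int, window: int, stride: int) -> list[tuple[int, int]]:
    """Top-left (x, y) corner of every image block, in row-major order."""
    xs = block_starts(width, window, stride)
    ys = block_starts(height, window, stride)
    return [(x, y) for y in ys for x in xs]


grid = block_grid(1024, 512, window=72, stride=64)
columns = len(block_starts(1024, 72, 64))  # image blocks per row
rows = len(block_starts(512, 72, 64))      # image blocks per column
overlap = 72 - 64                          # pixels shared by adjacent blocks
padding = block_starts(1024, 72, 64)[-1] + 72 - 1024  # pixels past the right border
```

Under these assumptions the grid has 16 columns and 8 rows (N = 128), and the overlap equals the window size minus the stride.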
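S12. Generate N image block sequences from the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames.
Here, the image blocks in each image block sequence are located at the same position in different image frames.
Illustratively, FIG. 3 shows an example in which the neighborhood image frames of the t-th image frame 33 include the (t-2)-th image frame 31, the (t-1)-th image frame 32, the (t+1)-th image frame 34 and the (t+2)-th image frame 35. Each image block sequence includes five image blocks, which are image blocks of the (t-2)-th image frame 31, the (t-1)-th image frame 32, the t-th image frame 33, the (t+1)-th image frame 34 and the (t+2)-th image frame 35 respectively, and the image blocks in the same image block sequence are at the same position in the image frames to which they belong.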
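A minimal sketch of this grouping follows, assuming each frame has already been decomposed into the same ordered list of N image blocks. The name `build_sequences` and the tuple placeholders standing in for pixel data are hypothetical.

```python
def build_sequences(frames_blocks):
    """Group co-located image blocks across frames.

    frames_blocks[k][n] is block n of frame k, with frames ordered
    t-2, t-1, t, t+1, t+2. Returns N sequences; sequence n holds
    block n of every frame, in frame order.
    """
    n_blocks = len(frames_blocks[0])
    if any(len(blocks) != n_blocks for blocks in frames_blocks):
        raise ValueError("every frame must be decomposed into the same N blocks")
    return [tuple(blocks[n] for blocks in frames_blocks) for n in range(n_blocks)]


# Placeholder blocks: (frame offset relative to t, block index).
frames = [[(offset, n) for n in range(3)] for offset in (-2, -1, 0, 1, 2)]
sequences = build_sequences(frames)
```

Each resulting sequence has five entries, and the middle entry (index 2) is the image block of the t-th image frame that will be super-resolved.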
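S13. Calculate the motion parameter of each image block sequence.
Here, the motion parameter of any image block sequence characterizes the optical flow between the image blocks of adjacent image frames in that image block sequence.
Illustratively, if the image block sequence includes the image block of the (t-2)-th image frame, the image block of the (t-1)-th image frame, the image block of the t-th image frame, the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame, then the motion parameter of the image block sequence characterizes the optical flow between the image blocks of the (t-2)-th and (t-1)-th image frames, between those of the (t-1)-th and t-th image frames, between those of the t-th and (t+1)-th image frames, and between those of the (t+1)-th and (t+2)-th image frames.
As an optional implementation of the embodiments of the present disclosure, calculating the motion parameter of each image block sequence includes performing the following steps a to c for each image block sequence:
Step a. Calculate the optical flow between the image blocks of each pair of adjacent image frames in the image block sequence.
Continuing the example above, for an image block sequence that includes the image blocks of the (t-2)-th, (t-1)-th, t-th, (t+1)-th and (t+2)-th image frames, the optical flow is calculated between the image blocks of the (t-2)-th and (t-1)-th image frames, between those of the (t-1)-th and t-th image frames, between those of the t-th and (t+1)-th image frames, and between those of the (t+1)-th and (t+2)-th image frames.
Illustratively, the optical flow between the image blocks of adjacent image frames in the image block sequence can be calculated with the Dense Inverse Search (DIS) optical flow algorithm.
Step b. For the optical flow between the image blocks of each pair of adjacent image frames, calculate the average of the absolute values of the optical flow at each pixel to obtain the motion parameter between the image blocks of those adjacent image frames.
Denote the optical flow algorithm by f(…), averaging the optical flow over all pixels by mean(…), and taking the absolute value by |…|. The motion parameter between the image blocks a and b of two adjacent image frames in the image block sequence is then mean(|f(a, b)|). The same expression applies to each pair of adjacent image frames in the image block sequence.
Step c. Obtain the motion parameter of the image block sequence from the motion parameters between the image blocks of adjacent image frames in the image block sequence.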
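Step b can be sketched as follows, taking the optical flow field as given, since computing it (for example with DIS) is outside this sketch. The representation of the flow as a grid of (dx, dy) displacements and the name `motion_parameter` are assumptions. The source does not specify whether the absolute value is taken per component or as a vector magnitude; this sketch assumes per component, averaged over both components.

```python
def motion_parameter(flow):
    """Mean absolute optical flow over all pixels of one block pair.

    flow[y][x] is the (dx, dy) displacement of pixel (x, y).
    Assumption: the absolute value is taken per component and the
    average runs over both components of every pixel.
    """
    total = 0.0
    count = 0
    for row in flow:
        for dx, dy in row:
            total += abs(dx) + abs(dy)
            count += 2
    if count == 0:
        raise ValueError("flow field is empty")
    return total / count


static_flow = [[(0.0, 0.0)] * 4 for _ in range(4)]   # stationary background
moving_flow = [[(1.5, -0.5)] * 4 for _ in range(4)]  # uniform motion
```

A stationary block pair yields a motion parameter of zero, which is what lets the later threshold test identify temporally redundant regions.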
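S14. Determine, according to the motion parameter of each image block sequence, the super-resolution network model corresponding to each image block sequence.
As an optional implementation of the embodiments of the present disclosure, the neighborhood image frames of the t-th image frame include the (t-2)-th image frame, the (t-1)-th image frame, the (t+1)-th image frame and the (t+2)-th image frame, and step S14 (determining, according to the motion parameter of each image block sequence, the super-resolution network model corresponding to each image block sequence) includes performing the following steps 1 to 5 for each image block sequence:
Step 1. Determine whether the first motion parameter and the second motion parameter of the image block sequence are both smaller than a preset threshold.
Here, the first motion parameter is the motion parameter between the image block of the t-th image frame and the image block of the (t-1)-th image frame, and the second motion parameter is the motion parameter between the image block of the t-th image frame and the image block of the (t+1)-th image frame.
That is, with the preset threshold denoted γ, step 1 checks whether the motion parameter between the image blocks of the t-th and (t-1)-th image frames and the motion parameter between the image blocks of the (t+1)-th and t-th image frames are each smaller than γ.
In step 1, if the first motion parameter and the second motion parameter are both smaller than the preset threshold, the following step 2 is performed.
Step 2. Determine that the super-resolution network model corresponding to the image block sequence is the first super-resolution network model.
As an optional implementation of the embodiments of the present disclosure, the first super-resolution network model is a single-frame super-resolution network model.
In step 1, if the first motion parameter and/or the second motion parameter is greater than or equal to the preset threshold, the following step 3 is performed.
Step 3. Determine whether the third motion parameter and the fourth motion parameter of the image block sequence are both smaller than the preset threshold.
Here, the third motion parameter is the motion parameter between the image block of the (t-2)-th image frame and the image block of the (t-1)-th image frame, and the fourth motion parameter is the motion parameter between the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame.
That is, with the preset threshold denoted γ, step 3 checks whether the motion parameter between the image blocks of the (t-2)-th and (t-1)-th image frames and the motion parameter between the image blocks of the (t+1)-th and (t+2)-th image frames are each smaller than γ.
In step 3, if the third motion parameter and the fourth motion parameter are both smaller than the preset threshold, the following step 4 is performed.
Step 4. Determine that the super-resolution network model corresponding to the image block sequence is the second super-resolution network model.
As an optional implementation of the embodiments of the present disclosure, the second super-resolution network model is used to super-resolve the image block of the t-th image frame based on the image block of the (t-1)-th image frame, the image block of the t-th image frame and the image block of the (t+1)-th image frame.
In step 3, if the third motion parameter and/or the fourth motion parameter is greater than or equal to the preset threshold, the following step 5 is performed.
Step 5. Determine that the super-resolution network model corresponding to the image block sequence is the third super-resolution network model.
As an optional implementation of the embodiments of the present disclosure, the third super-resolution network model is used to super-resolve the image block of the t-th image frame based on all image blocks in the image block sequence.
With the four motion parameters defined as above and the preset threshold denoted γ, steps 1 to 5 amount to the following selection rule: the first super-resolution network model is selected if the first and second motion parameters are both smaller than γ; otherwise, the second super-resolution network model is selected if the third and fourth motion parameters are both smaller than γ; otherwise, the third super-resolution network model is selected.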
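The selection rule of steps 1 to 5 can be written as a short function. The names `select_model`, `m1` to `m4` and `gamma`, and the string labels returned, are hypothetical; `m1` to `m4` stand for the first to fourth motion parameters and `gamma` for the preset threshold γ.

```python
def select_model(m1: float, m2: float, m3: float, m4: float, gamma: float) -> str:
    """Pick the super-resolution network model for one image block sequence.

    m1: motion parameter between the blocks of frames t and t-1
    m2: motion parameter between the blocks of frames t and t+1
    m3: motion parameter between the blocks of frames t-2 and t-1
    m4: motion parameter between the blocks of frames t+1 and t+2
    """
    if m1 < gamma and m2 < gamma:
        return "first"   # steps 1 and 2: single-frame model
    if m3 < gamma and m4 < gamma:
        return "second"  # steps 3 and 4: model using frames t-1, t, t+1
    return "third"       # step 5: model using all five image blocks


near_static = select_model(0.1, 0.2, 5.0, 5.0, gamma=1.0)
far_static = select_model(0.1, 1.0, 0.2, 0.3, gamma=1.0)
all_moving = select_model(2.0, 2.0, 0.5, 1.5, gamma=1.0)
```

Note that the comparisons are strict, so a motion parameter exactly equal to the threshold counts as motion, matching the "greater than or equal to" branches of steps 1 and 3.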
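S15. Super-resolve the image block of the t-th image frame in each image block sequence with the super-resolution network model corresponding to that image block sequence, to obtain the super-resolution image blocks of the t-th image frame.
Specifically, since one super-resolution image block of the t-th image frame can be obtained from each image block sequence and there are N image block sequences in total, N super-resolution image blocks of the t-th image frame can be obtained in total.
S16. Generate the super-resolution image frame of the t-th image frame from the super-resolution image blocks of the t-th image frame.
As an optional implementation of the embodiments of the present disclosure, when adjacent image blocks among the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames have no overlapping area, step S16 (generating the super-resolution image frame of the t-th image frame from the super-resolution image blocks of the t-th image frame) includes:
splicing the super-resolution image blocks of the t-th image frame into the super-resolution image frame of the t-th image frame.
As an optional implementation of the embodiments of the present disclosure, when adjacent image blocks among the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames have an overlapping area, step S16 (generating the super-resolution image frame of the t-th image frame from the super-resolution image blocks of the t-th image frame) includes:
splicing the super-resolution image blocks of the t-th image frame to generate a spliced image;
setting the pixel value of each pixel in the overlapping areas of the super-resolution image blocks in the spliced image to the average of the pixel values of the corresponding pixels in those super-resolution image blocks, to generate the super-resolution image frame of the t-th image frame.
Illustratively, referring to FIG. 4, the starting pixel column of super-resolution image block 41 is column P and its ending pixel column is column P+m, while the starting pixel column of super-resolution image block 42 is column P+n and its ending pixel column is column P+m+n. When super-resolution image block 41 and super-resolution image block 42 are spliced, area 411 of super-resolution image block 41 overlaps area 421 of super-resolution image block 42, so the pixel value of any pixel in overlapping area 400 is the average of the pixel values of the corresponding pixel in area 411 and in area 421. For example, the pixel value of pixel (x1, y1) in overlapping area 400 is the average of the pixel value of pixel (x1, y1) in area 411 and the pixel value of pixel (x1, y1) in area 421.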
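The overlap averaging can be sketched with an accumulate-and-divide scheme: sum every block's contribution per pixel, count how many blocks cover each pixel, then divide. This is one possible way to realize the averaging described here, not necessarily the applicant's implementation; the name `stitch_blocks` and the block representation (top-left corner plus a grid of pixel values) are assumptions.

```python
def stitch_blocks(blocks, width, height):
    """Splice blocks into one frame, averaging pixels in overlapping areas.

    blocks: iterable of (x0, y0, pixels), where pixels[y][x] is a value
    and (x0, y0) is the block's top-left corner in the output frame.
    Pixels covered by no block are left at 0.0.
    """
    total = [[0.0] * width for _ in range(height)]
    hits = [[0] * width for _ in range(height)]
    for x0, y0, pixels in blocks:
        for dy, row in enumerate(pixels):
            for dx, value in enumerate(row):
                x, y = x0 + dx, y0 + dy
                if 0 <= x < width and 0 <= y < height:
                    total[y][x] += value
                    hits[y][x] += 1
    return [
        [total[y][x] / hits[y][x] if hits[y][x] else 0.0 for x in range(width)]
        for y in range(height)
    ]


left = (0, 0, [[10.0, 10.0, 10.0]])   # covers columns 0..2
right = (2, 0, [[20.0, 20.0, 20.0]])  # covers columns 2..4
frame = stitch_blocks([left, right], width=5, height=1)
```

Counting hits per pixel also covers corner regions where more than two blocks overlap, and it discards any part of a block that falls outside the frame.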
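Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a video super-resolution network for implementing the above video super-resolution method. The video super-resolution network includes an image decomposition module 51, a sequence generation module 52, a redundant information detection module 53, an adaptive super-resolution module 54 and an image splicing module 55.
The image decomposition module 51 is configured to decompose the (t-2)-th image frame I_{t-2} into N image blocks, the (t-1)-th image frame I_{t-1} into N image blocks, the t-th image frame I_t into N image blocks, the (t+1)-th image frame I_{t+1} into N image blocks, and the (t+2)-th image frame I_{t+2} into N image blocks.
The sequence generation module 52 is configured to generate N image block sequences from these image blocks.
The redundant information detection module 53 is configured to calculate the motion parameter of each image block sequence and to determine the super-resolution network model of each image block sequence according to its motion parameter.
The adaptive super-resolution module 54 includes the super-resolution network models corresponding to the image block sequences, and is configured to super-resolve the image block of the t-th image frame in each image block sequence with the corresponding super-resolution network model, to obtain the super-resolution image blocks of the t-th image frame.
The image splicing module 55 is configured to generate the super-resolution image frame O_t of the t-th image frame from the super-resolution image blocks of the t-th image frame.
When super-resolving the t-th image frame, the video super-resolution method provided by the embodiments of the present disclosure first decomposes the t-th image frame of the video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each, and generates N image block sequences from the image blocks so obtained. It then calculates the motion parameter of each image block sequence, determines the super-resolution network model corresponding to each image block sequence according to that motion parameter, super-resolves the image block of the t-th image frame in each image block sequence with the corresponding super-resolution network model to obtain the super-resolution image blocks of the t-th image frame, and generates the super-resolution image frame of the t-th image frame from those super-resolution image blocks. Because the method determines the super-resolution network model for each image block sequence according to its motion parameter and adaptively applies different super-resolution network models to different situations, it can improve the super-resolution result of the video.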
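Illustratively, referring to FIG. 6, the adaptive super-resolution module 54 shown in FIG. 5 includes a first super-resolution network model 541, a second super-resolution network model 542 and a third super-resolution network model 543. When super-resolving the image block of the t-th image frame, the first super-resolution network model 541 uses the image block of the t-th image frame; the second super-resolution network model 542 uses the image block of the (t-1)-th image frame, the image block of the t-th image frame and the image block of the (t+1)-th image frame; and the third super-resolution network model 543 uses all image blocks in the image block sequence.
As an optional implementation of the embodiments of the present disclosure, referring to the schematic structural diagram of the first super-resolution network model shown in FIG. 7, super-resolving the image block of the t-th image frame through the first super-resolution network model includes the following steps I to IV:
Step I. Process the image block of the t-th image frame through a pyramid, cascading and deformable convolution (PCD) alignment module 71 to obtain the first feature T1.
Referring to FIG. 7, the PCD alignment module 71 takes two image blocks as input, whereas in step I the input includes only one image block (the image block of the t-th image frame). The image block of the t-th image frame can therefore be copied once, and the copy and the original image block are used together as the input of the PCD alignment module.
Step II. Process the first feature through the feature fusion module 72 to obtain the second feature T2.
Here, the second feature is a feature obtained by splicing five first features in the channel dimension. Those skilled in the art should understand that the second feature may be a feature obtained by splicing multiple first features in the channel dimension; the number of first features used for splicing is not limited here.
That is, if the tensor of the first feature is C*H*W, the tensor of the second feature is 5*C*H*W, where C is the number of channels of the first feature, H is the length of the first feature, and W is the width of the first feature.
Illustratively, referring to FIG. 7, the feature fusion module 72 may include a temporal attention unit 721, a feature copy unit 722, a feature fusion unit 723 and a spatial attention unit 724. The feature copy unit 722 is used to copy the first feature four times and splice the copies with the original first feature.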
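The shape bookkeeping of the feature copy unit can be illustrated with nested lists standing in for tensors. This is a sketch of the channel-dimension splicing only, not of the network; the names `concat_channels` and `replicate_first_feature` are hypothetical, and "5*C*H*W" is read here as a tensor with 5C channels of size H by W.

```python
def concat_channels(*features):
    """Splice features along the channel dimension.

    Each feature is a list of C channels, each channel an H x W grid.
    """
    channels = []
    for feature in features:
        channels.extend(feature)
    return channels


def replicate_first_feature(t1, copies=5):
    """Original first feature plus (copies - 1) duplicates, spliced by channel."""
    return concat_channels(*([t1] * copies))


# A first feature with C=4 channels, H=2 rows, W=3 columns.
t1 = [[[float(c)] * 3 for _ in range(2)] for c in range(4)]
t2 = replicate_first_feature(t1)
shape = (len(t2), len(t2[0]), len(t2[0][0]))
```

With C = 4 the spliced feature has 20 channels while H and W are unchanged. In a tensor library the same operation would be a concatenation along the channel axis.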
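Step III. Reconstruct the second feature T2 through the reconstruction module 73 to obtain the first image block B1.
Step IV. Upsample the first image block B1 through the upsampling module 74 to obtain the super-resolution image block corresponding to the image block of the t-th image frame.
As an optional implementation of the embodiments of the present disclosure, referring to the schematic structural diagram of the second super-resolution network model shown in FIG. 8, super-resolving the image block of the t-th image frame through the second super-resolution network model includes the following steps i to iv:
Step i. Process the image block of the (t-1)-th image frame, the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module 81 to obtain the third feature T3.
Here, the third feature T3 is a feature obtained by splicing the fourth feature T4, the fifth feature T5 and the sixth feature T6 in the channel dimension. The fourth feature T4 is a feature obtained by processing the image block of the (t-1)-th image frame and the image block of the t-th image frame through the PCD alignment module; the fifth feature T5 is a feature obtained by processing the image block of the t-th image frame through the PCD alignment module; and the sixth feature T6 is a feature obtained by processing the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module.
Referring to FIG. 8, the PCD alignment module 81 includes a first PCD alignment unit 811, a second PCD alignment unit 812, a third PCD alignment unit 813 and a splicing unit 814. The first PCD alignment unit 811 is used to process the image block of the (t-1)-th image frame and the image block of the t-th image frame to obtain the fourth feature T4; the second PCD alignment unit 812 is used to process the image block of the t-th image frame to obtain the fifth feature T5; the third PCD alignment unit 813 is used to process the image block of the t-th image frame and the image block of the (t+1)-th image frame to obtain the sixth feature T6; and the splicing unit 814 is used to splice the fourth feature T4, the fifth feature T5 and the sixth feature T6 to obtain the third feature T3.
Step ii. Process the third feature T3 through the feature fusion module 82 to obtain the seventh feature T7.
Here, the seventh feature T7 is a feature obtained by splicing the fourth feature T4, the third feature T3 and the fifth feature T5 in the channel dimension.
Illustratively, referring to FIG. 8, the feature fusion module 82 may include a temporal attention unit 821, a feature copy unit 822, a feature fusion unit 823 and a spatial attention unit 824. The feature copy unit 822 is used to copy the fourth feature T4 and the fifth feature T5 contained in the third feature T3 once each and splice the copies with the third feature T3.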
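The channel layout of T3 and T7 can be traced with labelled placeholder channels. The helper `concat_channels`, the per-feature channel count `C` and the string labels are hypothetical; only the splicing order (T4, T5, T6 for T3, then T4, T3, T5 for T7) comes from the text.

```python
def concat_channels(*features):
    """Splice features along the channel dimension (each is a list of channels)."""
    channels = []
    for feature in features:
        channels.extend(feature)
    return channels


C = 2  # assumed number of channels per aligned feature
t4 = [f"T4[{c}]" for c in range(C)]  # frames t-1 and t
t5 = [f"T5[{c}]" for c in range(C)]  # frame t alone
t6 = [f"T6[{c}]" for c in range(C)]  # frames t and t+1

t3 = concat_channels(t4, t5, t6)  # splicing unit 814: 3C channels
t7 = concat_channels(t4, t3, t5)  # feature copy unit 822: 5C channels
```

Under this reading T7 carries 5C channels, the same width as the second feature of the first model (five copies of T1) and the eighth feature of the third model (five aligned features). That would let the three models feed a fusion stage of identical input width, although the source does not state this as the motivation.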
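Step iii. Reconstruct the seventh feature T7 through the reconstruction module 83 to obtain the second image block B2.
Step iv. Upsample the second image block B2 through the upsampling module 84 to obtain the super-resolution image block corresponding to the image block of the t-th image frame.
As an optional implementation of the embodiments of the present disclosure, referring to the schematic structural diagram of the third super-resolution network model shown in FIG. 9, super-resolving the image block of the t-th image frame in the image block sequence through the third super-resolution network model includes the following steps ① to ④:
Step ①. Process all image blocks in the image block sequence through the PCD alignment module 91 to obtain the eighth feature T8.
Here, the eighth feature T8 is a feature obtained by splicing the ninth feature T9, the tenth feature T10, the eleventh feature T11, the twelfth feature T12 and the thirteenth feature T13 in the channel dimension. The ninth feature T9 is a feature obtained by processing the image block of the (t-2)-th image frame and the image block of the (t-1)-th image frame through the PCD alignment module; the tenth feature is a feature obtained by processing the image block of the (t-1)-th image frame and the image block of the t-th image frame through the PCD alignment module; the eleventh feature is a feature obtained by processing the image block of the t-th image frame through the PCD alignment module; the twelfth feature is a feature obtained by processing the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module; and the thirteenth feature is a feature obtained by processing the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame through the PCD alignment module.
Illustratively, referring to FIG. 9, the PCD alignment module 91 includes a first PCD alignment unit 911, a second PCD alignment unit 912, a third PCD alignment unit 913, a fourth PCD alignment unit 914, a fifth PCD alignment unit 915 and a splicing unit 916. The first PCD alignment unit 911 is used to process the image block of the (t-2)-th image frame and the image block of the (t-1)-th image frame to obtain the ninth feature T9; the second PCD alignment unit 912 is used to process the image block of the (t-1)-th image frame and the image block of the t-th image frame to obtain the tenth feature T10; the third PCD alignment unit 913 is used to process the image block of the t-th image frame to obtain the eleventh feature T11; the fourth PCD alignment unit 914 is used to process the image block of the t-th image frame and the image block of the (t+1)-th image frame to obtain the twelfth feature T12; the fifth PCD alignment unit 915 is used to process the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame to obtain the thirteenth feature T13; and the splicing unit 916 is used to splice the ninth feature T9, the tenth feature T10, the eleventh feature T11, the twelfth feature T12 and the thirteenth feature T13 to obtain the eighth feature T8.
Step ②. Process the eighth feature T8 through the feature fusion module 92 to obtain the fourteenth feature T14.
Illustratively, referring to FIG. 9, the feature fusion module 92 may include a temporal attention unit 921, a feature fusion unit 922 and a spatial attention unit 923.
Step ③. Reconstruct the fourteenth feature T14 through the reconstruction module 93 to obtain the third image block B3.
Step ④. Upsample the third image block B3 through the upsampling module 94 to obtain the super-resolution image block corresponding to the image block of the t-th image frame in the image block sequence.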
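Based on the same inventive concept, as an implementation of the above method, an embodiment of the present disclosure further provides a video super-resolution device. This device embodiment corresponds to the foregoing method embodiments. For ease of reading, this device embodiment does not repeat the details of the foregoing method embodiments one by one, but it should be clear that the video super-resolution device in this embodiment can correspondingly implement all the contents of the foregoing method embodiments.
An embodiment of the present disclosure provides a video super-resolution device. FIG. 10 is a schematic structural diagram of the video super-resolution device. As shown in FIG. 10, the video super-resolution device 100 includes:
an image decomposition module 101, configured to decompose the t-th image frame of a video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each, where t and N are both positive integers;
a sequence generation module 102, configured to generate N image block sequences from the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames, where the image blocks in each image block sequence are located at the same position in different image frames;
a parameter calculation module 103, configured to calculate the motion parameter of each image block sequence, where the motion parameter of any image block sequence characterizes the optical flow between the image blocks of adjacent image frames in that image block sequence;
a model determination module 104, configured to determine, according to the motion parameter of each image block sequence, the super-resolution network model corresponding to each image block sequence;
an image super-resolution module 105, configured to super-resolve the image block of the t-th image frame in each image block sequence with the super-resolution network model corresponding to that image block sequence, to obtain the super-resolution image blocks of the t-th image frame;
an image generation module 106, configured to generate the super-resolution image frame of the t-th image frame from the super-resolution image blocks of the t-th image frame.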
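As an optional implementation of the embodiments of the present disclosure, the parameter calculation module 103 is specifically configured to: for each image block sequence, calculate the optical flow between the image blocks of each pair of adjacent image frames in the image block sequence; for the optical flow between the image blocks of each pair of adjacent image frames, calculate the average of the absolute values of the optical flow at each pixel to obtain the motion parameter between the image blocks of those adjacent image frames; and obtain the motion parameter of the image block sequence from the motion parameters between the image blocks of adjacent image frames in the image block sequence.
As an optional implementation of the embodiments of the present disclosure, the neighborhood image frames of the t-th image frame include:
the (t-2)-th image frame, the (t-1)-th image frame, the (t+1)-th image frame and the (t+2)-th image frame of the video to be super-resolved.
As an optional implementation of the embodiments of the present disclosure, the model determination module 104 is specifically configured to: for each image block sequence, determine whether the first motion parameter and the second motion parameter of the image block sequence are both smaller than a preset threshold, where the first motion parameter is the motion parameter between the image block of the t-th image frame and the image block of the (t-1)-th image frame, and the second motion parameter is the motion parameter between the image block of the t-th image frame and the image block of the (t+1)-th image frame; if the first motion parameter and the second motion parameter are both smaller than the preset threshold, determine that the super-resolution network model corresponding to the image block sequence is the first super-resolution network model; if the first motion parameter and/or the second motion parameter is greater than or equal to the preset threshold, determine whether the third motion parameter and the fourth motion parameter of the image block sequence are both smaller than the preset threshold, where the third motion parameter is the motion parameter between the image block of the (t-2)-th image frame and the image block of the (t-1)-th image frame, and the fourth motion parameter is the motion parameter between the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame; if the third motion parameter and the fourth motion parameter are both smaller than the preset threshold, determine that the super-resolution network model corresponding to the image block sequence is the second super-resolution network model; and if the third motion parameter and/or the fourth motion parameter is greater than or equal to the preset threshold, determine that the super-resolution network model corresponding to the image block sequence is the third super-resolution network model.
As an optional implementation of the embodiments of the present disclosure,
the first super-resolution network model is a single-frame super-resolution network model;
the second super-resolution network model is used to super-resolve the image block of the t-th image frame based on the image block of the (t-1)-th image frame, the image block of the t-th image frame and the image block of the (t+1)-th image frame;
the third super-resolution network model is used to super-resolve the image block of the t-th image frame based on all image blocks in the image block sequence.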
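As an optional implementation of the embodiments of the present disclosure, the image super-resolution module 105 is specifically configured to: process the image block of the t-th image frame through a pyramid, cascading and deformable convolution (PCD) alignment module to obtain the first feature; process the first feature through the feature fusion module to obtain the second feature, where the second feature is a feature obtained by splicing five first features in the channel dimension; reconstruct the second feature through the reconstruction module to obtain the first image block; and upsample the first image block through the upsampling module to obtain the super-resolution image block corresponding to the image block of the t-th image frame.
As an optional implementation of the embodiments of the present disclosure, the image super-resolution module 105 is specifically configured to: process the image block of the (t-1)-th image frame, the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module to obtain the third feature, where the third feature is a feature obtained by splicing the fourth feature, the fifth feature and the sixth feature in the channel dimension, the fourth feature is a feature obtained by processing the image block of the (t-1)-th image frame and the image block of the t-th image frame through the PCD alignment module, the fifth feature is a feature obtained by processing the image block of the t-th image frame through the PCD alignment module, and the sixth feature is a feature obtained by processing the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module; process the third feature through the feature fusion module to obtain the seventh feature, where the seventh feature is a feature obtained by splicing the fourth feature, the third feature and the fifth feature in the channel dimension; reconstruct the seventh feature through the reconstruction module to obtain the second image block; and upsample the second image block through the upsampling module to obtain the super-resolution image block corresponding to the image block of the t-th image frame.
As an optional implementation of the embodiments of the present disclosure, the image super-resolution module 105 is specifically configured to: process all image blocks in the image block sequence through the PCD alignment module to obtain the eighth feature, where the eighth feature is a feature obtained by splicing the ninth feature, the tenth feature, the eleventh feature, the twelfth feature and the thirteenth feature in the channel dimension, the ninth feature is a feature obtained by processing the image block of the (t-2)-th image frame and the image block of the (t-1)-th image frame through the PCD alignment module, the tenth feature is a feature obtained by processing the image block of the (t-1)-th image frame and the image block of the t-th image frame through the PCD alignment module, the eleventh feature is a feature obtained by processing the image block of the t-th image frame through the PCD alignment module, the twelfth feature is a feature obtained by processing the image block of the t-th image frame and the image block of the (t+1)-th image frame through the PCD alignment module, and the thirteenth feature is a feature obtained by processing the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame through the PCD alignment module; process the eighth feature through the feature fusion module to obtain the fourteenth feature; reconstruct the fourteenth feature through the reconstruction module to obtain the third image block; and upsample the third image block through the upsampling module to obtain the super-resolution image block corresponding to the image block of the t-th image frame in the image block sequence.
As an optional implementation of the embodiments of the present disclosure, adjacent image blocks among the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames have an overlapping area;
the image generation module 106 is specifically configured to splice the super-resolution image blocks of the t-th image frame to generate a spliced image, and to set the pixel value of each pixel in the overlapping areas of the super-resolution image blocks in the spliced image to the average of the pixel values of the corresponding pixels in those super-resolution image blocks, to generate the super-resolution image frame of the t-th image frame.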
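The above modules may be implemented as software components executing on one or more general-purpose processors, or as hardware that performs certain functions or combinations thereof, such as programmable logic devices and/or application-specific integrated circuits. In some embodiments, these modules may be embodied in the form of a software product, which may be stored in a non-volatile storage medium that includes instructions enabling a computer device (for example, a personal computer, server, network device or mobile terminal) to implement the methods described in the embodiments of the present disclosure. In other embodiments, the above modules may be implemented on a single device or distributed across multiple devices. The functions of these modules may be combined with one another or further split into multiple sub-modules.
The video super-resolution device provided by this embodiment can perform the video super-resolution method provided by the above method embodiments; its implementation principles and technical effects are similar and are not repeated here.
Based on the same inventive concept, an embodiment of the present disclosure further provides an electronic device. FIG. 11 is a schematic structural diagram of the electronic device provided by an embodiment of the present disclosure. As shown in FIG. 11, the electronic device provided by this embodiment includes a memory 111 and a processor 112, where the memory 111 is configured to store a computer program, and the processor 112 is configured to perform the video super-resolution method provided by the above embodiments when calling the computer program.
Based on the same inventive concept, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the computing device to implement the video super-resolution method provided by the above embodiments.
Based on the same inventive concept, an embodiment of the present disclosure further provides a computer program product which, when run on a computer, causes the computing device to implement the video super-resolution method provided by the above embodiments.
Based on the same inventive concept, an embodiment of the present disclosure further provides a computer program including instructions which, when executed by a processor, cause the processor to perform the video super-resolution method provided by the above embodiments.
Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media containing computer-usable program code.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable storage media. A storage medium may store information by any method or technology, and the information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present disclosure, not to limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present disclosure.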
Claims (16)
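- A video super-resolution method, comprising: decomposing the t-th image frame of a video to be super-resolved and the neighborhood image frames of the t-th image frame into N image blocks each, wherein t and N are both positive integers; generating N image block sequences from the image blocks obtained by decomposing the t-th image frame and the neighborhood image frames, wherein the image blocks in each image block sequence are located at the same position in different image frames; calculating the motion parameter of each image block sequence, wherein the motion parameter of any image block sequence characterizes the optical flow between the image blocks of adjacent image frames in that image block sequence; determining, according to the motion parameter of each image block sequence, the super-resolution network model corresponding to each image block sequence; super-resolving the image block of the t-th image frame in each image block sequence with the super-resolution network model corresponding to that image block sequence, to obtain the super-resolution image blocks of the t-th image frame; and generating the super-resolution image frame of the t-th image frame from the super-resolution image blocks of the t-th image frame.
- The video super-resolution method according to claim 1, wherein calculating the motion parameter of each image block sequence comprises: for each image block sequence, calculating the optical flow between the image blocks of each pair of adjacent image frames in the image block sequence; for the optical flow between the image blocks of each pair of adjacent image frames, calculating the average of the absolute values of the optical flow at each pixel to obtain the motion parameter between the image blocks of those adjacent image frames; and obtaining the motion parameter of the image block sequence from the motion parameters between the image blocks of adjacent image frames in the image block sequence.
- The video super-resolution method according to claim 1 or 2, wherein the neighborhood image frames of the t-th image frame comprise: the (t-2)-th image frame, the (t-1)-th image frame, the (t+1)-th image frame and the (t+2)-th image frame.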
- 根据权利要求3所述的视频的超分辨率方法,其中,所述根据各个图像块序列的运动参数确定各个图像块序列对应的超分网络模型,包括:针对每一个图像块序列,确定所述图像块序列的第一运动参数和第二运动参数是否均小于预设阈值,其中,所述第一运动参数为所述第t个图像帧的图像块与所述第t-1个图像帧的图像块之间的运动参数,所述第二运动参数为所述第t个图像帧的图像块与所述第t+1个图像帧的图像块之间的运动参数;在所述第一运动参数和所述第二运动参数均小于所述预设阈值的情况下,确定所述图像块序列对应的超分网络模型为第一超分网络模型;在所述第一运动参数和/或所述第二运动参数大于或等于所述预设阈值的情况下,确定所述图像块序列的第三运动参数和第四运动参数是否均小于预设阈值,其中,所述第三运动参数为所述第t-2个图像帧的图像块与所述第t-1个图像帧的图像块之间的运动参数,所述第四运动参数为所述第t+1个图像帧的图像块与所述第t+2个图像帧的图像块之间的运动参数;在所述第三运动参数和所述第四运动参数均小于所述预设阈值的情况下,确定所述图像块序列对应的超分网络模型为第二超分网络模型;在所述第三运动参数和/或所述第四运动参数大于或等于所述预设阈值的情况下,确定所述图像块序列对应的超分网络模型为第三超分网络模型。
- 根据权利要求4所述的视频的超分辨率方法,其中,所述第一超分网络模型为单帧超分网络模型;所述第二超分网络模型用于基于所述第t-1个图像帧的图像块、所述第t个图像帧的图像块以及所述第t+1个图像帧的图像块,对所述第t个图像帧的图像块进行超分;所述第三超分网络模型用于基于图像块序列中的所有图像块对所述第t个图像帧的图像块进行超分。
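- The video super-resolution method according to claim 5, wherein super-resolving the image block of the t-th image frame by the first super-resolution network model comprises: processing the image block of the t-th image frame by a cascading deformable convolution (PCD) alignment module to obtain a first feature; processing the first feature by a feature fusion module to obtain a second feature, the second feature being a feature obtained by concatenating a plurality of the first features in the channel dimension; reconstructing the second feature by a reconstruction module to obtain a first image block; and upsampling the first image block by an upsampling module to obtain the super-resolved image block corresponding to the image block of the t-th image frame.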
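- The video super-resolution method according to claim 6, wherein the second feature is a feature obtained by concatenating five of the first features in the channel dimension.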
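- The video super-resolution method according to any one of claims 5-7, wherein super-resolving the image block of the t-th image frame by the second super-resolution network model comprises: processing the image block of the (t-1)-th image frame, the image block of the t-th image frame, and the image block of the (t+1)-th image frame by a PCD alignment module to obtain a third feature, wherein the third feature is a feature obtained by concatenating a fourth feature, a fifth feature, and a sixth feature in the channel dimension, the fourth feature is a feature obtained by processing the image block of the (t-1)-th image frame and the image block of the t-th image frame by the PCD alignment module, the fifth feature is a feature obtained by processing the image block of the t-th image frame by the PCD alignment module, and the sixth feature is a feature obtained by processing the image block of the t-th image frame and the image block of the (t+1)-th image frame by the PCD alignment module; processing the third feature by a feature fusion module to obtain a seventh feature, wherein the seventh feature is a feature obtained by concatenating the fourth feature, the third feature, and the fifth feature in the channel dimension; reconstructing the seventh feature by a reconstruction module to obtain a second image block; and upsampling the second image block by an upsampling module to obtain the super-resolved image block corresponding to the image block of the t-th image frame.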
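- The video super-resolution method according to any one of claims 5-8, wherein super-resolving the image block of the t-th image frame in an image block sequence by the third super-resolution network model comprises: processing all the image blocks in the image block sequence by a PCD alignment module to obtain an eighth feature, wherein the eighth feature is a feature obtained by concatenating a ninth feature, a tenth feature, an eleventh feature, a twelfth feature, and a thirteenth feature in the channel dimension, the ninth feature is a feature obtained by processing the image block of the (t-2)-th image frame and the image block of the (t-1)-th image frame by the PCD alignment module, the tenth feature is a feature obtained by processing the image block of the (t-1)-th image frame and the image block of the t-th image frame by the PCD alignment module, the eleventh feature is a feature obtained by processing the image block of the t-th image frame by the PCD alignment module, the twelfth feature is a feature obtained by processing the image block of the t-th image frame and the image block of the (t+1)-th image frame by the PCD alignment module, and the thirteenth feature is a feature obtained by processing the image block of the (t+1)-th image frame and the image block of the (t+2)-th image frame by the PCD alignment module; processing the eighth feature by a feature fusion module to obtain a fourteenth feature; reconstructing the fourteenth feature by a reconstruction module to obtain a third image block; and upsampling the third image block by an upsampling module to obtain the super-resolved image block corresponding to the image block of the t-th image frame in the image block sequence.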
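- The video super-resolution method according to any one of claims 1-9, wherein adjacent image blocks among the image blocks obtained by decomposing the t-th image frame and the neighboring image frames have overlapping regions, and generating the super-resolved image frame of the t-th image frame from the super-resolved image blocks of the t-th image frame comprises: stitching the super-resolved image blocks of the t-th image frame to generate a stitched image; and setting the pixel value of each pixel in the overlapping regions of the super-resolved image blocks in the stitched image to the average of the pixel values of the corresponding pixels in those super-resolved image blocks, to generate the super-resolved image frame of the t-th image frame.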
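- The video super-resolution method according to any one of claims 1-10, wherein decomposing the t-th image frame of the video to be super-resolved into N image blocks comprises: using a sampling window whose size is the size of one image block, starting from the first pixel of the t-th image frame, sliding the window by a preset stride to sample each position of the t-th image frame, and taking each sampled region of the sampling window as one image block, to obtain N image blocks.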
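- A video super-resolution apparatus, comprising: an image decomposition module, configured to decompose a t-th image frame of a video to be super-resolved and the neighboring image frames of the t-th image frame each into N image blocks, wherein t and N are both positive integers; a sequence generation module, configured to generate N image block sequences from the image blocks obtained by decomposing the t-th image frame and the neighboring image frames, wherein the image blocks in each image block sequence are located at the same position in different image frames; a parameter calculation module, configured to calculate a motion parameter of each image block sequence, wherein the motion parameter of any image block sequence characterizes the optical flow between the image blocks of adjacent image frames in that image block sequence; a model determination module, configured to determine, according to the motion parameter of each image block sequence, a super-resolution network model corresponding to each image block sequence; an image super-resolution module, configured to super-resolve the image block of the t-th image frame in each image block sequence using the super-resolution network model corresponding to that image block sequence, to obtain the super-resolved image blocks of the t-th image frame; and an image generation module, configured to generate a super-resolved image frame of the t-th image frame from the super-resolved image blocks of the t-th image frame.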
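- An electronic device, comprising: a memory and a processor; the memory is configured to store instructions; and the processor is configured to, when executing the instructions, cause the electronic device to implement the video super-resolution method according to any one of claims 1-11.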
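- A computer-readable storage medium storing instructions which, when executed by a processor, implement the video super-resolution method according to any one of claims 1-11.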
- A computer program product which, when run on a computer, causes the computer to implement the video super-resolution method according to any one of claims 1-11.
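- A computer program, comprising: instructions which, when executed by a processor, cause the processor to perform the video super-resolution method according to any one of claims 1-11.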
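The non-network steps of claims 2, 4, 10 and 11 can be illustrated with a minimal pure-Python sketch. It is not the applicant's implementation. All function names and the model labels are hypothetical. The sketch assumes that the frame size fits the sliding-window grid exactly (no padding), that optical flow is supplied by the caller as per-pixel `(dx, dy)` pairs, that "average of the absolute values" means the mean over the absolute flow components, and that patch positions passed to the merge step are already expressed in output-frame coordinates.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]
Patch = Tuple[int, int, Image]  # (top, left, pixel rows)


def split_into_patches(frame: Image, size: int, stride: int) -> List[Patch]:
    """Claim 11: slide a size x size window from the first pixel by a preset stride."""
    height, width = len(frame), len(frame[0])
    patches = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            rows = [row[left:left + size] for row in frame[top:top + size]]
            patches.append((top, left, rows))
    return patches


def motion_parameter(flow: Sequence[Sequence[Tuple[float, float]]]) -> float:
    """Claim 2: mean absolute optical flow over the pixels of one pair of blocks.

    Assumption: the mean is taken over the absolute x and y components.
    """
    values = [abs(component) for row in flow for vector in row for component in vector]
    return sum(values) / len(values)


def select_model(m_t_prev: float, m_t_next: float,
                 m_prev_pair: float, m_next_pair: float,
                 threshold: float) -> str:
    """Claim 4: choose a model from the four motion parameters.

    m_t_prev: t vs t-1, m_t_next: t vs t+1,
    m_prev_pair: t-2 vs t-1, m_next_pair: t+1 vs t+2.
    """
    if m_t_prev < threshold and m_t_next < threshold:
        return "first_model_single_frame"
    if m_prev_pair < threshold and m_next_pair < threshold:
        return "second_model_three_frames"
    return "third_model_five_frames"


def merge_patches(patches: Sequence[Patch], height: int, width: int) -> Image:
    """Claim 10: stitch patches and average pixel values where patches overlap."""
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for top, left, rows in patches:
        for dy, row in enumerate(rows):
            for dx, value in enumerate(row):
                total[top + dy][left + dx] += value
                count[top + dy][left + dx] += 1
    # Pixels covered by no patch are left at 0.0.
    return [[total[y][x] / count[y][x] if count[y][x] else 0.0
             for x in range(width)] for y in range(height)]
```

In a full pipeline the chosen label would select one of three trained networks, and each patch would be super-resolved before `merge_patches` is called with positions and sizes scaled by the upsampling factor.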
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210265124.7 | 2022-03-17 | ||
CN202210265124.7A CN116797452A (zh) | 2022-03-17 | 2022-03-17 | Video super-resolution method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023174416A1 (zh) | 2023-09-21 |
Family
ID=88022419
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/082228 WO2023174416A1 (zh) | Video super-resolution method and apparatus | 2022-03-17 | 2023-03-17 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN116797452A (zh) |
WO (1) | WO2023174416A1 (zh) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100034477A1 (en) * | 2008-08-06 | 2010-02-11 | Sony Corporation | Method and apparatus for providing higher resolution images in an embedded device |
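CN103632359A (zh) * | 2013-12-13 | 2014-03-12 | 清华大学深圳研究生院 | Video super-resolution processing method |
CN111489292A (zh) * | 2020-03-04 | 2020-08-04 | 北京思朗科技有限责任公司 | Super-resolution reconstruction method and apparatus for video streams |
CN112950471A (zh) * | 2021-02-26 | 2021-06-11 | 杭州朗和科技有限公司 | Video super-resolution processing method and apparatus, super-resolution reconstruction model, and medium |
CN113592709A (zh) * | 2021-02-19 | 2021-11-02 | 腾讯科技(深圳)有限公司 | Image super-resolution processing method, apparatus, device, and storage medium |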
- 2022-03-17: CN application CN202210265124.7A, published as CN116797452A (zh), status: active, Pending
- 2023-03-17: WO application PCT/CN2023/082228, published as WO2023174416A1 (zh), status: active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN116797452A (zh) | 2023-09-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23769918; Country of ref document: EP; Kind code of ref document: A1 |
 | WWE | WIPO information: entry into national phase | Ref document number: 2024539756; Country of ref document: JP |
 | WWE | WIPO information: entry into national phase | Ref document number: 2023769918; Country of ref document: EP |
 | ENP | Entry into the national phase | Ref document number: 2023769918; Country of ref document: EP; Effective date: 20241017 |