Background Art
Digital video consumes a large amount of storage and transmission capacity. Generally speaking, video compression technology reduces the bit rate of digital video by compressing the information redundancy in the spatial and temporal dimensions. Inter-frame coding compresses a predicted frame with reference to preceding and subsequent frames, thereby eliminating redundancy in the temporal dimension.
In a block-based video codec system, compression coding is used to reduce the bit rate of digital video, and decompression is the inverse process of compression. Inter-frame compression compresses frames with reference to one or more preceding or following frames (the frame currently being compressed or decompressed is generally called a predicted frame, P frame or B frame, and the frame used for reference is called a reference frame).
In these systems, inter-frame compression uses block-based motion-compensated predictive coding, followed by coding of the residual. The P-frame and B-frame coding techniques have the following in common: motion estimation yields a motion vector; inter prediction based on motion compensation is then performed in units of macroblocks or sub-macroblocks; the residual block produced by inter prediction undergoes a two-dimensional transform; the transform-domain coefficients are quantized; and finally the result is entropy coded into the bitstream. Linear filters with various numbers of taps are used to interpolate pixel values at sub-pixel positions, which improves the accuracy of motion estimation and yields lower residual energy during motion compensation, thereby improving video compression efficiency. Typical interpolation precisions are 1/2 and 1/4 pixel for luma and 1/8 pixel for chroma. The 1/8-pixel chroma precision is a theoretical value, because the chroma motion vector is derived from the vector of the corresponding luma pixel position; only the luma interpolation process is therefore explained in detail.
Fig. 1 identifies all possible sub-pixel positions. The shaded boxes A to D represent integer pixels, and the blank boxes a, b1, b2, c1, c2, d1 to d4, e1 to e4, s and m represent interpolated pixels, i.e. sub-pixels.
How each sub-pixel value of Fig. 1 is obtained is explained below for a typical implementation, with reference to Fig. 2. Fig. 2 shows the example sub-pixel positions p1, p2 and a, whose values are interpolated from the integer pixel positions A to T. The six-tap filters used for these three positions are:
b1 = ((O - 5G + 20B + 20A - 5L + T) + 16) >> 5    (1)
b2 = ((N - 5F + 20B + 20C - 5I + Q) + 16) >> 5    (2)
The positions s, m and a0 to a7 are interpolated in the same way as b1 and b2.
a = ((a1 - 5a0 + 20b1 + 20m - 5a2 + a3) + 16) >> 5, or
a = ((a5 - 5a4 + 20b2 + 20s - 5a6 + a7) + 16) >> 5    (3)
In the three formulas above, adding 16 and then shifting right by 5 bits performs rounding. If p1 to p3 and a0 to a7 are not rounded beforehand but are used directly in formula (3), formula (3) becomes:
a = ((a1 - 5a0 + 20b1 + 20m - 5a2 + a3) + 512) >> 10, or
a = ((a5 - 5a4 + 20b2 + 20s - 5a6 + a7) + 512) >> 10    (4)
If integer pixel values are represented as integers in the range 0 to 255, sub-pixel values should likewise be represented as integers in the range 0 to 255. The values obtained from formulas (1) to (4) should therefore be limited to this range, which a clamp operation achieves:
Clip(0, 255, x) = 0, if x < 0;
                = 255, if x > 255;
                = x, otherwise.
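The rounding and clamping described above can be sketched as follows. This is a minimal illustration, assuming the six taps are passed in filter order with coefficients (1, -5, 20, 20, -5, 1); the function names clip, six_tap, half_pel and centre_from_intermediates are hypothetical and do not come from the source.

```python
def clip(lo, hi, x):
    """Clamp x to the closed range [lo, hi], as in Clip(0, 255, x)."""
    return lo if x < lo else hi if x > hi else x


def six_tap(p0, p1, p2, p3, p4, p5):
    """Unrounded six-tap sum with coefficients (1, -5, 20, 20, -5, 1)."""
    return p0 - 5 * p1 + 20 * p2 + 20 * p3 - 5 * p4 + p5


def half_pel(p0, p1, p2, p3, p4, p5):
    """Formulas (1) and (2): add 16, shift right by 5 bits, then clamp."""
    return clip(0, 255, (six_tap(p0, p1, p2, p3, p4, p5) + 16) >> 5)


def centre_from_intermediates(t0, t1, t2, t3, t4, t5):
    """Formula (4): the inputs are unrounded six-tap sums (scaled by 32),
    so the rounding offset is 512 and the shift is 10 bits."""
    return clip(0, 255, (six_tap(t0, t1, t2, t3, t4, t5) + 512) >> 10)
```

The coefficients sum to 32, which is why a single stage divides by 2^5 and two cascaded unrounded stages divide by 2^10.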
Besides the above, there are several other international standards related to video compression and decompression. Although the details of their compression techniques generally differ, they all use filters with various numbers of taps under given predetermined coefficients, and obtain the interpolation result from the pixel values adjacent to the sub-pixel.
The purpose of sub-pixel interpolation is precisely to improve the accuracy of motion prediction and compensation, and the specific interpolation method determines how well the interpolated sub-pixel values fit. In general, the better the fit of the interpolation algorithm, the higher the accuracy of the motion estimation and compensation based on it. In addition, the computational complexity of the interpolation algorithm must not increase considerably as its fit improves; otherwise the method and system have little general practical value and are useful only where a very high compression ratio is required and the computational budget is very generous.
The existing sub-pixel interpolation techniques used in motion estimation and compensation, as described above, therefore have several significant drawbacks:
(1) The fit of the interpolation method is strongly limited by the video content. Sub-pixel interpolation uses a linear filter with predetermined coefficients; once the interpolation method is implemented, its coefficients are fixed and cannot change. The video content handled by digital video technology, however, varies endlessly, and its statistical models are extremely rich. In information-theoretic terms, a statistical model can fit the data generated by its own data source well, while its fit to data with different statistical features may drop sharply. A fixed-coefficient linear filter that fits all video content reasonably well must therefore leave considerable room for improvement on video content whose salient features deviate substantially from the Gaussian statistical model.
(2) The filter inputs in the interpolation method are restricted to the pixels neighbouring the interpolated point in the horizontal or vertical direction, whereas video content is not restricted to those directions. A good video frame interpolation method should be transparent to direction in the plane. Given the way digital video content is represented, the diagonal neighbours can at least also be used as inputs to improve the fit.
(3) For the quarter-pixel sub-pixel a, the computational complexity rises sharply. In the typical implementation described above, b1 and b2 each need 6 multiplications, 6 additions and one shift, whereas interpolating a needs 42 multiplications, 36 additions and one shift, so the complexity is nearly 6 times higher. In some implementations the calculation is done twice, once from the vertical and once from the horizontal direction, which makes the complexity nearly 12 times higher.
Summary of the invention
The object of the present invention is to provide a video-content-adaptive sub-pixel interpolation method with good motion estimation matching, a high compression ratio and a small amount of computation. Another object of the present invention is to provide a video-content-adaptive sub-pixel interpolation device that implements the method.
The object of the present invention can be achieved by the following technical measures: a video-content-adaptive sub-pixel interpolation method, characterized by comprising the following steps:
(1) according to the distance between each input integer pixel position and the sub-pixel position to be interpolated, dividing the input integer pixels into three classes and obtaining the basic coefficients of a linear filter;
(2) correcting the basic coefficients of step (1) according to the distance between each input integer pixel value and the weighted mean of this interpolation filtering;
(3) interpolating the target sub-pixel position by filtering with the corrected basic coefficients.
For sub-pixel positions that have a quarter or three-quarter pixel offset in the horizontal or vertical direction, the filter coefficients are constant.
In step (1) of the present invention, the input integer pixels are divided into three classes as follows: those at the smallest distance from the sub-pixel position to be interpolated are first-class pixels, those at the next smallest distance are second-class pixels, and those at the largest distance are third-class pixels.
The basic coefficients of step (1) of the present invention satisfy the following two constraints:
(1)
where n denotes the class number of the input integer pixel (n = 1 for first-class pixels, n = 2 for second-class pixels, n = 3 for third-class pixels), σ denotes the variance, and d is the distance from the current input pixel position to the sub-pixel position to be interpolated;
(2) each W<i,j,0> is an integer approximating the value given by the preceding constraint, and the sum of all W<i,j,0> is a power of 2.
The weighted mean of step (2) of the present invention is: (∑ Co<i,j> * L<i,j> + 1/2 * ∑ Co<i,j>) / ∑ Co<i,j>  (i, j traverse the input pixel positions)
where L<i,j> is the input pixel value and Co<i,j> is the weighted mean coefficient.
The weighted mean coefficients Co<i,j> of the present invention satisfy a Gaussian model.
In step (2) of the present invention, when the distance between an input integer pixel value and the weighted mean of this interpolation filtering is greater than or equal to the distance threshold, the basic coefficients of input integer pixels of the first, second and third classes are corrected to 1/9 * W<i,j,0>, 1/3 * W<i,j,0> and 0 respectively, where W<i,j,0> is the basic coefficient before correction; when that distance is less than the distance threshold, the basic coefficients of input integer pixels of all classes remain unchanged.
The distance threshold of the present invention is set to 16.
The interpolation filter output of step (3) of the present invention is: (∑ W<i,j,1> * L<i,j> + 1/2 * ∑ W<i,j,0>) / ∑ W<i,j,0>  (i, j traverse the input pixel positions)
where L<i,j> is the input pixel value, and W<i,j,0> and W<i,j,1> are the basic coefficients before and after correction respectively.
A video-content-adaptive sub-pixel interpolation device implementing the above method comprises a filter coefficient generator for generating the coefficients of the current interpolation, an interpolation filter, and a sub-pixel position interpolation result memory. The sub-pixel position to be interpolated and the input integer pixel matrix are fed to the filter coefficient generator, which generates the filter coefficients of the current interpolation according to the interpolation position; the output filter coefficients, together with the input integer pixel matrix, are fed to the interpolation filter for interpolation filtering; and the filtered result is stored in the sub-pixel position interpolation result memory and finally output.
Compared with the prior art, the present invention has the following advantages:
1. The interpolation fit does not change sharply as the video content changes. The method of the present invention adapts efficiently to the statistical features of the video content, so that the motion estimation matching and the compression ratio remain at a stable, favourable level even for richly varying video content; at the same time, the computational complexity does not increase sharply.
2. The interpolation method of the present invention is a two-dimensional interpolation that is not restricted to the horizontal or vertical direction, and therefore achieves a better interpolation fit.
3. For the sub-pixel position a, the computational complexity does not increase sharply compared with the sub-pixel positions b1 and b2, and it is lower than that of the prior art at the sub-pixel position a.
4. When the present invention is applied in a decoder system that performs interpolated motion compensation in units of 4x4 pixel blocks, the reference frame block fetched each time is 7x7, i.e. 49 pixels, which is smaller than the 9x9 reference frame block (81 pixels) required by the prior art; the amount of transferred data is reduced by (81 - 49)/81, roughly 39.5%. If the invention is applied in a codec chip that fetches reference frame data by DMA, the DMA transfer volume is reduced by the same proportion and efficiency improves considerably.
Embodiment
The method of the present invention suitably reduces the weights of those input pixels that deviate substantially from the mean of the filter inputs, which reduces the oscillation they cause in the linear filter and thereby improves the fit of the output value. Specifically: the basic coefficients of the linear filter are given according to the correlation determined by the distance to the neighbouring sub-pixel point; the basic coefficients are then corrected by a statistical method that adapts to the video content; and the target sub-pixel position is interpolated with the filter using the corrected coefficients.
For the five sub-pixels a, b1, b2, s and m in Fig. 1, the filter coefficients used for interpolation change with the video content. Fig. 3, Fig. 4 and Fig. 5 are schematic diagrams of the interpolation of sub-pixels a, b1 and b2 respectively. The specific steps of the method are:
(1) According to the distance between each input integer pixel position and the sub-pixel position to be interpolated, the input integer pixels are divided into three classes and the basic coefficients of the linear filter are obtained. The input integer pixels are classified as follows: those at the smallest distance from the sub-pixel position to be interpolated are first-class pixels, those at the next smallest distance are second-class pixels, and those at the largest distance are third-class pixels.
When interpolating the three sub-pixels a, b1 and b2, the pixels are classified as shown in Table 1.
| Sub-pixel interpolated | First-class pixels | Second-class pixels | Third-class pixels |
|---|---|---|---|
| a | A, B, C, D | E, F, G, H, I, J, K, L | U, V, W, X |
| b1 | A, B | E, F, C, D | G, L |
| b2 | B, C | A, G, H, D | F, I |
Table 1. Pixel classification when interpolating the three sub-pixels a, b1 and b2
The basic coefficient matrix W<i,j,0> can be obtained by off-line statistical training. The video database used for training should not be confined to a few specific video density statistical models, but should fully cover video content with a wide variety of statistical features. The basic coefficient matrix gives filter coefficients with a stable interpolation fit for all video content. The initial values used to train the basic coefficient matrix should satisfy a Gaussian mixture model constraint; that is, the distance d between an input integer pixel position and the sub-pixel position to be interpolated and the basic coefficient W<i,j,0> for that input integer pixel position should satisfy:
Here n denotes the class number of the input integer pixel (n = 1 for first-class pixels, n = 2 for second-class pixels, n = 3 for third-class pixels); σ denotes the variance, whose value is generally taken as the distance from a first-class pixel to the sub-pixel position to be interpolated; and d denotes the distance from the current input pixel position to the sub-pixel position to be interpolated.
To keep the computational complexity in practical applications as low as possible, the basic coefficient matrix W<i,j,0> should satisfy a second constraint: all coefficients are integers chosen to approximate the values given by the preceding constraint, and the sum of all coefficients is a power of 2.
For example, following the two constraints above, the coefficient matrices obtained by training on a video database and subsequent optimization may be as follows:
When interpolating sub-pixel position a, the input integer pixel matrix is [V, F, E, U; G, B, A, L; H, C, D, K; W, I, J, X], and the basic coefficient matrix W<i,j,0> is [1 or 2, -9 or -8, -9 or -8, 1 or 2; -9 or -8, 81 or 64, 81 or 64, -9 or -8; -9 or -8, 81 or 64, 81 or 64, -9 or -8; 1 or 2, -9 or -8, -9 or -8, 1 or 2].
When interpolating sub-pixel position b1, the input integer pixel matrix is [*, F, E, *; G, B, A, L; *, C, D, *] (* denotes no input), and the basic coefficient matrix W<i,j,0> is [0, -9 or -8, -9 or -8, 0; 1 or 2, 81 or 64, 81 or 64, 1 or 2; 0, -9 or -8, -9 or -8, 0].
When interpolating sub-pixel position b2, the input integer pixel matrix is [*, G, H, *; F, B, C, I; *, A, D, *] (* denotes no input), and the basic coefficient matrix W<i,j,0> is [0, -9 or -8, -9 or -8, 0; 1 or 2, 81 or 64, 81 or 64, 1 or 2; 0, -9 or -8, -9 or -8, 0].
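As a quick check of the power-of-2 constraint, the example matrices can be written out and summed. The sketch below assumes the first alternative of each pair listed above (1, -9 and 81); the names W0_A, W0_B, coefficient_sum and is_power_of_two are illustrative only.

```python
# Basic coefficient matrix for position a (inputs V,F,E,U; G,B,A,L; H,C,D,K; W,I,J,X).
W0_A = [
    [1, -9, -9, 1],
    [-9, 81, 81, -9],
    [-9, 81, 81, -9],
    [1, -9, -9, 1],
]

# Basic coefficient matrix shared by b1 and b2; 0 marks positions with no input.
W0_B = [
    [0, -9, -9, 0],
    [1, 81, 81, 1],
    [0, -9, -9, 0],
]


def coefficient_sum(matrix):
    """Sum of all coefficients, i.e. the normalising divisor of the filter."""
    return sum(sum(row) for row in matrix)


def is_power_of_two(n):
    """True when n is a positive power of 2, so division becomes a bit shift."""
    return n > 0 and n & (n - 1) == 0
```

With these values the sums are 256 for position a and 128 for b1 and b2, so normalisation reduces to right shifts by 8 and 7 bits respectively.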
(2) According to the distance between each input integer pixel value and the weighted mean of this interpolation filtering, the basic coefficients of step (1) are corrected.
The weighted mean coefficient matrix Co<i,j> is obtained in a way similar to the basic coefficient matrix; that is, the distance d between an input integer pixel position and the sub-pixel position to be interpolated and the weighted mean coefficient Co<i,j> for that input integer pixel position should satisfy a Gaussian model:
Here σ denotes the variance, whose value is generally taken as the distance from a first-class pixel to the sub-pixel position to be interpolated, and d denotes the distance from the current input pixel position to the sub-pixel position to be interpolated.
For example, following this constraint, the weighted mean coefficient matrices obtained by off-line training and optimization may be as follows:
When interpolating sub-pixel position a, the weighted mean coefficient matrix Co<i,j> is [1, 11, 11, 1; 11, 105, 105, 11; 11, 105, 105, 11; 1, 11, 11, 1].
When interpolating sub-pixel positions b1 and b2, the weighted mean coefficient matrix Co<i,j> is the same for both: [0, 11, 11, 0; 1, 105, 105, 1; 0, 11, 11, 0].
The weighted mean of this interpolation filtering is given by:
MO = (∑ Co<i,j> * L<i,j> + 1/2 * ∑ Co<i,j>) / ∑ Co<i,j>  (i, j traverse the input pixel positions)
For example, applying the weighted mean coefficient matrices given above, the weighted mean can be computed as follows.
When interpolating sub-pixel position a:
MO = (V + 11F + 11E + U + 11G + 105B + 105A + 11L + 11H + 105C + 105D + 11K + W + 11I + 11J + X + 256) >> 9.
When interpolating sub-pixel position b1:
MO = (11F + 11E + G + 105B + 105A + L + 11C + 11D + 128) >> 8.
When interpolating sub-pixel position b2:
MO = (11G + 11H + F + 105B + 105C + I + 11A + 11D + 128) >> 8.
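The weighted mean MO can be sketched as follows, using the example Co<i,j> matrices given above. The function name weighted_mean and the nested-list layout are assumptions for illustration; positions marked as having no input carry a coefficient of 0, so any placeholder value can be stored there.

```python
# Weighted mean coefficients for position a (sum 512) and for b1/b2 (sum 256).
CO_A = [
    [1, 11, 11, 1],
    [11, 105, 105, 11],
    [11, 105, 105, 11],
    [1, 11, 11, 1],
]
CO_B = [
    [0, 11, 11, 0],
    [1, 105, 105, 1],
    [0, 11, 11, 0],
]


def weighted_mean(pixels, co):
    """MO = (sum(Co * L) + sum(Co) / 2) / sum(Co), in integer arithmetic.

    Co and the pixel values are non-negative, so floor division by the
    power-of-2 coefficient sum matches the right shifts in the text.
    """
    total = sum(sum(row) for row in co)
    acc = sum(c * p
              for co_row, px_row in zip(co, pixels)
              for c, p in zip(co_row, px_row))
    return (acc + total // 2) // total
```

For position a this is the same computation as adding 256 and shifting right by 9 bits; for b1 and b2 it corresponds to adding 128 and shifting right by 8 bits.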
When the distance H<i,j> = |L<i,j> - MO| between an input integer pixel value and the weighted mean of this interpolation filtering (i, j traverse the input pixel positions; L<i,j> denotes the input pixel value) is greater than or equal to the distance threshold Ht, the basic coefficients of input integer pixels of the first, second and third classes are corrected respectively to:
W<i,j,1> = 1/9 * W<i,j,0>;
W<i,j,1> = 1/3 * W<i,j,0>;
W<i,j,1> = 0.
When the distance H<i,j> between a pixel value L<i,j> and the weighted mean of the current inputs is below the threshold Ht:
W<i,j,1> = W<i,j,0>
(i, j denote the index position in the matrix; W<i,j,0> is the basic coefficient before correction).
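The correction rule can be sketched as below for position a. The sketch assumes the example coefficients 1, -9 and 81, for which the divisions by 9 and 3 are exact; the class matrix CLASS_A encodes Table 1 in the same layout as the input pixel matrix, and the names correct_coefficients, CLASS_A and HT are hypothetical.

```python
W0_A = [
    [1, -9, -9, 1],
    [-9, 81, 81, -9],
    [-9, 81, 81, -9],
    [1, -9, -9, 1],
]
# Pixel class per position: centre four are class 1, edges class 2, corners class 3.
CLASS_A = [
    [3, 2, 2, 3],
    [2, 1, 1, 2],
    [2, 1, 1, 2],
    [3, 2, 2, 3],
]
HT = 16  # empirical distance threshold


def correct_coefficients(pixels, w0, classes, mo, ht=HT):
    """Return W<i,j,1>: reduce the weight of inputs far from the weighted mean."""
    divisor = {1: 9, 2: 3}  # first-class: W/9, second-class: W/3
    w1 = []
    for px_row, w_row, cls_row in zip(pixels, w0, classes):
        new_row = []
        for pixel, w, cls in zip(px_row, w_row, cls_row):
            if abs(pixel - mo) < ht:
                new_row.append(w)                  # close to the mean: unchanged
            elif cls == 3:
                new_row.append(0)                  # third class: dropped
            else:
                new_row.append(w // divisor[cls])  # exact for 81/9 and -9/3
        w1.append(new_row)
    return w1
```

With other coefficient choices the divisions may not be exact, in which case a rounding rule would have to be chosen; the source does not specify one.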
The distance threshold of the present invention depends on the encoder parameters and on the edge sharpness of the video content being encoded. In an encoder with forward prediction only, one distance threshold is calculated for each I frame, and all subsequent P frames that reference this I frame use the same threshold. In an encoder with both forward and backward prediction, one distance threshold is calculated for each I frame, and the threshold used by a B frame should be derived from the thresholds of the I frames it references. The distance threshold Ht of each I frame should be coded into the video bitstream; at the decoder, all P frames and B frames use the corresponding threshold Ht according to this correspondence. To reduce computational complexity, the encoder may use a single distance threshold for a whole video sequence; an empirical value is Ht = 16.
(3) The target sub-pixel position is interpolated by filtering with the corrected basic coefficients.
Conventional linear interpolation is performed with the corrected coefficient matrix W<i,j,1>. Denoting the interpolation output by Lo:
Lo = (∑ W<i,j,1> * L<i,j> + 1/2 * ∑ W<i,j,0>) / ∑ W<i,j,0>  (i, j traverse the input pixel positions)
For example, one possible final interpolation scheme is as follows.
When interpolating sub-pixel position a, the interpolation output is:
Lo = (∑ W<i,j,1> * L<i,j> + 128) >> 8  (i, j traverse the input pixel positions)
When interpolating sub-pixel positions b1 and b2, the interpolation output is:
Lo = (∑ W<i,j,1> * L<i,j> + 64) >> 7  (i, j traverse the input pixel positions)
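The final filtering step can be sketched as follows. As the formula is written, the corrected coefficients W<i,j,1> weight the pixels while the divisor and rounding offset come from the uncorrected coefficients W<i,j,0>. The function name interpolate is hypothetical, and the example matrix again assumes the values 1, -9 and 81.

```python
W0_A = [
    [1, -9, -9, 1],
    [-9, 81, 81, -9],
    [-9, 81, 81, -9],
    [1, -9, -9, 1],
]


def interpolate(pixels, w1, w0):
    """Lo = (sum(W1 * L) + sum(W0) / 2) / sum(W0), in integer arithmetic.

    For a power-of-2 sum of W0, floor division gives the same result as the
    arithmetic right shift in the text, also for negative accumulators.
    """
    total = sum(sum(row) for row in w0)
    acc = sum(w * p
              for w_row, px_row in zip(w1, pixels)
              for w, p in zip(w_row, px_row))
    return (acc + total // 2) // total
```

When no coefficient has been corrected, w1 and w0 are the same matrix and the function reduces to an ordinary normalised linear filter; a clamp to the range 0 to 255 could be applied to the result, although the formula above does not include one.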
In addition, for the sub-pixel positions shown in Fig. 1 that have a quarter or three-quarter pixel offset in the horizontal or vertical direction, the filter coefficients are constant. These positions are obtained as follows:
c2=(B+b1+1)>>1;
c1=(A+b1+1)>>1;
c3=(B+b2+1)>>1;
c4=(C+b2+1)>>1;
d1=(a+b1+1)>>1;
d2=(a+b2+1)>>1;
d3=(a+m+1)>>1;
d4=(a+s+1)>>1;
e1=(A+a+1)>>1;
e2=(B+a+1)>>1;
e3=(C+a+1)>>1;
e4=(D+a+1)>>1.
Interpolating these sub-pixel positions depends on the prior interpolation of the five sub-pixels a, b1, b2, s and m.
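The fixed-coefficient positions are rounded averages of two already available values, which can be sketched as follows. The function names average and quarter_pixels are illustrative; the pairings are copied from the formulas above.

```python
def average(x, y):
    """Rounded mean of two values: (x + y + 1) >> 1."""
    return (x + y + 1) >> 1


def quarter_pixels(A, B, C, D, a, b1, b2, s, m):
    """Compute the c, d and e positions from integer pixels A to D and the
    previously interpolated sub-pixels a, b1, b2, s and m."""
    return {
        "c1": average(A, b1), "c2": average(B, b1),
        "c3": average(B, b2), "c4": average(C, b2),
        "d1": average(a, b1), "d2": average(a, b2),
        "d3": average(a, m), "d4": average(a, s),
        "e1": average(A, a), "e2": average(B, a),
        "e3": average(C, a), "e4": average(D, a),
    }
```

Each of these positions costs one addition with a rounding offset and one shift, so the adaptive filtering is only needed for a, b1, b2, s and m.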
Fig. 8 shows a video-content-adaptive sub-pixel interpolation device that implements the above method. It comprises a filter coefficient generator for generating the coefficients of the current interpolation, an interpolation filter, and a sub-pixel position interpolation result memory. The sub-pixel position to be interpolated and the input integer pixel matrix enter the filter coefficient generator, which generates the filter coefficients of the current interpolation according to the interpolation position; the output filter coefficients, together with the input integer pixel matrix, are fed to the interpolation filter for interpolation filtering; and the filtered result is stored in the sub-pixel position interpolation result memory and finally output. The codec calls this interpolation device and is responsible for supplying the integer pixel matrix and the sub-pixel position. Depending on the interpolation position, the filter coefficients are either variable or constant, but in both cases they are produced in the filter coefficient generator. Once the filter coefficients have been generated, they are passed to the interpolation filter together with the integer pixel matrix supplied at the start, the actual interpolation is carried out, and the interpolation result is output.
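The data flow of the device can be sketched in software as three cooperating parts. This is only an illustration of the structure described above, not the device itself: the class SubPixelInterpolator and the stand-in functions fixed_generator and linear_filter are hypothetical, and the stand-in generator returns uncorrected example coefficients for position a instead of performing the adaptive correction.

```python
class SubPixelInterpolator:
    """Coefficient generator -> interpolation filter -> result memory."""

    def __init__(self, coefficient_generator, interpolation_filter):
        self.coefficient_generator = coefficient_generator
        self.interpolation_filter = interpolation_filter
        self.result_memory = {}  # interpolation results keyed by position

    def interpolate(self, position, pixels):
        # 1. Generate the coefficients of the current interpolation.
        coefficients = self.coefficient_generator(position, pixels)
        # 2. Filter the same integer pixel matrix with those coefficients.
        result = self.interpolation_filter(coefficients, pixels)
        # 3. Store the result, then output it.
        self.result_memory[position] = result
        return result


def fixed_generator(position, pixels):
    """Stand-in generator: uncorrected example coefficients for position a."""
    return [[1, -9, -9, 1], [-9, 81, 81, -9], [-9, 81, 81, -9], [1, -9, -9, 1]]


def linear_filter(coefficients, pixels):
    """Stand-in filter: weighted sum normalised by 256 with rounding."""
    acc = sum(w * p
              for w_row, px_row in zip(coefficients, pixels)
              for w, p in zip(w_row, px_row))
    return (acc + 128) >> 8
```

A caller playing the role of the codec would construct SubPixelInterpolator(fixed_generator, linear_filter) and pass it a position label and a 4x4 integer pixel matrix; replacing fixed_generator with a content-adaptive generator leaves the rest of the flow unchanged.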
Fig. 6 is a schematic diagram of the interpolation arrangement in a prior-art video encoder. The motion compensator in the figure compensates the motion vector error of the picture motion data; this motion compensator can use the interpolation device of the present invention to interpolate the image data, which improves the image data resolution and thereby the precision of motion vector error compensation. Fig. 7 is a schematic diagram of the interpolation arrangement in another prior-art video encoder. The motion compensator in the figure compensates the motion of the image data; it can use the interpolation device of the present invention to interpolate the reference image frame, which improves the image resolution and thereby the compensation precision.