CN111698502A - VVC (Versatile Video Coding)-based affine motion estimation acceleration method, device and storage medium - Google Patents
VVC (Versatile Video Coding)-based affine motion estimation acceleration method, device and storage medium
- Publication number
- CN111698502A (application number CN202010566975.6A)
- Authority
- CN
- China
- Prior art keywords
- coding unit
- current coding
- motion estimation
- optimal
- affine motion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/567—Motion estimation based on rate distortion criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/96—Tree coding, e.g. quad-tree coding
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses an affine motion estimation acceleration method, device and storage medium based on VVC (Versatile Video Coding), comprising the following steps: if RDcost_AffineMerge > λ*RDcost_Merge is satisfied, or if, when the current coding unit constructs the Affine Merge mode candidate list, no adjacent coding unit has an affine mode as its optimal prediction mode, the affine motion estimation process of the current coding unit is skipped; otherwise, the current coding unit continues with affine motion estimation. Here RDcost_AffineMerge denotes the rate-distortion cost of the current coding unit executing the Affine Merge mode, RDcost_Merge denotes the rate-distortion cost of the current coding unit executing the ordinary Merge mode, and λ is a threshold with λ ≥ 1. By skipping unnecessary affine motion estimation according to the Affine Merge mode information of the current coding unit, the method and device reduce the time complexity of the encoder, effectively improve encoder efficiency, and facilitate practical application.
Description
Technical Field
The present invention relates to the field of video coding technologies, and in particular, to a method, an apparatus, and a storage medium for accelerating affine motion estimation based on VVC coding.
Background
Currently, High Efficiency Video Coding (HEVC) is widely used commercially, but it still cannot meet the ever-growing demand for video. Therefore, the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) established the Joint Video Experts Team (JVET) to study a new generation of video coding technology. At the 10th JVET meeting held in San Diego, USA, in 2018, the first draft of the new-generation video coding technology was released and the new standard was named Versatile Video Coding (VVC); by April 2020 the 17th JVET meeting had been held and the official test model VTM had been updated to version 8.0. The design has two main goals: first, to specify a video coding technique whose compression capability far exceeds previous generations of such standards, and second, to make the technique highly versatile so that it can be used effectively in a wider range of applications than those covered by previous standards. The new-generation standard introduces many new coding tools; for example, the QTMT (QuadTree with nested Multi-Type Tree) partition structure replaces the quadtree partition of HEVC, and adaptive motion vector resolution and Affine-based motion compensation are adopted. These tools significantly improve coding efficiency, but they also greatly increase the time complexity of the encoder. Such high complexity is unfavorable for the future use and popularization of the standard, so reducing the encoding time of the encoder is very important.
Affine-based motion compensation is a motion model introduced for irregular motions such as fade-in/fade-out, rotation and scaling, and it addresses the inaccuracy of motion compensation under a purely translational model. Two affine motion models are provided in the VVC standard: a 4-parameter model and a 6-parameter model. The 4-parameter model uses the top-left and top-right motion vectors of the coding unit as its motion control points, derives the motion vector of each 4 x 4 sub-block in the coding unit according to the parameter-model formula, and performs motion compensation on each 4 x 4 sub-block. The 6-parameter model has one more control point than the 4-parameter model, the bottom-left motion vector, and the subsequent process is the same as for the 4-parameter model. Affine-based motion compensation comprises the Affine Merge mode and the Affine Motion Estimation mode, both of which belong to the inter-mode selection process: the Affine Merge mode computes its rate-distortion cost RDCost (a combined evaluation of bit rate and distortion) before the ordinary Merge mode, while affine motion estimation computes its rate-distortion cost together with ordinary motion estimation. The Affine Merge mode is a coding mode introduced by the VVC standard; it constructs an Affine Merge candidate list using the motion vector information of adjacent coding units, searches the candidate list for the optimal control-point motion vectors, and then performs affine motion compensation. The Merge mode extends the corresponding feature of the previous standard HEVC; it constructs a Merge candidate list using the motion vectors of adjacent coding units and selects the optimal motion vector for motion compensation.
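To make the 4-parameter model concrete, the sketch below derives the motion vector of each 4 x 4 sub-block of a coding unit from its top-left and top-right control-point motion vectors, following the standard 4-parameter affine formula; it is a minimal C++ illustration whose types and function names are placeholders, not code from the VTM reference software.

```cpp
#include <cstdio>
#include <vector>

// Minimal sketch of the 4-parameter affine model: given the top-left CPMV
// (mv0) and top-right CPMV (mv1) of a W x H coding unit, derive the motion
// vector of each 4x4 sub-block at its centre position. Names are illustrative.
struct MV { double x, y; };

std::vector<MV> deriveSubblockMVs4Param(MV mv0, MV mv1, int W, int H) {
    std::vector<MV> mvs;
    // Model parameters of the 4-parameter (rotation + zoom + translation) model.
    const double a = (mv1.x - mv0.x) / W;   // horizontal gradient of MVx
    const double b = (mv1.y - mv0.y) / W;   // horizontal gradient of MVy
    for (int y = 0; y < H; y += 4) {
        for (int x = 0; x < W; x += 4) {
            const double cx = x + 2.0;      // centre of the 4x4 sub-block
            const double cy = y + 2.0;
            MV mv;
            mv.x = a * cx - b * cy + mv0.x; // mv_x(x, y) = a*x - b*y + mv0x
            mv.y = b * cx + a * cy + mv0.y; // mv_y(x, y) = b*x + a*y + mv0y
            mvs.push_back(mv);
        }
    }
    return mvs;
}

int main() {
    // Example: 16x16 CU with a small rotation between the two control points.
    std::vector<MV> mvs = deriveSubblockMVs4Param({1.0, 0.0}, {1.0, 0.5}, 16, 16);
    std::printf("derived %zu sub-block MVs, first = (%.2f, %.2f)\n",
                mvs.size(), mvs[0].x, mvs[0].y);
    return 0;
}
```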
Affine motion estimation is mainly divided into the following steps:
Step one: construct the affine AMVP (Advanced Motion Vector Prediction) list, fill it with two sets of CPMVs (Control Point Motion Vectors usable for prediction), and take the better set as the starting point of affine motion estimation.
Step two: unidirectional prediction, i.e., forward prediction and backward prediction are performed separately; step four is performed on each reference frame in the forward reference list List0 and on each reference frame in the backward list List1 to obtain the forward optimal control-point MVs (motion vectors) and the backward optimal control-point MVs.
Step three: bidirectional prediction; step four is performed on the combination of reference lists List0 and List1 to obtain the optimal control-point MVs of bidirectional prediction, and the List0 and List1 predictions are weighted-averaged.
Step four: perform motion estimation and motion compensation according to the CPMVs. A set of optimal control-point MVs is obtained with a gradient-descent search, the motion vector of each 4 x 4 sub-block in the coding unit CU is derived from the control-point MVs, motion compensation is performed with these motion vectors, and the total RD cost is computed.
Step five: compare the optimal rate-distortion cost of forward prediction, the optimal rate-distortion cost of backward prediction and the optimal rate-distortion cost of bidirectional prediction. The option with the minimum cost is set as the optimal mode of the current affine motion estimation, and information such as the prediction direction, reference frame and control-point MVs is stored.
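The control flow of steps one to five can be summarized by the minimal sketch below; only the loop structure (unidirectional prediction over List0 and List1, a bidirectional pass, and selection by rate-distortion cost) is meaningful, while the types, the stubbed cost function and its values are assumptions made purely for illustration.

```cpp
#include <cstdio>
#include <limits>

// Hedged sketch of the affine-ME control flow (steps one to five above).
// Types and the stubbed cost function are illustrative, not VTM code.
struct AffineResult {
    double rdCost = std::numeric_limits<double>::max();
    int list = -1;   // 0 = forward (List0), 1 = backward (List1), 2 = bidirectional
    int refIdx = -1; // index of the reference frame inside the list
};

// Stand-in for step four: CPMV refinement (gradient-descent search), per-4x4
// motion compensation and RD-cost evaluation for one prediction choice.
AffineResult searchAndCompensate(int list, int refIdx) {
    AffineResult r;
    r.list = list;
    r.refIdx = refIdx;
    r.rdCost = 100.0 + 10.0 * list + refIdx;  // dummy cost for the sketch
    return r;
}

AffineResult affineMotionEstimation(int numRefList0, int numRefList1) {
    AffineResult best;
    // Step two: unidirectional prediction over every reference frame.
    for (int refIdx = 0; refIdx < numRefList0; ++refIdx) {
        AffineResult r = searchAndCompensate(0, refIdx);
        if (r.rdCost < best.rdCost) best = r;
    }
    for (int refIdx = 0; refIdx < numRefList1; ++refIdx) {
        AffineResult r = searchAndCompensate(1, refIdx);
        if (r.rdCost < best.rdCost) best = r;
    }
    // Step three: bidirectional prediction built on the unidirectional optima;
    // the List0 and List1 predictions are weighted-averaged inside the search.
    AffineResult bi = searchAndCompensate(2, 0);
    if (bi.rdCost < best.rdCost) best = bi;
    // Step five: keep the direction / reference frame with the minimum RD cost.
    return best;
}

int main() {
    AffineResult best = affineMotionEstimation(4, 4);
    std::printf("best: list=%d refIdx=%d cost=%.1f\n",
                best.list, best.refIdx, best.rdCost);
    return 0;
}
```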
In the encoder of the current VVC standard, affine motion estimation has to be performed for every coding unit; moreover, affine motion estimation is performed on each reference frame in each prediction direction, which leads to excessively long encoding time.
Disclosure of Invention
The present invention is directed to solving at least one of the problems in the prior art. Therefore, the invention provides an affine motion estimation acceleration method based on VVC coding, as well as a device and a storage medium.
In a first aspect of the present invention, an affine motion estimation acceleration method based on VVC coding is provided, including the following steps:
If RDcost_AffineMerge > λ*RDcost_Merge is satisfied, or if, when the current coding unit constructs the Affine Merge mode candidate list, no adjacent coding unit has an affine mode as its optimal prediction mode, the affine motion estimation process of the current coding unit is skipped; otherwise, the current coding unit continues with affine motion estimation. Here RDcost_AffineMerge represents the rate-distortion cost of the current coding unit executing the Affine Merge mode, RDcost_Merge represents the rate-distortion cost of the current coding unit executing the Merge mode, and λ is a threshold with λ ≥ 1.
According to the embodiment of the invention, at least the following beneficial effects are achieved:
according to the method and the device, unnecessary Affine motion estimation is skipped by executing the Affinine Merge mode information according to the current coding unit, so that the time complexity of the coder can be reduced, the efficiency of the coder is effectively improved, and the method and the device are favorable for being put into practical application.
According to some embodiments of the invention: if RDcost_AffineMerge > λ*RDcost_Merge is satisfied and no adjacent coding unit has an affine mode as its optimal prediction mode, the affine motion estimation process of the current coding unit is skipped.
According to some embodiments of the invention, the current coding unit proceeds with affine motion estimation, comprising the steps of:
if the optimal prediction mode of the parent CU of the current coding unit is affine motion estimation, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the parent CU.
According to some embodiments of the present invention, if the optimal prediction mode of the parent CU of the current coding unit is not affine motion estimation, further comprising:
if the current coding unit is the lower sub-CU of a horizontal binary division, and the optimal prediction direction and optimal reference frame of the parent CU are consistent with those of the upper sub-CU of the horizontal binary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the parent CU.
According to some embodiments of the present invention, if the optimal prediction mode of the parent CU of the current coding unit is not affine motion estimation, further comprising:
if the current coding unit is the right sub-CU of a vertical binary division, and the optimal prediction direction and optimal reference frame of the parent CU are consistent with those of the left sub-CU of the vertical binary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the parent CU.
according to some embodiments of the present invention, if the optimal prediction mode of the parent CU of the current coding unit is not affine motion estimation, further comprising:
if the current coding unit is the first sub-CU of a horizontal ternary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the upper sub-CU of the horizontal binary division;
if the current coding unit is the second sub-CU of a horizontal ternary division, the current coding unit searches both the optimal prediction direction and optimal reference frame of the upper sub-CU of the horizontal binary division and those of the lower sub-CU of the horizontal binary division, and selects the optimal prediction direction and optimal reference frame according to their rate-distortion costs;
if the current coding unit is the third sub-CU of a horizontal ternary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the lower sub-CU of the horizontal binary division.
According to some embodiments of the present invention, if the optimal prediction mode of the parent CU of the current coding unit is not affine motion estimation, further comprising:
if the current coding unit is the first sub-CU of a vertical ternary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the left sub-CU of the vertical binary division;
if the current coding unit is the second sub-CU of a vertical ternary division, the current coding unit searches both the optimal prediction direction and optimal reference frame of the left sub-CU of the vertical binary division and those of the right sub-CU of the vertical binary division, and selects the optimal prediction direction and optimal reference frame according to their rate-distortion costs;
if the current coding unit is the third sub-CU of a vertical ternary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the right sub-CU of the vertical binary division.
According to some embodiments of the invention, λ is 1.05.
In a second aspect of the present invention, an affine motion estimation acceleration apparatus based on VVC coding is provided, including: at least one control processor and a memory for communicative connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform a VVC coding based affine motion estimation acceleration method as described above.
Embodiments of the present invention provide a computer-readable storage medium storing computer-executable instructions for causing a computer to execute an affine motion estimation acceleration method based on VVC encoding as described above.
The VVC code-based affine motion estimation accelerating device and the readable storage medium provided by the embodiment of the invention can achieve the same beneficial effects as the method.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a diagram illustrating a block partitioning method provided in the prior art;
fig. 2 is a schematic flowchart of an affine motion estimation acceleration method based on VVC encoding according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the positions of inheritable affine motion predictors provided by the prior art;
FIG. 4 is a diagram illustrating control point motion vector inheritance provided by the prior art;
fig. 5 is a schematic flowchart of an affine motion estimation acceleration method based on VVC encoding according to an embodiment of the present invention;
fig. 6 is a schematic flowchart of an affine motion estimation acceleration method based on VVC encoding according to an embodiment of the present invention;
fig. 7 is a schematic flowchart of an affine motion estimation acceleration method based on VVC encoding according to an embodiment of the present invention;
fig. 8 is a schematic flowchart of an affine motion estimation acceleration method based on VVC encoding according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of an affine motion estimation acceleration apparatus based on VVC coding according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
In the prior art, a VVC-standard encoder specifies that, during inter-mode selection, each Coding Unit (CU) needs to traverse every mode and select the mode with the smallest rate-distortion cost as the optimal prediction mode. However, some video sequences contain no or very little affine motion; for such sequences the probability that affine motion estimation is selected as the optimal prediction mode is low, so a large amount of redundant affine motion estimation limits the efficiency of the encoder and makes the encoding time too long.
Affine motion estimation is divided into unidirectional prediction and bidirectional prediction, and unidirectional prediction is further divided into forward prediction and backward prediction. Forward prediction traverses each reference frame in the candidate reference frame list List0 (List0 is filled with a certain number of previously encoded frames) and selects the forward optimal reference frame according to the rate-distortion cost; backward prediction traverses each reference frame in the candidate reference frame list List1 (List1 is likewise filled with a certain number of previously encoded frames) and selects the backward optimal reference frame; bidirectional prediction selects the optimal pair of reference frames from the combination of the forward and backward lists. Traversing all prediction directions and reference frames takes a lot of time and increases the coding burden.
The new standard adopts a more flexible block partitioning approach: a quadtree with nested multi-type tree structure. A coding unit first attempts quadtree division and then tries, in order, horizontal binary division, vertical binary division, horizontal ternary division and vertical ternary division. As shown in FIG. 1, the four multi-type-tree divisions are horizontal binary division, vertical binary division, horizontal ternary division and vertical ternary division. For example, a 64 x 64 block first tries quadtree splitting (yielding four 32 x 32 blocks), followed by horizontal binary splitting (an upper 64 x 32 sub-CU and a lower 64 x 32 sub-CU), vertical binary splitting (a left 32 x 64 sub-CU and a right 32 x 64 sub-CU), horizontal ternary splitting (an upper 64 x 16 sub-CU, a middle 64 x 32 sub-CU and a lower 64 x 16 sub-CU), and vertical ternary splitting in the same manner. The 64 x 64 block is then the parent node of all the divided sub-CU blocks. A rate-distortion cost is calculated for each division mode, the cost of the non-division mode is compared with the costs of the division modes, and the mode with the minimum value is selected as the optimal mode.
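For illustration, the sketch below enumerates the sub-CU sizes produced by each split type for a given block, matching the 64 x 64 example above; the enum and function are assumed names and do not reflect the VTM partitioning code.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

// Hedged illustration of the QTMT split types and the sub-CU sizes they
// produce for a W x H block (quadtree plus the four multi-type-tree splits).
enum class Split { QT, HorBinary, VerBinary, HorTernary, VerTernary };

std::vector<std::pair<int, int>> subCuSizes(Split s, int W, int H) {
    switch (s) {
        case Split::QT:         return {{W/2, H/2}, {W/2, H/2}, {W/2, H/2}, {W/2, H/2}};
        case Split::HorBinary:  return {{W, H/2}, {W, H/2}};            // upper, lower
        case Split::VerBinary:  return {{W/2, H}, {W/2, H}};            // left, right
        case Split::HorTernary: return {{W, H/4}, {W, H/2}, {W, H/4}};  // 1:2:1 ratio
        case Split::VerTernary: return {{W/4, H}, {W/2, H}, {W/4, H}};  // 1:2:1 ratio
    }
    return {};
}

int main() {
    // A 64x64 block under a horizontal ternary split -> 64x16, 64x32, 64x16.
    for (auto [w, h] : subCuSizes(Split::HorTernary, 64, 64))
        std::printf("%dx%d ", w, h);
    std::printf("\n");
    return 0;
}
```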
The first embodiment:
referring to fig. 2, an affine motion estimation acceleration method based on VVC coding is provided, including the following steps:
S100, obtain the rate-distortion cost RDcost_AffineMerge of the current coding unit executing the Affine Merge mode and the rate-distortion cost RDcost_Merge of executing the Merge mode; and obtain the optimal prediction modes of the adjacent coding units when the current coding unit constructs the Affine Merge mode candidate list;
S200, if RDcost_AffineMerge > λ*RDcost_Merge and no adjacent coding unit has an affine mode as its optimal prediction mode, skip the affine motion estimation process of the current coding unit, where λ is a threshold and λ ≥ 1.
In this embodiment, two affine motion estimation skipping methods are included, as follows:
First, because both affine motion estimation and the Affine Merge mode belong to affine motion methods, their rate-distortion costs vary in a correlated way to some extent. When the rate-distortion cost RDcost_AffineMerge of the Affine Merge mode is greater than the rate-distortion cost RDcost_Merge of the ordinary Merge mode, the rate-distortion cost of affine motion estimation is also likely to be greater than RDcost_Merge. In this case affine motion estimation will not be selected as the optimal prediction mode of the current coding unit, and the affine motion estimation process of the current coding unit can be skipped to shorten the encoding time of the encoder.
Second, before the current coding unit performs the affine motion estimation mode, the Affine Merge mode and the ordinary Merge mode have already been performed, and whether any adjacent coding unit has an affine mode as its optimal prediction mode is detected while the Affine Merge candidate list is constructed. If no adjacent coding unit has an affine mode as its optimal prediction mode, the probability that the current coding unit will select affine motion estimation is low, and the affine motion estimation process of the current coding unit can be skipped, thereby shortening the encoding time of the encoder.
The Affine Merge candidate list and the adjacent coding units are briefly introduced below.
The following three CPMV candidate types are used to form the Affine Merge candidate list:
(1) inherited Affine Merge candidates derived from the CPMVs of neighboring CUs;
(2) constructed affine merge candidate CPMVs derived using the translational MVs of neighboring CUs;
(3) zero MVs.
in VVC coding, there are at most two inherited affine candidates, which are derived from affine motion models of neighboring coding units, one from the left neighboring CU and the other from the upper neighboring CU. The candidate blocks are shown in fig. 3. For left predictor, the scan order is A0- > A1, and for top predictor, the scan order is B0- > B1- > B2, where A0, A1, B0, B1, B2 are all neighboring coding units. Only the first inherited candidate from both sides (one from the left and one from the top) is selected, and no pruning is performed between the two inherited candidates. When an adjacent Affine CU is identified, its control point motion vector is used for the Affine mean candidate list of the current CU. As shown in fig. 4, if the lower left neighboring block a is encoded as an Affine mode, its upper left motion vector v2, upper right motion vector v3 and lower right motion vector v4 are obtained as a set of Affine Merge candidate lists for the current block. An object that moves affine can be seen as a whole, whose area is continuous within a frame. Therefore, when the adjacent coding units have sustainable affine motion, the probability of affine motion performed by the current coding unit becomes large. On the contrary, when the adjacent coding units cannot search the coding unit which performs affine motion, the probability of the affine motion of the current coding unit is small.
In step S200 of this embodiment, the two skipping methods are combined. Specifically, if RDcost_AffineMerge > λ*RDcost_Merge is satisfied and, at the same time, no adjacent coding unit has an affine mode as its optimal prediction mode, then the affine motion estimation of the current coding unit is skipped.
In this embodiment the two skipping conditions are combined to selectively skip the affine motion estimation process, which reduces the encoding time of the encoder compared with the prior art. Compared with using only one of the two conditions to skip affine motion estimation, it also undoubtedly improves the accuracy of skipping affine motion estimation in advance while still reducing the encoding time of the encoder.
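A minimal sketch of this combined skip decision is given below, assuming the two rate-distortion costs and the count of affine-coded adjacent coding units (NumNeighborAffine in the fifth embodiment) have already been collected; the function and parameter names are illustrative.

```cpp
#include <cstdio>

// Hedged sketch of the combined early-skip test of the first embodiment:
// affine motion estimation is skipped only when both conditions hold at once.
// rdAffineMerge / rdMerge are the RD costs of the Affine Merge and ordinary
// Merge modes; numAffineNeighbours counts adjacent CUs whose optimal mode is
// affine (collected while building the Affine Merge candidate list).
bool skipAffineMotionEstimation(double rdAffineMerge, double rdMerge,
                                int numAffineNeighbours, double lambda = 1.05) {
    const bool costCondition      = rdAffineMerge > lambda * rdMerge;
    const bool neighbourCondition = (numAffineNeighbours == 0);
    return costCondition && neighbourCondition;
}

int main() {
    // Example: Affine Merge is 10% costlier than Merge and no neighbour is
    // affine-coded, so affine motion estimation would be skipped.
    std::printf("skip = %d\n", skipAffineMotionEstimation(110.0, 100.0, 0));
    return 0;
}
```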
As a preferred embodiment, when the threshold λ is 1.05 a good balance between coding quality and encoding time saving is achieved; for specific experimental results, refer to the fifth embodiment.
Second embodiment:
referring to fig. 5, there is provided an affine motion estimation acceleration method based on WC encoding, comprising the steps of:
s300, obtaining rate distortion cost RDcost of the current coding unit executing the affinity Merge modeAffineMergeAnd rate-distortion cost RDcost for executing Merge modeMerge;
S400, if RDcostAffineMerge>λ*RDcostMergeThen the affine motion estimation process of the current coding unit is skipped, where λ is 1.05.
The details are as described in the first embodiment and are not repeated here. The scheme of this embodiment can reduce the encoding time of the encoder, but it should be noted that its skipping accuracy is not as good as that of the first embodiment.
The third embodiment:
referring to fig. 6, an affine motion estimation acceleration method based on VVC coding is provided, including the following steps:
S500, obtain the optimal prediction modes of the adjacent coding units when the current coding unit constructs the Affine Merge mode candidate list;
S600, if no adjacent coding unit has an affine mode as its optimal prediction mode, skip the affine motion estimation process of the current coding unit.
The details are as described in the first embodiment and are not repeated here. The scheme of this embodiment can reduce the encoding time of the encoder, but it should be noted that its skipping accuracy is not as good as that of the first embodiment.
The fourth embodiment:
referring to fig. 7, an affine motion estimation accelerating method based on VVC coding is provided, where if the skip condition of the above embodiment is not satisfied, and the current coding unit continues affine motion estimation, the method includes the following steps:
S700, if the optimal prediction mode of the parent CU of the current coding unit is affine motion estimation, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the parent CU.
Because in VVC encoding the image content of the parent CU contains the image content of the sub-CU, the parent CU characterizes the motion direction and texture of the sub-CU well to some extent; therefore only the optimal prediction direction and optimal reference frame of the previously encoded parent CU need to be reused, and no other prediction directions or reference frames need to be searched.
Therefore, in this embodiment, when the two skipping conditions described in the first embodiment are not satisfied and the optimal prediction mode of the parent CU of the current coding unit is affine motion estimation, the current coding unit searches and reuses only the optimal prediction direction and the optimal reference frame of the parent CU.
Fifth embodiment:
referring to fig. 8, an affine motion estimation acceleration method based on VVC coding is provided, including the following steps:
A100, when the current coding unit executes the Affine Merge mode, save the number of adjacent coding units coded in an affine mode found while constructing the Affine Merge mode candidate list, denoted NumNeighborAffine.
A200, save the rate-distortion cost of the current coding unit executing the Affine Merge mode, denoted RDcost_AffineMerge, and save the rate-distortion cost of the current coding unit executing the ordinary Merge mode, denoted RDcost_Merge.
A300, if RDcost_AffineMerge > 1.05*RDcost_Merge and NumNeighborAffine is equal to 0, skip the affine motion estimation process; otherwise, go to step A400.
As in the first embodiment, the description is omitted here.
A400, if the optimal prediction mode of the parent CU of the current coding unit is affine motion estimation, the current coding unit only searches the optimal prediction direction DIR_par and the optimal reference frame REF_par of the parent CU; otherwise, go to step A500.
Like the fourth embodiment, the description is omitted here.
A500, determine the division type and position of the current coding unit and distinguish the following cases.
Because the sub-block layouts of the horizontal ternary division and the horizontal binary division overlap to a large extent (for example, half of the area of the upper sub-CU of the horizontal binary division overlaps the upper sub-CU of the horizontal ternary division, so their motion is strongly correlated), and because the horizontal binary division is executed before the horizontal ternary division, only the prediction directions and reference frames of the horizontal binary division sub-blocks need to be saved for use by the subsequent horizontal ternary division sub-blocks. The vertical ternary division can likewise reuse the information of the vertical binary division, so the reference-frame search time of the horizontal/vertical ternary divisions can be reduced.
A501, if the current coding unit is the lower sub-CU of a horizontal binary division, and the optimal prediction direction and optimal reference frame of the parent CU are consistent with those of the upper sub-CU of the horizontal binary division, the current coding unit only searches and reuses the optimal prediction direction DIR_par and the optimal reference frame REF_par of the parent CU.
It should be noted that step A501 can only accelerate the affine motion estimation process of the lower sub-CU of the horizontal binary division; the upper sub-CU of the horizontal binary division still performs affine motion estimation normally.
A502, if the current coding unit is the right sub-CU of a vertical binary division, and the optimal prediction direction and optimal reference frame of the parent CU are consistent with those of the left sub-CU of the vertical binary division, the current coding unit only searches and reuses the optimal prediction direction DIR_par and the optimal reference frame REF_par of the parent CU.
It should be noted that step A502 can only accelerate the affine motion estimation process of the right sub-CU of the vertical binary division; the left sub-CU of the vertical binary division still performs affine motion estimation normally.
A503, if the current coding unit is the first sub-CU of a horizontal ternary division, the current coding unit only searches and reuses the optimal prediction direction DIR_BT_UP and the optimal reference frame REF_BT_UP of the upper sub-CU of the horizontal binary division.
A504, if the current coding unit is the second sub-CU of a horizontal ternary division, the current coding unit searches both the optimal prediction direction DIR_BT_UP and optimal reference frame REF_BT_UP of the upper sub-CU of the horizontal binary division and the optimal prediction direction DIR_BT_DOWN and optimal reference frame REF_BT_DOWN of the lower sub-CU of the horizontal binary division, and selects the optimal prediction direction and optimal reference frame according to the rate-distortion costs of the upper and lower sub-CUs of the horizontal binary division.
It should be noted that in step A504 the option with the smaller rate-distortion cost is selected. Since the region of the second sub-CU of the horizontal ternary division contains part of the upper sub-CU and part of the lower sub-CU of the horizontal binary division, it is uncertain whether its motion is consistent with the upper or the lower sub-CU, so both are searched and the better one is selected.
A505, if the current coding unit is the third sub-CU of a horizontal ternary division, the current coding unit only searches and reuses the optimal prediction direction DIR_BT_DOWN and the optimal reference frame REF_BT_DOWN of the lower sub-CU of the horizontal binary division.
A506, if the current coding unit is the first sub-CU of a vertical ternary division, the current coding unit only searches and reuses the optimal prediction direction DIR_BT_LEFT and the optimal reference frame REF_BT_LEFT of the left sub-CU of the vertical binary division.
A507, if the current coding unit is the second sub-CU of a vertical ternary division, the current coding unit searches both the optimal prediction direction DIR_BT_LEFT and optimal reference frame REF_BT_LEFT of the left sub-CU of the vertical binary division and the optimal prediction direction DIR_BT_RIGHT and optimal reference frame REF_BT_RIGHT of the right sub-CU of the vertical binary division, and selects the optimal prediction direction and optimal reference frame according to the rate-distortion costs of the left and right sub-CUs of the vertical binary division.
It should be noted that in step A507 the option with the smaller rate-distortion cost is selected. Since the region of the second sub-CU of the vertical ternary division contains part of the left sub-CU and part of the right sub-CU of the vertical binary division, it is uncertain whether its motion is consistent with the left or the right sub-CU, so both are searched and the better one is selected.
A508, if the current coding unit is the third sub-CU of a vertical ternary division, the current coding unit only searches and reuses the optimal prediction direction DIR_BT_RIGHT and the optimal reference frame REF_BT_RIGHT of the right sub-CU of the vertical binary division.
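The case analysis of steps A400 to A508 can be condensed into the hedged sketch below, which maps the current coding unit's split type and position to the previously saved prediction direction / reference frame information it should search; the enums, the PredInfo type and the consistency flag are assumptions made for illustration.

```cpp
#include <cstdio>
#include <vector>

// Hedged sketch of the search-range reduction in steps A400 to A508. PredInfo
// stands for a saved (optimal prediction direction, optimal reference frame)
// pair with its RD cost; all type and function names are illustrative.
struct PredInfo { int dir = 0; int refIdx = 0; double rdCost = 0.0; };

enum class SplitType { None, HorBinary, VerBinary, HorTernary, VerTernary };

struct SavedInfo {                   // information saved from earlier encoding
    bool parentIsAffine = false;     // A400: parent CU's optimal mode is affine ME
    PredInfo parent, btUp, btDown, btLeft, btRight;
};

// Returns the candidate (direction, reference frame) sets the current sub-CU
// restricts its search to; an empty vector means "fall back to a full search".
std::vector<PredInfo> restrictedSearchSet(const SavedInfo& s, SplitType split,
                                          int posInSplit,        // 0, 1 or 2
                                          bool parentMatchesSibling) {
    if (s.parentIsAffine)            // A400: reuse the parent CU's information
        return {s.parent};
    switch (split) {
        case SplitType::HorBinary:   // A501: lower sub-CU only, upper unchanged
            if (posInSplit == 1 && parentMatchesSibling) return {s.parent};
            return {};
        case SplitType::VerBinary:   // A502: right sub-CU only, left unchanged
            if (posInSplit == 1 && parentMatchesSibling) return {s.parent};
            return {};
        case SplitType::HorTernary:  // A503-A505
            if (posInSplit == 0) return {s.btUp};
            if (posInSplit == 1) return {s.btUp, s.btDown};    // keep smaller RD cost
            return {s.btDown};
        case SplitType::VerTernary:  // A506-A508
            if (posInSplit == 0) return {s.btLeft};
            if (posInSplit == 1) return {s.btLeft, s.btRight}; // keep smaller RD cost
            return {s.btRight};
        default:
            return {};
    }
}

int main() {
    SavedInfo s;
    s.btUp   = {0, 1, 95.0};
    s.btDown = {1, 0, 90.0};
    // Second sub-CU of a horizontal ternary division: both BT sub-CU sets are searched.
    std::printf("candidates = %zu\n",
                restrictedSearchSet(s, SplitType::HorTernary, 1, false).size());
    return 0;
}
```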
Steps A100 to A300 skip unnecessary affine motion estimation according to the Affine Merge mode information, thereby reducing the time complexity of the video encoder; steps A400 to A500 narrow the range of prediction directions and reference frames according to the division characteristics of the current coding unit, thereby further reducing the time complexity of the video encoder.
In summary, the method first skips unnecessary affine motion estimation according to the Affine Merge mode information and then narrows the range of prediction directions and reference frames based on the division characteristics of the coding unit, which reduces the time complexity of the video encoder, effectively improves encoder efficiency, and facilitates practical application.
The method was implemented on top of the official VVC reference encoder VTM8.0 using the random-access configuration file encoder_randomaccess_vtm.cfg. Three sequences with intense motion, BasketballDrive, Cactus and RaceHorses, were selected as test sequences; Cactus contains a large amount of affine motion and therefore reflects the effect of the algorithm well. Coding performance is evaluated with two indicators, BDBR (Bjøntegaard Delta Bit Rate) and TS (Time Saving). BDBR represents the bit-rate difference between the two coding methods at the same objective quality and comprehensively reflects the bit rate and quality of the video; the larger the value, the higher the bit rate of the proposed scheme compared with the original algorithm. TS represents the degree to which the proposed scheme reduces the encoding time relative to the original algorithm, and is calculated as:
TS = (T_O - T_p) / T_O x 100%,
where T_p is the total encoding time after adding the proposed algorithm to the encoder VTM8.0, and T_O is the total encoding time of the original encoder VTM8.0. The results obtained by simulation experiments are shown in Table 1 below:
TABLE 1
As can be seen from the data in Table 1, compared with the original encoder, the encoder with the proposed scheme increases the average BDBR by only 0.39% while reducing the average encoding time by 11.69%, which indicates that the encoding time is greatly reduced without significantly increasing the bit rate. Therefore, the invention reduces the encoding time and improves coding efficiency while preserving the subjective quality and compression ratio of the video.
The choice of the threshold is discussed below; the results obtained by experimental analysis on two sequences are shown in Table 2:
TABLE 2
Here, ΔT represents the time reduction rate of the proposed scheme compared with the original algorithm, and the hit rate represents the probability that skipping affine motion estimation for a coding unit is the correct decision under the current threshold. Table 2 shows that as the threshold λ increases, ΔT decreases and the hit rate increases; that is, a larger λ reduces the encoding time less but skips affine motion estimation more accurately. To balance coding quality and encoding time saving, λ is preferably 1.05.
Sixth embodiment:
referring to fig. 9, an affine motion estimation acceleration device based on VVC encoding is provided, and the device may be any type of intelligent terminal, such as a mobile phone, a tablet computer, a personal computer, and the like.
Specifically, the apparatus includes: one or more control processors and memory, one control processor being exemplified in fig. 9. The control processor and the memory may be connected by a bus or other means, as exemplified by the bus connection in fig. 9.
The memory, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the affine motion estimation accelerating device based on the VVC code in the embodiment of the present invention, and the control processor implements the affine motion estimation accelerating method based on the VVC code in the above method embodiment by operating the non-transitory software programs, instructions, and modules stored in the memory.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function. Further, the memory may include high speed random access memory, and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes a memory remotely located from the control processor, and the remote memories may be connected to the VVC code based affine motion estimation acceleration device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory and, when executed by the one or more control processors, perform the VVC-coding-based affine motion estimation acceleration method of the above method embodiment, for example, performing method steps S100 to S200 in fig. 2 described above.
A computer-readable storage medium is also provided, which stores computer-executable instructions that are executed by one or more control processors, for example by one control processor in fig. 9, and that cause the one or more control processors to perform the VVC-coding-based affine motion estimation acceleration method of the above method embodiment, for example, performing method steps S100 to S200 in fig. 2 described above.
Through the above description of the embodiments, those skilled in the art can clearly understand that the embodiments can be implemented by software plus a general hardware platform. Those skilled in the art will appreciate that all or part of the processes of the methods of the above embodiments may be implemented by hardware related to instructions of a computer program, which may be stored in a computer readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Claims (10)
1. An affine motion estimation acceleration method based on VVC coding is characterized by comprising the following steps:
if RDcost_AffineMerge > λ*RDcost_Merge is satisfied, or if, when the current coding unit constructs the Affine Merge mode candidate list, no adjacent coding unit has an affine mode as its optimal prediction mode, skipping the affine motion estimation process of the current coding unit; otherwise, the current coding unit continues with affine motion estimation, wherein RDcost_AffineMerge represents the rate-distortion cost of the current coding unit executing the Affine Merge mode, RDcost_Merge represents the rate-distortion cost of the current coding unit executing the Merge mode, λ is a threshold, and λ ≥ 1.
2. The method of claim 1, wherein the method comprises:
if RDcost_AffineMerge > λ*RDcost_Merge is satisfied and no adjacent coding unit has an affine mode as its optimal prediction mode, skipping the affine motion estimation process of the current coding unit.
3. The method as claimed in claim 1, wherein the current coding unit continues affine motion estimation, and comprises the following steps:
if the optimal prediction mode of the parent CU of the current coding unit is affine motion estimation, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the parent CU.
4. The method of claim 3, wherein if the optimal prediction mode of the parent CU of the current CU is not affine motion estimation, the method further comprises:
if the current coding unit is the lower sub-CU of a horizontal binary division, and the optimal prediction direction and optimal reference frame of the parent CU are consistent with those of the upper sub-CU of the horizontal binary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the parent CU.
5. The method of claim 4, wherein if the optimal prediction mode of the parent CU of the current CU is not affine motion estimation, the method further comprises:
if the current coding unit is the right sub-CU of a vertical binary division, and the optimal prediction direction and optimal reference frame of the parent CU are consistent with those of the left sub-CU of the vertical binary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the parent CU.
6. The method of claim 5, wherein if the optimal prediction mode of the parent CU of the current CU is not affine motion estimation, the method further comprises:
if the current coding unit is the first sub-CU of a horizontal ternary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the upper sub-CU of the horizontal binary division;
if the current coding unit is the second sub-CU of a horizontal ternary division, the current coding unit searches both the optimal prediction direction and optimal reference frame of the upper sub-CU of the horizontal binary division and those of the lower sub-CU of the horizontal binary division, and selects the optimal prediction direction and optimal reference frame according to their rate-distortion costs;
if the current coding unit is the third sub-CU of a horizontal ternary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the lower sub-CU of the horizontal binary division.
7. The method of claim 6, wherein if the optimal prediction mode of the parent CU of the current coding unit is not affine motion estimation, the method further comprises:
if the current coding unit is the first sub-CU of a vertical ternary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the left sub-CU of the vertical binary division;
if the current coding unit is the second sub-CU of a vertical ternary division, the current coding unit searches both the optimal prediction direction and optimal reference frame of the left sub-CU of the vertical binary division and those of the right sub-CU of the vertical binary division, and selects the optimal prediction direction and optimal reference frame according to their rate-distortion costs;
if the current coding unit is the third sub-CU of a vertical ternary division, the current coding unit only searches and reuses the optimal prediction direction and the optimal reference frame of the right sub-CU of the vertical binary division.
8. The acceleration method for affine motion estimation based on VVC coding of claim 1 or 7, wherein λ is 1.05.
9. An affine motion estimation accelerating device based on VVC coding, characterized in that: comprises at least one control processor and a memory for communicative connection with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the VVC code based affine motion estimation acceleration method of any one of claims 1 to 8.
10. A computer-readable storage medium characterized by: the computer-readable storage medium stores computer-executable instructions for causing a computer to perform the method for accelerating affine motion estimation based on VVC coding according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010566975.6A CN111698502A (en) | 2020-06-19 | 2020-06-19 | VVC (variable visual code) -based affine motion estimation acceleration method and device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010566975.6A CN111698502A (en) | 2020-06-19 | 2020-06-19 | VVC (variable visual code) -based affine motion estimation acceleration method and device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111698502A true CN111698502A (en) | 2020-09-22 |
Family
ID=72482238
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010566975.6A Pending CN111698502A (en) | 2020-06-19 | 2020-06-19 | VVC (variable visual code) -based affine motion estimation acceleration method and device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111698502A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112565750A (en) * | 2020-12-06 | 2021-03-26 | 浙江大华技术股份有限公司 | Video coding method, electronic equipment and storage medium |
CN112911308A (en) * | 2021-02-01 | 2021-06-04 | 重庆邮电大学 | H.266/VVC fast motion estimation method and storage medium |
CN113489994A (en) * | 2021-05-28 | 2021-10-08 | 杭州博雅鸿图视频技术有限公司 | Motion estimation method, motion estimation device, electronic equipment and medium |
CN114157868A (en) * | 2022-02-07 | 2022-03-08 | 杭州未名信科科技有限公司 | Video frame coding mode screening method and device and electronic equipment |
CN115190299A (en) * | 2022-07-11 | 2022-10-14 | 杭州电子科技大学 | VVC affine motion estimation fast algorithm |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107046645A (en) * | 2016-02-06 | 2017-08-15 | 华为技术有限公司 | Image coding/decoding method and device |
CN107147911A (en) * | 2017-07-05 | 2017-09-08 | 中南大学 | LIC quick interframe coding mode selection method and device is compensated based on local luminance |
CN107396102A (en) * | 2017-08-30 | 2017-11-24 | 中南大学 | A kind of inter-frame mode fast selecting method and device based on Merge technological movement vectors |
CN110087087A (en) * | 2019-04-09 | 2019-08-02 | 同济大学 | VVC interframe encode unit prediction mode shifts to an earlier date decision and block divides and shifts to an earlier date terminating method |
WO2020103944A1 (en) * | 2018-11-22 | 2020-05-28 | Beijing Bytedance Network Technology Co., Ltd. | Sub-block based motion candidate selection and signaling |
- 2020-06-19: application CN202010566975.6A filed in China (CN), published as CN111698502A, status Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107046645A (en) * | 2016-02-06 | 2017-08-15 | 华为技术有限公司 | Image coding/decoding method and device |
CN107147911A (en) * | 2017-07-05 | 2017-09-08 | 中南大学 | LIC quick interframe coding mode selection method and device is compensated based on local luminance |
CN107396102A (en) * | 2017-08-30 | 2017-11-24 | 中南大学 | A kind of inter-frame mode fast selecting method and device based on Merge technological movement vectors |
WO2020103944A1 (en) * | 2018-11-22 | 2020-05-28 | Beijing Bytedance Network Technology Co., Ltd. | Sub-block based motion candidate selection and signaling |
CN110087087A (en) * | 2019-04-09 | 2019-08-02 | 同济大学 | VVC interframe encode unit prediction mode shifts to an earlier date decision and block divides and shifts to an earlier date terminating method |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112565750A (en) * | 2020-12-06 | 2021-03-26 | 浙江大华技术股份有限公司 | Video coding method, electronic equipment and storage medium |
CN112565750B (en) * | 2020-12-06 | 2022-09-06 | 浙江大华技术股份有限公司 | Video coding method, electronic equipment and storage medium |
CN112911308A (en) * | 2021-02-01 | 2021-06-04 | 重庆邮电大学 | H.266/VVC fast motion estimation method and storage medium |
CN112911308B (en) * | 2021-02-01 | 2022-07-01 | 重庆邮电大学 | H.266/VVC fast motion estimation method and storage medium |
CN113489994A (en) * | 2021-05-28 | 2021-10-08 | 杭州博雅鸿图视频技术有限公司 | Motion estimation method, motion estimation device, electronic equipment and medium |
CN114157868A (en) * | 2022-02-07 | 2022-03-08 | 杭州未名信科科技有限公司 | Video frame coding mode screening method and device and electronic equipment |
WO2023147780A1 (en) * | 2022-02-07 | 2023-08-10 | 杭州未名信科科技有限公司 | Video frame coding mode screening method and apparatus, and electronic device |
CN115190299A (en) * | 2022-07-11 | 2022-10-14 | 杭州电子科技大学 | VVC affine motion estimation fast algorithm |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111698502A (en) | VVC (variable visual code) -based affine motion estimation acceleration method and device and storage medium | |
JP7497948B2 (en) | Video processing method and device | |
JP7572487B2 (en) | Adaptive motion vector resolution for changing motion vectors | |
CN111357294B (en) | Reduced entropy coding and decoding based on motion information lists of sub-blocks | |
CN111093075B (en) | Motion candidate derivation based on spatial neighboring blocks in sub-block motion vector prediction | |
CN110662036B (en) | Limitation of motion information sharing | |
US20230131933A1 (en) | Method and apparatus for candidate list pruning | |
JP2022508177A (en) | Interaction between intra-block copy mode and inter-prediction tool | |
EP3566447A1 (en) | Method and apparatus for encoding and decoding motion information | |
CN116708814A (en) | Video encoding and decoding method and apparatus performed by video encoder and decoder | |
TW202029773A (en) | Method and apparatus of simplified triangle merge mode candidate list derivation | |
TW202021360A (en) | Extension of look-up table based motion vector prediction with temporal information | |
CN113424535A (en) | History update based on motion vector prediction table | |
TW202021359A (en) | Extension of look-up table based motion vector prediction with temporal information | |
CN113242427B (en) | Rapid method and device based on adaptive motion vector precision in VVC | |
CN111466116B (en) | Method and device for affine interframe prediction of video coding and decoding system | |
JP2022513492A (en) | How to derive a constructed affine merge candidate | |
TW202021358A (en) | Extension of look-up table based motion vector prediction with temporal information | |
US20230388529A1 (en) | Method and apparatus for temporal interpolated prediction in video bitstream | |
TWI854996B (en) | Calculating motion vector predictors | |
JP2024519848A (en) | Geometric partitioning mode with motion vector refinement | |
CN116366839A (en) | Prediction mode decision method, device, equipment and storage medium | |
CN116962697A (en) | Motion search processing method, system, equipment and storage medium for video coding | |
TW202017377A (en) | Affine mode in video coding and decoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20200922 |