WO2020133115A1 - Coding prediction method, apparatus, and computer storage medium - Google Patents

Coding prediction method, apparatus, and computer storage medium

Info

Publication number
WO2020133115A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
coding
encoding
motion
parameter
Prior art date
Application number
PCT/CN2018/124504
Other languages
English (en)
French (fr)
Inventor
梁凡
韩海阳
曹思琪
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Priority to CN201880100324.9A priority Critical patent/CN113287309A/zh
Priority to EP18945171.9A priority patent/EP3902257A4/en
Priority to PCT/CN2018/124504 priority patent/WO2020133115A1/zh
Publication of WO2020133115A1 publication Critical patent/WO2020133115A1/zh
Priority to US17/357,621 priority patent/US11632553B2/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/137Motion inside a coding unit, e.g. average field, frame or block difference
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/105Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/147Data rate or code amount at the encoder output according to rate distortion criteria
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • H04N19/517Processing of motion vectors by encoding
    • H04N19/52Processing of motion vectors by encoding by predictive encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/537Motion estimation other than block-based
    • H04N19/54Motion estimation other than block-based using feature points or meshes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/567Motion estimation based on rate distortion criteria
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/593Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques

Definitions

  • The embodiments of the present application relate to the technical field of video encoding and decoding, and in particular to a coding prediction method, apparatus, and computer storage medium.
  • High Efficiency Video Coding (HEVC) was designed for high-resolution, high-frame-rate video, and its efficiency on typical screen content is not very high. Therefore, in view of the characteristics of screen content (high contrast, a limited number of colors, and many repeated regions), an extended coding standard, Screen Content Coding (SCC), has been proposed on the basis of HEVC.
  • SCC Screen Content Coding
  • The basic block unit used for video coding is called a coding unit (CU), and the pixels in a CU share the same coding parameters to improve coding efficiency.
  • CU Coding Unit
  • Affine motion estimation and compensation can effectively track more complex motions, such as rotation, scaling, and deformation of moving objects.
  • IBC Intra Block Copy
  • The embodiments of the present application aim to provide a coding prediction method, apparatus, and computer storage medium.
  • The number of coding bits can be further reduced, thereby lowering the coding bit rate.
  • an embodiment of the present application provides a coding prediction method, and the method includes:
  • the coding modes corresponding to the coding block include at least an intra block copy (IBC) coding mode and an affine-motion-model-based intra block copy (IBCAffine) coding mode, and before determining, for the coding block of the intra prediction type, the motion vector prediction values of at least two control points associated with the coding block, the method further includes:
  • the encoding block selects the IBCAffine encoding mode
  • the encoding block selects the IBC encoding mode.
  • the obtaining the best block vector of the coding block includes:
  • the rate-distortion costs corresponding to different block vectors are calculated respectively; the block vector corresponding to the minimum rate-distortion cost among the rate-distortion costs is taken as the best block vector of the coding block.
  • motion estimation of the affine motion model is performed on the coding block based on the motion vector prediction values of the at least two control points to obtain the first coding parameter of the coding block;
  • the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation of the affine motion model.
  • when the coding block selects the IBC coding mode, the method further includes:
  • the method further includes:
  • the motion estimation of the affine motion model on the coding block based on the motion vector prediction values of the at least two control points to obtain the first coding parameter of the coding block includes:
  • the first encoding parameter of the encoding block is obtained.
  • the method further includes:
  • an embodiment of the present application provides an encoding prediction apparatus, the encoding prediction apparatus includes: a determination unit, a motion estimation unit, and a prediction unit, wherein,
  • the determining unit is configured to determine the motion vector prediction value of at least two control points associated with the coding block for the coding block of the image intra prediction type;
  • the motion estimation unit is configured to perform motion estimation of the affine motion model on the coding block based on the motion vector prediction values of the at least two control points to obtain the first coding parameter of the coding block; wherein the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by motion estimation of the coding block in a non-translational motion mode;
  • the prediction unit is configured to perform predictive coding on the coding block based on the first coding parameter.
  • the encoding prediction apparatus further includes an acquisition unit and a judgment unit, wherein,
  • the obtaining unit is configured to obtain the best block vector of the coding block
  • the judging unit is configured to, when the best block vector is equal to 0, calculate a first motion estimation result of the coding block based on the IBCAffine coding mode and a second motion estimation result of the coding block based on the IBC coding mode; if the first motion estimation result is not greater than a preset multiple of the second motion estimation result, the coding block selects the IBCAffine coding mode; and if the first motion estimation result is greater than the preset multiple of the second motion estimation result, the coding block selects the IBC coding mode.
  • the acquiring unit is specifically configured to perform an intra block copy search on the coding block, select at least one reference block whose hash key value matches that of the coding block, and establish a first candidate block list of the coding block based on the at least one reference block; traverse the first candidate block list and calculate a block vector between the coding block and each reference block in the first candidate block list; calculate, based on the block vectors, the rate-distortion cost corresponding to each block vector; and use the block vector corresponding to the minimum rate-distortion cost as the best block vector of the coding block.
  • the motion estimation unit is specifically configured to perform motion estimation of the affine motion model on the coding block through the IBCAffine coding mode, based on the motion vector prediction values of the at least two control points, to obtain a first coding parameter of the coding block; wherein the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation of the affine motion model.
  • the motion estimation unit is further configured to perform motion estimation of the translational motion model on the coding block based on the IBC coding mode to obtain a second coding parameter of the coding block; wherein the second coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation of the translational motion model;
  • the prediction unit is further configured to perform predictive coding on the coding block based on the second coding parameter.
  • the motion estimation unit is further configured to directly use the optimal block vector as the third coding parameter corresponding to the coding block if the optimal block vector is not equal to 0;
  • the prediction unit is further configured to perform predictive coding on the coding block based on the third coding parameter.
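  • The selection logic described for the judging unit and the motion estimation unit above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name, the mode labels, and the default value of `preset_multiple` are hypothetical, and "block vector equal to 0" is assumed to mean both components are zero.

```python
from typing import Tuple


def select_coding_mode(best_bv: Tuple[int, int],
                       affine_result: float,
                       ibc_result: float,
                       preset_multiple: float = 1.05) -> str:
    """Choose how the coding block is predicted.

    best_bv         -- best block vector found by the hash-based search
    affine_result   -- first motion estimation result (IBCAffine mode)
    ibc_result      -- second motion estimation result (IBC mode)
    preset_multiple -- hypothetical value; the text only calls it "preset"
    """
    # A non-zero best block vector is used directly (third coding parameter).
    if best_bv != (0, 0):
        return "BEST_BV"
    # Best block vector is zero: compare the two motion estimation results.
    # (A real encoder would only compute them on this branch.)
    if affine_result <= preset_multiple * ibc_result:
        return "IBC_AFFINE"
    return "IBC"
```

  • For example, `select_coding_mode((0, 0), 100.0, 100.0)` selects the IBCAffine mode, because the first result does not exceed the preset multiple of the second.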
  • the motion estimation unit is specifically configured to calculate the prediction value of at least one pixel of the coding block in the corresponding reference frame based on the motion vector prediction values of the at least two control points; perform an iterative operation on the matching error of the at least one pixel of the coding block between the original frame and the reference frame and on the gradient matrix of the prediction value, and update the motion vector according to the iterative operation; obtain the updated motion vector when the number of iterations reaches a preset threshold; and obtain the first coding parameter of the coding block based on the updated motion vector.
  • the motion estimation unit is further configured to establish a second candidate block list of the coding block for the coding block of the intra prediction type, wherein the reference blocks in the second candidate block list are spatially adjacent to the coding block and are coded using the IBCAffine coding mode; traverse the second candidate block list and, from the motion vectors of at least two control points of each reference block in the list, calculate the motion vectors of the control points at the corresponding positions of the coding block; and obtain a fourth coding parameter corresponding to the coding block from these motion vectors, wherein the fourth coding parameter is used to indicate the set of coding parameters with the minimum rate-distortion cost among the motion vectors obtained for the coding block;
  • the prediction unit is further configured to perform predictive coding on the coding block based on the fourth coding parameter.
  • an embodiment of the present application provides an encoding prediction apparatus, the encoding prediction apparatus includes: a memory and a processor; wherein,
  • the memory is used to store a computer program that can run on the processor
  • the processor is configured to execute the steps of the method according to any one of the first aspects when running the computer program.
  • an embodiment of the present application provides a computer storage medium that stores a coding prediction program which, when executed by at least one processor, implements the steps of the method according to any one of the first aspect.
  • Embodiments of the present application provide an encoding prediction method, device, and computer storage medium.
  • motion vector prediction values of at least two control points associated with the coding block are determined; then motion estimation of the affine motion model is performed on the coding block based on the motion vector prediction values of the at least two control points to obtain a first coding parameter corresponding to the coding block, where the first coding parameter is used to indicate the set of coding parameters with the minimum rate-distortion cost obtained by motion estimation of the coding block in a non-translational motion mode; finally, the coding block is predictively coded based on the first coding parameter. Because motion estimation with the affine motion model is added, non-translational motion such as scaling, rotation, and deformation of intra-prediction-type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby lowers the coding bit rate.
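  • To illustrate how two control points can describe non-translational motion, the sketch below evaluates a motion vector at any position inside a block. The text here does not spell out the model, so this assumes the widely used four-parameter affine model with control points at the top-left and top-right corners of the block; the function and argument names are hypothetical.

```python
from typing import Tuple

Vec = Tuple[float, float]


def affine_mv(mv0: Vec, mv1: Vec, width: int, x: float, y: float) -> Vec:
    """Motion vector at position (x, y) relative to the block's top-left corner.

    mv0   -- motion vector of the top-left control point, at (0, 0)
    mv1   -- motion vector of the top-right control point, at (width, 0)
    Assumption: four-parameter affine model (translation, rotation, zoom).
    """
    a = (mv1[0] - mv0[0]) / width  # horizontal change of the x component
    b = (mv1[1] - mv0[1]) / width  # horizontal change of the y component
    mv_x = mv0[0] + a * x - b * y
    mv_y = mv0[1] + b * x + a * y
    return (mv_x, mv_y)
```

  • When both control points carry the same motion vector, the model reduces to pure translation, which corresponds to the ordinary IBC block vector.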
  • FIG. 1 is a schematic structural diagram of an IBC encoding mode provided by related technical solutions
  • FIG. 2 is a schematic flowchart of establishing a MV candidate list provided by related technical solutions
  • FIG. 3 is a schematic flowchart of the Non-Merge mode in an IBC encoding mode provided by the related technical solutions;
  • FIG. 4 is a schematic structural diagram of an adjacent block configuration in a Merge mode in an IBC coding mode provided by a related technical solution
  • FIG. 5 is a schematic flowchart of the Merge mode in an IBC encoding mode provided by the related technical solutions
  • FIG. 6 is a schematic flowchart of a coding prediction method provided by an embodiment of the present application.
  • FIG. 7(a) and 7(b) are schematic structural diagrams of a non-translation movement of an encoding block provided by an embodiment of the present application.
  • FIG. 8(a) is a schematic diagram of motion vectors of at least two control points associated with an encoding block according to an embodiment of the present application
  • FIG. 8(b) is a schematic diagram of motion vector samples of each sub-block in an encoding block provided by an embodiment of the present application;
  • FIG. 9(a) is a schematic structural diagram of constructing a candidate list based on five adjacent blocks according to an embodiment of the present application.
  • FIG. 9(b) is a schematic structural diagram of deriving a motion vector corresponding to a coding block based on a neighboring block (A) according to an embodiment of the present application;
  • FIG. 10 is a detailed flowchart of a coding prediction method provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a coding prediction device according to an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of another encoding prediction device provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a specific hardware structure of an encoding prediction apparatus provided by an embodiment of the present application.
  • JVET Joint Video Experts Team
  • VCEG Video Coding Experts Group
  • MPEG Moving Picture Experts Group
  • JEM Joint Exploration Model
  • VVC Versatile Video Coding
  • JVET established the algorithm description and encoding method of VVC working draft 2 and VTM2 at its 11th meeting.
  • JVET added many new tools to it, for example: a mixed tree structure (quadtree with multi-type tree, MTT, composed of quadtree (QT), binary tree (BT), and ternary tree (TT) structures), affine motion compensation, sub-block-based temporal motion vector prediction (SbTMVP), adaptive motion vector resolution (AMVR), etc.
  • MTT mixed tree structure (quadtree with multi-type tree), composed of quadtree (QT), binary tree (BT), and ternary tree (TT) structures
  • affine motion compensation; sub-block-based temporal motion vector prediction
  • SbTMVP Sub-Block-Based Temporal Motion Vector Prediction
  • AMVR Adaptive Motion Vector Resolution
  • The basic principle of video coding and compression is to exploit correlations in the spatial domain, in the temporal domain, and between codewords to remove as much redundancy as possible.
  • The currently popular approach is a block-based hybrid video coding framework, which achieves video compression through steps such as prediction (including intra prediction and inter prediction), transformation, quantization, and entropy coding.
  • This coding framework has proven durable, and HEVC still uses this block-based hybrid video coding framework.
  • The SCC coding standard was developed as an extension of HEVC, and its standardization work was largely completed in 2016.
  • IBC Intra Block Copy
  • PHT Palette Mode
  • ACT Adaptive Color Transform
  • AMVR Adaptive Motion Vector Resolution
  • The IBC coding mode is a method similar to motion compensation: a reference block matching the current coding block is found in the current frame, and its displacement is represented by a block vector (BV).
  • BV block vector
  • FIG. 1 shows a schematic structural diagram of an IBC encoding mode provided by the related technical solutions.
  • The slash-filled area is the search area (that is, the already encoded area of the current image frame), and the black shaded blocks are, respectively, the current coding block (Current) and the best prediction block (Best Block Predictor) matching it.
  • the distance between the current coding block and the best prediction block matching it is called the block vector (Block Vector, BV).
  • an intra block copy (Intra Block Copy (IBC)) technology is proposed for screen content coding.
  • the basic idea is similar to the traditional inter-frame motion estimation.
  • A reference block that matches the block to be coded is searched for in the already coded area of the current frame to obtain the distance between the two blocks. This distance is called a block vector; the prediction residual is then obtained based on the block vector, and finally the block to be coded is encoded.
  • The encoding of screen content mainly adopts the IBC coding mode, which is also called the Current Picture Referencing (CPR) mode.
  • The IBC coding mode can be divided into a first sub-coding mode (such as the IBC Non-Merge mode) and a second sub-coding mode (such as the IBC Merge mode), and both sub-coding modes are applicable to coding blocks no larger than 16×16.
  • the IBC encoding mode will be described in detail through these two encoding modes.
  • the first sub-coding mode uses the IBC Non-Merge mode as an example.
  • the coding mode includes two search methods: Hash-based Search and Inter-Frame Search.
  • Referring to FIG. 2, it shows a schematic flowchart of establishing an MV candidate list provided by related technical solutions.
  • the process may include:
  • Two motion vector predictors (MV Predictor, MVP) can be obtained. Once the two MVPs are available, the search can start.
  • The first search method used is Hash-based Search, which is intended to speed up the search over the entire image.
  • Hash Key matching between the current coding block (Curblock) and the reference block (Refblock) is performed with 4×4 blocks as the basic unit; here, the selection of the reference block can extend over the entire allowable range of the image.
  • FIG. 3 shows a schematic flow chart of the Non-Merge mode in an IBC coding mode provided by the related technical solutions; the flow may include:
  • S302 Establish a reference block candidate list based on Hash-based Search; wherein, each reference block in the reference block candidate list has a Hash Key matching relationship with the encoding block;
  • To establish the reference block candidate list, a 4×4 coding block is first used as the basic unit over the entire current image, and a mapping table between coding blocks and Hash Keys is built, indexed by the position of each coding block; Hash Key matching is then performed between the coding block and the reference blocks, and a reference block is considered to match the coding block if and only if their Hash Key values are equal; the reference block candidate list is constructed from these matching reference blocks.
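  • A minimal sketch of the hash-based candidate list follows. It is an illustration under stated assumptions: the image is a list of pixel rows, the pixel tuple of a 4×4 block stands in for the Hash Key (a real encoder would use a compact hash value), and all function names are hypothetical.

```python
from collections import defaultdict

BLOCK = 4  # basic unit for Hash Key matching, as in the text


def block_key(img, x, y):
    """Stand-in Hash Key: the pixel contents of the 4x4 block at (x, y)."""
    return tuple(tuple(row[x:x + BLOCK]) for row in img[y:y + BLOCK])


def build_hash_table(img, coded_positions):
    """Map each Hash Key to the positions of already coded blocks having it."""
    table = defaultdict(list)
    for (x, y) in coded_positions:
        table[block_key(img, x, y)].append((x, y))
    return table


def reference_candidates(img, table, cur):
    """Reference blocks whose Hash Key equals that of the current block."""
    return list(table.get(block_key(img, cur[0], cur[1]), []))
```

  • Only blocks with an equal key enter the list, which mirrors the "if and only if the Hash Key values are equal" condition.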
  • RDcost is calculated based on BV and MVP; that is to say, there is a correspondence between RDcost and BV and MVP.
  • When the reference block candidate list is traversed, there is a BV between the coding block and each reference block in the list; then, with each of the two MVPs used as the search starting point, the rate-distortion cost (Rate-Distortion Cost, RDcost) corresponding to each BV is calculated, which yields a set of RDcost values; the minimum RDcost is selected from this set, and the coding parameter corresponding to this minimum RDcost is retained as the first parameter.
  • In step S305, when the traversal has not ended, the process returns to step S303 and continues; when the traversal ends, step S306 is executed. Since the first parameter corresponding to the minimum RDcost is retained, after step S307 the coding block can be motion compensated according to the first parameter to determine the prediction residual of the coding block, so that the coding block is predictively encoded.
  • the first parameter may include not only the first BV and the first MVP index (MVP index), but also the first result (cost), which is not specifically limited in the embodiment of the present application.
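  • The traversal that yields the first parameter can be sketched as below. The cost model is an assumption made for illustration: RDcost is taken as distortion plus a weight times a crude bit estimate of the difference between BV and MVP; `bv_bits`, `lam`, and the `distortion` callback are hypothetical names.

```python
def bv_bits(bv, mvp):
    """Hypothetical rate proxy: magnitude of the difference between BV and MVP."""
    return abs(bv[0] - mvp[0]) + abs(bv[1] - mvp[1])


def best_first_parameter(cur, refs, mvps, distortion, lam=1.0):
    """Return (cost, BV, MVP index) with the minimum RDcost, or None.

    cur        -- (x, y) of the current coding block
    refs       -- positions of the reference blocks in the candidate list
    mvps       -- the two motion vector predictors
    distortion -- callable giving the distortion of predicting from a reference
    """
    best = None
    for ref in refs:
        bv = (ref[0] - cur[0], ref[1] - cur[1])  # block vector to this reference
        for idx, mvp in enumerate(mvps):         # each MVP as the starting point
            cost = distortion(ref) + lam * bv_bits(bv, mvp)
            if best is None or cost < best[0]:
                best = (cost, bv, idx)
    return best
```

  • The returned triple corresponds to the first result (cost), the first BV, and the first MVP index mentioned above.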
  • In step S306, if the first BV corresponding to the minimum RDcost is equal to 0, or the Hash-based Search of step S302 does not find a reference block that matches the coding block, the Pattern Search method can also be executed.
  • This search method uses the current image as a P frame, sets it to the inter prediction type, and places the current image at the end of REF_PIC_LIST_0 as the reference frame for inter prediction. Still referring to FIG. 3, the search process may include:
  • S310 Create a BV candidate list of the coding block based on the neighboring blocks of the coding block;
  • S311 Create a BV array of the encoding block and a corresponding cost array based on the BV candidate list;
  • S312 Traverse the BV candidate list, calculate the SAD corresponding to each BV, and update the BV array and the cost array in ascending order of SAD costs;
  • S315 Determine whether the second BV is equal to 0;
  • each MVP is used as a search starting point to calculate the RDcost corresponding to the second BV, and the second parameter corresponding to the minimum RDcost is retained; wherein the second parameter includes at least the second BV and the second MVP index;
  • To establish the BV candidate list of the coding block, the neighboring blocks (Neiblock) in the AMVP spatial candidate list of the current coding block are first accessed one by one; if a neighboring block uses the IBC coding mode, the reference block (NeiRefblock) of that neighboring block can be reused as a reference for the coding block, and the BV between the coding block and the NeiRefblock is calculated and added to the BV candidate list of the coding block, thereby constructing the BV candidate list.
  • A BV array and a corresponding cost array can be established, both arranged in ascending order of the sum of absolute differences cost (Sum of Absolute Differences Cost, SAD cost); the BV candidate list is then traversed, and the ordering of the two arrays is updated in real time according to the SAD cost calculated for each BV; after the traversal ends, only the 0th element of each array is returned, that is, the second cost with the smallest SAD cost and the corresponding second BV; then, with each MVP used as the search starting point, the RDcost corresponding to the returned second BV is calculated, and the second parameter corresponding to the smallest RDcost is saved; thus, after step S316, the coding block can also be motion compensated according to the second parameter to determine its prediction residual, so that the coding block is predictively encoded.
  • the second parameter may include not only the second BV and the second MVP index, but
  • the prediction residual can be determined, so that the coding block can be predictively encoded.
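  • The BV array and cost array of the Pattern Search can be sketched as follows. This is an illustrative reading of steps S311 and S312, assuming the image is a list of pixel rows and every candidate BV points inside the image; the helper names are hypothetical.

```python
import bisect


def get_block(img, x, y, size):
    """Extract the size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in img[y:y + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def pattern_search(img, cur, size, bv_candidates):
    """Return (second BV, second cost): the entry with the smallest SAD cost."""
    bv_array, cost_array = [], []
    cur_block = get_block(img, cur[0], cur[1], size)
    for bv in bv_candidates:
        ref_block = get_block(img, cur[0] + bv[0], cur[1] + bv[1], size)
        cost = sad(cur_block, ref_block)
        # Keep both arrays in ascending order of SAD cost while traversing.
        pos = bisect.bisect_right(cost_array, cost)
        cost_array.insert(pos, cost)
        bv_array.insert(pos, bv)
    if not bv_array:
        return None
    return bv_array[0], cost_array[0]  # only the 0th elements are returned
```

  • Keeping the arrays sorted during the traversal means the best candidate is always at index 0 once the loop ends, as the description states.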
  • the encoder only needs to transmit the corresponding MVP index, and the difference between BV and MVP (Motion Vector Difference, MVD).
  • The decoder first constructs the MV candidate list in the same way, and the BV can be obtained from the transmitted MVP index and MVD, so that the predictively encoded coding block can be decoded to obtain the predicted pixel values; adding the prediction residual transmitted by the encoder then yields, at the decoder, the reconstructed pixel value of each pixel in the coding block.
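  • The signalling in the Non-Merge mode can be sketched as a round trip between encoder and decoder. The function names are hypothetical, and the blocks are modelled as lists of pixel rows for illustration only.

```python
def encode_bv(bv, mvp_list, mvp_idx):
    """Encoder side: transmit the MVP index and MVD = BV - MVP."""
    mvp = mvp_list[mvp_idx]
    mvd = (bv[0] - mvp[0], bv[1] - mvp[1])
    return mvp_idx, mvd


def decode_bv(mvp_list, mvp_idx, mvd):
    """Decoder side: rebuild BV = MVP + MVD from the same candidate list."""
    mvp = mvp_list[mvp_idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


def reconstruct(predicted, residual):
    """Reconstructed pixels = predicted pixels + transmitted prediction residual."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]
```

  • Because both sides build the MV candidate list in the same way, the index and the difference are enough to recover the block vector exactly.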
  • the second sub-encoding mode uses the IBC Merge mode as an example.
  • This encoding mode also uses the current image as the P frame and sets it as the inter prediction type, and uses the current image as the reference frame during inter prediction.
  • FIG. 4 shows a schematic structural diagram of a configuration of adjacent blocks in the Merge mode of the IBC coding mode provided by the related technical solution. According to the priority order A1->B1->B0->A0->B2 shown in FIG. 4, the MV candidate list of the coding block can be constructed.
  • FIG. 5 shows a schematic flowchart of the Merge mode in an IBC encoding mode provided by the related technical solution; the process may include:
  • S502 Traverse the MV candidate list, directly use the MVP in the MV candidate list as the BV of the coding block, and calculate the RDcost corresponding to each BV;
  • the spatial candidate MV is put into the MV candidate list, so that the MV candidate list corresponding to the coding block can be constructed.
  • Since BV and MVP are the same, there is no MVD as described in the first sub-coding mode. On the encoder side, the encoding end therefore only needs to transmit the MVP index. On the decoder side, the decoding end first constructs the MV candidate list in the same manner, and then directly obtains the BV according to the transmitted MVP index, so that the predictively encoded coding block can be decoded.
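The Merge-mode flow can be sketched as follows. The priority order comes from the text; the function names, the dictionary of neighbour MVs, the pruning of duplicate candidates, and the `rd_cost` callback are assumptions made for illustration only.

```python
PRIORITY = ("A1", "B1", "B0", "A0", "B2")

def build_merge_candidates(neighbour_mvs):
    """Collect available spatial neighbour MVs in priority order.

    neighbour_mvs maps a position name to an MV tuple, or None if that
    neighbour is unavailable. Dropping duplicates is an assumption.
    """
    candidates = []
    for name in PRIORITY:
        mv = neighbour_mvs.get(name)
        if mv is not None and mv not in candidates:
            candidates.append(mv)
    return candidates

def choose_merge_index(candidates, rd_cost):
    """Use each MVP directly as the BV and keep the one with the lowest cost.

    Only the returned index needs to be signalled, since BV == MVP.
    Raises ValueError if the candidate list is empty.
    """
    index = min(range(len(candidates)), key=lambda i: rd_cost(candidates[i]))
    return index, candidates[index]
```

A decoder that rebuilds the same list can recover the BV as `candidates[index]` with no further information.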
  • the reference block matching the coding block is still obtained based on the translational motion model, so this IBC coding mode achieves higher coding efficiency only for blocks that appear repeatedly in the screen content;
  • there are also complex motions such as zooming, rotation, and deformation, which leaves room for further optimization of the coding.
  • An embodiment of the present application provides a coding prediction method.
  • an affine motion model is added.
  • the number of coding bits can be further reduced, thereby saving coding bit rate.
  • the coding prediction method provided by the embodiment of the present application can be applied not only to the intra prediction type on the encoder side, but also to the intra prediction type on the decoder side; that is, the embodiment of the present application can be applied to both a coding system and a decoding system, which is not specifically limited in the embodiments of the present application.
  • Referring to FIG. 6, which shows an example of the flow of a coding prediction method provided by an embodiment of the present application.
  • the method may include:
  • S601 For a coding block of an image intra prediction type, determine a motion vector prediction value of at least two control points associated with the coding block;
  • S602 Perform motion estimation of an affine motion model on the coding block based on the motion vector prediction values of the at least two control points to obtain a first coding parameter of the coding block; wherein the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation in a non-translational motion mode;
  • S603 Perform predictive coding on the coding block based on the first coding parameter.
  • Motion estimation of the affine motion model on the coding block can yield multiple sets of coding parameters. The rate-distortion cost is calculated for each set of coding parameters; the minimum rate-distortion cost is then selected from these costs, and the set of coding parameters corresponding to the minimum rate-distortion cost is used as the first coding parameter.
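Selecting the first coding parameter amounts to a minimum search over rate-distortion costs. The sketch below assumes the common Lagrangian form J = D + λ·R for the cost; the text does not state the cost formula, and the function names and the `(params, distortion, bits)` tuples are hypothetical.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * bits

def select_first_coding_parameter(candidates, lam):
    """candidates: iterable of (params, distortion, bits).

    Returns the parameter set with the smallest cost and that cost.
    On a tie, the earlier candidate is kept.
    """
    best_params, best_cost = None, float("inf")
    for params, distortion, bits in candidates:
        cost = rd_cost(distortion, bits, lam)
        if cost < best_cost:
            best_params, best_cost = params, cost
    return best_params, best_cost
```

A larger λ penalizes bits more heavily, so the chosen parameter set can change with the operating point.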
  • the coding block is the coding block to be coded in the current image, and the intra prediction type is set.
  • the motion vector prediction values of at least two control points associated with the coding block are first determined; motion estimation of the affine motion model is then performed on the coding block based on the motion vector prediction values of the at least two control points to obtain the first coding parameter, which is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation in a non-translational motion mode; finally, the first coding parameter is used to predictively encode the coding block. Because motion estimation of the affine motion model is added, complex non-translational motion such as scaling, rotation, and deformation of intra prediction type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby saves coding bit rate.
  • inter prediction is based on motion compensation of the coding block and usually uses a translational motion model, that is, it assumes that the motion vectors (Motion Vector, MV) of all pixels in a coding block are the same, generally represented by the block vector of the top-left vertex of the coding block; however, in practical applications, the coding block may also undergo non-translational motion such as zooming, rotation, and deformation, as shown in FIG. 7(a) and FIG. 7(b), respectively.
  • the scaling motion of the coding block; where ρ is the scaling factor, and the value of ρ is set according to the actual situation, which is not specifically limited in this embodiment of the present application.
  • the affine motion model is motion compensated by the MV of the control point.
  • control points such as two control points and three control points;
  • the encoding block can use the MV of the control point to find the corresponding reference block (or called mapping block) in the reference frame.
  • FIG. 8(a) shows a schematic diagram of motion vectors of at least two control points associated with a coding block provided by an embodiment of the present application. When a coding block uses two control points, that is, the motion vectors of the upper-left and upper-right vertices shown in FIG. 8(a), the model is also known as the four-parameter affine model; when the coding block uses three control points, that is, the motion vectors of the upper-left, upper-right, and lower-left vertices shown in FIG. 8(a), the model is also known as the six-parameter affine model.
  • if the coding block uses two control points, the motion vector of each 4×4 sub-block in the coding block can be derived by equation (1);
  • if the coding block uses three control points, the motion vector of each 4×4 sub-block in the coding block can be derived by equation (2);
  • where w and h denote the width and height of the coding block, respectively, and the remaining symbols denote the motion vector of the top-left vertex, the motion vector of the top-right vertex, the motion vector of the bottom-left vertex, and the motion vector corresponding to each pixel (x, y) in the coding block.
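Equations (1) and (2) are not reproduced in this text. Assuming they take the standard four-parameter and six-parameter affine form used for affine motion compensation in block-based video coding, they would read as below. The symbols v_0 = (v_{0x}, v_{0y}), v_1 = (v_{1x}, v_{1y}), and v_2 = (v_{2x}, v_{2y}) for the top-left, top-right, and bottom-left control-point motion vectors, and (v_x, v_y) for the motion vector at pixel (x, y), are notation chosen here for illustration.

```latex
% Assumed form of equation (1): four-parameter model, two control points
\[
\begin{cases}
v_x = \dfrac{v_{1x}-v_{0x}}{w}\,x - \dfrac{v_{1y}-v_{0y}}{w}\,y + v_{0x} \\[2mm]
v_y = \dfrac{v_{1y}-v_{0y}}{w}\,x + \dfrac{v_{1x}-v_{0x}}{w}\,y + v_{0y}
\end{cases}
\]

% Assumed form of equation (2): six-parameter model, three control points
\[
\begin{cases}
v_x = \dfrac{v_{1x}-v_{0x}}{w}\,x + \dfrac{v_{2x}-v_{0x}}{h}\,y + v_{0x} \\[2mm]
v_y = \dfrac{v_{1y}-v_{0y}}{w}\,x + \dfrac{v_{2y}-v_{0y}}{h}\,y + v_{0y}
\end{cases}
\]
```

In both forms, setting all control-point motion vectors equal reduces the model to pure translation.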
  • FIG. 8(b) shows a schematic diagram of motion vector samples of each sub-block in a coding block provided by an embodiment of the present application. For the motion vector samples of each sub-block shown in FIG. 8(b), a motion compensation interpolation filter is applied first, and the result is then combined with the motion vector derived from equation (1) or equation (2) to complete the coding prediction for each sub-block. Affine motion compensation can therefore better describe complex motion.
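Per-sub-block motion vector derivation can be sketched as follows. The sketch assumes that equations (1) and (2) take the standard four- and six-parameter affine form and that each 4×4 sub-block is sampled at its centre; both are assumptions, and the function names are hypothetical. Interpolation filtering is not modelled.

```python
def affine_mv(x, y, w, h, v0, v1, v2=None):
    """MV at position (x, y) inside a w x h block from control-point MVs.

    v0, v1, v2 are the top-left, top-right, and bottom-left control-point
    MVs; v2=None selects the four-parameter model.
    """
    ax = (v1[0] - v0[0]) / w
    ay = (v1[1] - v0[1]) / w
    if v2 is None:
        bx, by = -ay, ax          # four-parameter: rotation/zoom only
    else:
        bx = (v2[0] - v0[0]) / h  # six-parameter: independent vertical terms
        by = (v2[1] - v0[1]) / h
    return (ax * x + bx * y + v0[0], ay * x + by * y + v0[1])

def subblock_mvs(w, h, v0, v1, v2=None, sub=4):
    """Map each sub-block's top-left corner to the MV sampled at its centre."""
    mvs = {}
    for sy in range(0, h, sub):
        for sx in range(0, w, sub):
            cx, cy = sx + sub / 2, sy + sub / 2
            mvs[(sx, sy)] = affine_mv(cx, cy, w, h, v0, v1, v2)
    return mvs
```

For an 8×8 block this yields four sub-block motion vectors, one per 4×4 sub-block.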
  • the embodiments of the present application apply the affine motion model to the coding block based on the intra prediction type, which can further reduce the number of coding bits, thereby saving coding bit rate.
  • the coding mode corresponding to the coding block includes at least the intra-block copy (IBC) coding mode and the affine motion model-based intra-block copy (IBCAffine) coding mode; that is, the coding block can select either the IBC coding mode or the IBCAffine coding mode for predictive coding, which is not specifically limited in the embodiments of the present application.
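  • before determining, for the coding block of the image intra prediction type, the motion vector prediction values of at least two control points associated with the coding block, the method further includes:
  • obtaining the best block vector of the coding block;
  • when the best block vector is equal to 0, calculating the first motion estimation result of the coding block based on the IBCAffine coding mode and the second motion estimation result of the coding block based on the IBC coding mode, respectively;
  • if the first motion estimation result is not greater than the preset multiple of the second motion estimation result, the coding block selects the IBCAffine coding mode;
  • if the first motion estimation result is greater than the preset multiple of the second motion estimation result, the coding block selects the IBC coding mode.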
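  • obtaining the best block vector of the coding block includes:
  • performing an intra-block copy search on the coding block, selecting at least one reference block that matches the hash key value of the coding block, and establishing a first candidate block list of the coding block based on the at least one reference block;
  • traversing the first candidate block list and calculating the block vector between the coding block and each reference block in the first candidate block list;
  • based on the block vectors, calculating the rate-distortion cost corresponding to each block vector, and taking the block vector corresponding to the minimum rate-distortion cost as the best block vector of the coding block.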
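A hash-matched search for the best block vector can be sketched as below. Everything beyond the overall idea is an assumption made for illustration: CRC32 over 8-bit samples stands in for the hash key, the searchable area is simplified to blocks fully above the current block or fully to its left in the same row band, and `cost_fn` is a hypothetical stand-in for the rate-distortion cost.

```python
import zlib

def block_hash(frame, x, y, w, h):
    """Hash key of a block; CRC32 over its 8-bit samples (assumed hash)."""
    data = bytes(v for row in frame[y:y + h] for v in row[x:x + w])
    return zlib.crc32(data)

def first_candidate_list(frame, x, y, w, h):
    """Positions of reference blocks whose hash key matches the coding block."""
    target = block_hash(frame, x, y, w, h)
    width = len(frame[0])
    candidates = []
    for ry in range(0, y + 1):
        for rx in range(0, width - w + 1):
            above = ry + h <= y               # fully above the current block
            left = ry == y and rx + w <= x    # same rows, fully to the left
            if (above or left) and block_hash(frame, rx, ry, w, h) == target:
                candidates.append((rx, ry))
    return candidates

def best_block_vector(frame, x, y, w, h, cost_fn):
    """Block vector (dx, dy) with the lowest cost, or None if no match."""
    bvs = [(rx - x, ry - y)
           for rx, ry in first_candidate_list(frame, x, y, w, h)]
    return min(bvs, key=cost_fn) if bvs else None
```

A matching hash does not by itself guarantee identical samples for larger blocks, so a real encoder would still evaluate the actual cost of each candidate.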
  • the preset multiple is a predetermined judgment value for measuring whether the coding block adopts the IBCAffine coding mode.
  • the preset multiple may be set to 1.05 times; however, in actual applications, the preset multiple is specifically set according to actual conditions, and the embodiment of the present application is not specifically limited.
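The mode decision that uses the preset multiple can be sketched as follows. Treating the two motion estimation results as costs where lower is better, representing the block vector as an `(x, y)` tuple, and the mode labels are assumptions made for illustration.

```python
def select_coding_mode(best_bv, affine_result, ibc_result, preset_multiple=1.05):
    """Choose the coding mode for a coding block.

    best_bv:        best block vector from the hash-based IBC search
    affine_result:  first motion estimation result (IBCAffine mode)
    ibc_result:     second motion estimation result (IBC mode)
    """
    if best_bv != (0, 0):
        # A non-zero best BV is used directly; affine estimation is skipped.
        return "BEST_BV"
    if affine_result <= preset_multiple * ibc_result:
        return "IBC_AFFINE"
    return "IBC"
```

With the example value of 1.05, the IBCAffine mode is still chosen when its result is up to 5% larger than the IBC result.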
  • after the coding block determines the optimal coding mode (that is, the coding mode currently selected by the coding block), the Flag value of the optimal coding mode is set to True, so that the coding mode adopted by the coding block is indicated in the coding system and predictive coding is performed according to that coding mode.
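  • when the coding block selects the IBCAffine coding mode, performing motion estimation of the affine motion model on the coding block based on the motion vector prediction values of the at least two control points to obtain the first coding parameter of the coding block includes:
  • based on the motion vector prediction values of the at least two control points, performing motion estimation of the affine motion model on the coding block through the IBCAffine coding mode to obtain the first coding parameter of the coding block; wherein the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation of the affine motion model.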
  • when the coding block selects the IBC coding mode, the method further includes:
  • the method further includes:
  • the Flag value of the optimal coding mode can be set to True, thereby indicating the coding mode used by the coding block.
  • the IBC coding mode can be divided into the first sub-coding mode (such as IBC Non-Merge mode) and the second sub-coding mode (such as IBC Merge mode); the IBCAffine coding mode can likewise be divided into the third sub-coding mode (such as IBCAffine Non-Merge mode) and the fourth sub-coding mode (such as IBCAffine Merge mode). The first and second sub-coding modes have been described in detail above; the third and fourth sub-coding modes are described in detail below.
  • the search process is similar to that of the first sub-coding mode
  • Each reference block in the first candidate block list has a hash key matching relationship with the coding block; combining the two preset MVPs obtained in FIG.
  • the first candidate block list is traversed, and with the two preset MVPs as the search starting points, the rate-distortion cost corresponding to each block vector is calculated; according to the calculated rate-distortion costs, the block vector corresponding to the minimum rate-distortion cost is taken as the best block vector of the coding block. When the best BV is not equal to 0, it means that the hash values at any position in the coding block are all equal, and the motion estimation of the IBCAffine coding mode is no longer performed.
  • the best block vector can be directly used as the coding
  • performing motion estimation of the affine motion model on the coding block based on the motion vector prediction values of the at least two control points to obtain the first coding parameter of the coding block includes:
  • the first encoding parameter of the encoding block is obtained.
  • the current image also needs to be regarded as a P frame, the prediction mode is set to the inter prediction mode, and the current image is placed at the end of REF_PIC_LIST_0 as a reference frame during inter prediction.
  • the coding block obtains two candidate MVPs through the inter-frame affine motion model-based advanced motion vector prediction technology (Affine Advanced Motion Vector Prediction, Affine AMVP), and each candidate MVP contains at least two control points; then, with each candidate MVP as the search starting point, the affine motion search is performed.
  • Δx_i and Δy_i represent the motion vector of the coding block, which does not change linearly, but is determined by the four parameters (a, b, ω_0, ω_1) in the following formula (4),
  • a and b respectively represent the translational components of the pixel coordinates in the reference frame after the rotation of the coding block; ω_0 and ω_1 represent the parameters in the rotation matrix after the coding block undergoes the scaling transformation with coefficient ρ; the transpose notation denotes conversion from a row vector to a column vector.
  • G_xi and G_yi represent gradient values, which are calculated as Sobel gradients.
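The Sobel gradient computation can be sketched as below, assuming the usual 3×3 Sobel kernels without normalization and leaving border pixels at zero; the text specifies neither detail, and the function name is hypothetical.

```python
def sobel_gradients(img):
    """Return (gx, gy), the horizontal and vertical Sobel gradients of a
    2-D image given as a list of rows. Border pixels are left at 0."""
    h, w = len(img), len(img[0])
    gx = [[0] * w for _ in range(h)]
    gy = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Right column minus left column, centre row weighted by 2.
            gx[y][x] = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                        - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Bottom row minus top row, centre column weighted by 2.
            gy[y][x] = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                        - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
    return gx, gy
```

In the affine search, these gradients would be evaluated on the predicted samples in the reference frame.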
  • the prediction error of any pixel (x_i, y_i) in the coding block can be calculated by the following equation (6),
  • term 0 represents the prediction error of any pixel (x_i, y_i) in the coding block
  • term 2 represents the gradient matrix of the predicted value of the pixel at the corresponding position in the reference frame.
  • the iterative operation is used to update the motion vector; when the number of iterations reaches the preset number-of-times threshold, the prediction error is at its minimum, and the resulting motion vector is the final updated motion vector.
  • the preset number of times threshold is a preset number of iterations required to measure the minimum prediction error.
  • for the four-parameter affine model, the preset number-of-times threshold may be 5; for the six-parameter affine model, the preset number-of-times threshold may be 4; in practical applications, the preset number-of-times threshold may be set according to the actual situation, which is not specifically limited in the embodiments of the present application.
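The fixed-count refinement loop can be sketched as follows. The thresholds are the example values from the text; the `update_step` callback, which would solve for a parameter increment from the matching error and the gradient matrix, is hypothetical and left abstract here.

```python
# Example thresholds from the text: affine parameter count -> iterations.
ITERATION_THRESHOLD = {4: 5, 6: 4}

def refine_motion(params, update_step, num_affine_params):
    """Apply update_step until the preset number of iterations is reached
    and return the updated motion parameters."""
    for _ in range(ITERATION_THRESHOLD[num_affine_params]):
        params = update_step(params)
    return params
```

As a toy usage, `refine_motion(16.0, lambda p: p / 2, 4)` applies the halving step five times.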
  • the method further includes:
  • FIG. 9(a) shows a schematic structural diagram of a candidate list based on five neighboring blocks provided by an embodiment of the present application. The five adjacent blocks A, B, C, D, and E shown in FIG. 9(a) are accessed in the order A->B->C->D->E; when an adjacent block uses the IBCAffine coding mode and its reference frame is the last image in the sequence REF_PIC_LIST_0 (that is, the current image), the adjacent block is put into the second candidate block list as a reference block. The second candidate block list is then traversed, and the derivation is selected according to the number of control points: if the number of control points is 2, formula (1) above is used to derive the MVs of the control points at the corresponding positions of the current coding block; if the number of control points is 3, formula (2) above is used to derive the MVs of the control points at the corresponding positions of the current coding block. For details, refer
  • the coding parameter can predictively encode the coding block. That is to say, on the encoder side, the encoding end needs to transmit the MVP index and prediction residual of the second candidate block list to the decoding end; while on the decoder side, the decoding end can establish a candidate block list consistent with the encoding end, according to The MVP index transmitted in the code stream is calculated by Equation (1) or Equation (2) to obtain the MV of the current encoding block, so that the encoding block after predictive encoding can be decoded.
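Deriving the current block's control-point MVs from an IBCAffine-coded neighbour can be sketched as below. The sketch assumes that formulas (1) and (2) take the standard four- and six-parameter affine form and that blocks are described by hypothetical dictionaries holding position, size, and control-point MVs (`cpmvs`).

```python
def neighbour_mv_at(px, py, nb):
    """Evaluate the neighbour's affine model at absolute position (px, py)."""
    v0, v1 = nb["cpmvs"][0], nb["cpmvs"][1]
    dx, dy = px - nb["x"], py - nb["y"]
    ax = (v1[0] - v0[0]) / nb["w"]
    ay = (v1[1] - v0[1]) / nb["w"]
    if len(nb["cpmvs"]) == 2:          # two control points: assumed formula (1)
        bx, by = -ay, ax
    else:                              # three control points: assumed formula (2)
        v2 = nb["cpmvs"][2]
        bx = (v2[0] - v0[0]) / nb["h"]
        by = (v2[1] - v0[1]) / nb["h"]
    return (ax * dx + bx * dy + v0[0], ay * dx + by * dy + v0[1])

def inherit_control_points(cur, nb):
    """Control-point MVs of the current block, taken at its own corners."""
    corners = [(cur["x"], cur["y"]), (cur["x"] + cur["w"], cur["y"])]
    if len(nb["cpmvs"]) == 3:
        corners.append((cur["x"], cur["y"] + cur["h"]))
    return [neighbour_mv_at(px, py, nb) for px, py in corners]
```

Because the decoder can rebuild the same candidate list, signalling the index of the chosen neighbour is enough to reproduce these MVs.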
  • the above embodiment provides an encoding prediction method.
  • For a coding block of an image intra prediction type, the motion vector prediction values of at least two control points associated with the coding block are first determined; motion estimation of the affine motion model is then performed on the coding block based on the motion vector prediction values of the at least two control points to obtain a first coding parameter of the coding block, where the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation in a non-translational motion mode; finally, the coding block is predictively coded based on the first coding parameter. Because motion estimation of the affine motion model is added, non-translational motion such as scaling, rotation, and deformation of intra prediction type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby saves coding bit rate.
  • FIG. 10 shows a detailed flow of an encoding prediction method provided by an embodiment of the present application.
  • the detailed flow may include:
  • S1001 Perform an IBC search on the coding block to establish a first candidate block list of the coding block;
  • S1002 Traverse the first candidate block list and calculate a block vector between the coding block and each reference block in the first candidate list;
  • S1003 Calculate the rate-distortion costs corresponding to different block vectors based on the block vectors respectively; use the block vector corresponding to the minimum rate-distortion cost among the rate-distortion costs as the best block vector;
  • the coding block selects the IBC coding mode to obtain the second coding parameter of the coding block;
  • step S1011 will be executed.
  • the coding prediction method provided in the above embodiments can be applied not only to the intra prediction type on the encoder side, but also to the intra prediction type on the decoder side. That is to say, the embodiments of the present application can be applied to both an encoding system and a decoding system, but the embodiments of the present application are not specifically limited.
  • FIG. 11 shows the composition of an encoding prediction apparatus 110 provided by an embodiment of the present application, it may include: a determination unit 1101, a motion estimation unit 1102, and a prediction unit 1103; wherein,
  • the determining unit 1101 is configured to determine a motion vector prediction value of at least two control points associated with the coding block for the coding block of the image intra prediction type;
  • the motion estimation unit 1102 is configured to perform affine motion model motion estimation on the coding block based on the motion vector prediction values of the at least two control points, to obtain the first coding parameter of the coding block; wherein, The first encoding parameter is used to indicate a set of encoding parameters with the smallest rate-distortion cost obtained by motion estimation of the encoding block through a non-translation motion mode;
  • the prediction unit 1103 is configured to perform predictive coding on the coding block based on the first coding parameter.
  • the encoding prediction device 110 further includes an obtaining unit 1104 and a judging unit 1105, where,
  • the obtaining unit 1104 is configured to obtain the best block vector of the coding block
  • the judging unit 1105 is configured to: when the best block vector is equal to 0, calculate the first motion estimation result of the coding block based on the IBCAffine coding mode and the second motion estimation result of the coding block based on the IBC coding mode, respectively; if the first motion estimation result is not greater than the preset multiple of the second motion estimation result, the coding block selects the IBCAffine coding mode; and if the first motion estimation result is greater than the preset multiple of the second motion estimation result, the coding block selects the IBC coding mode.
  • the obtaining unit 1104 is specifically configured to perform an intra-block copy search on the coding block, select at least one reference block matching the hash key value of the coding block, and establish a first candidate block list of the coding block based on the at least one reference block; traverse the first candidate block list and calculate the block vector between the coding block and each reference block in the first candidate block list; and, based on the block vectors, calculate the rate-distortion cost corresponding to each block vector and take the block vector corresponding to the minimum rate-distortion cost as the best block vector of the coding block.
  • the motion estimation unit 1102 is specifically configured to perform motion estimation of the affine motion model on the coding block through the IBCAffine coding mode based on the motion vector prediction values of the at least two control points, to obtain a first coding parameter of the coding block; wherein the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation of the affine motion model.
  • the motion estimation unit 1102 is further configured to perform motion estimation of the translational motion model on the coding block based on the IBC coding mode to obtain a second coding parameter of the coding block; wherein the second coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation of the translational motion model;
  • the prediction unit 1103 is further configured to perform predictive coding on the coding block based on the second coding parameter.
  • the motion estimation unit 1102 is further configured to directly use the optimal block vector as the third coding parameter corresponding to the coding block if the optimal block vector is not equal to 0;
  • the prediction unit 1103 is further configured to perform predictive coding on the coding block based on the third coding parameter.
  • the motion estimation unit 1102 is specifically configured to calculate the prediction value of at least one pixel of the coding block in the corresponding reference frame based on the motion vector prediction values of the at least two control points; perform an iterative operation on the matching error of the at least one pixel of the coding block between the original frame and the reference frame and the gradient matrix of the prediction value, and update the motion vector according to the iterative operation; obtain an updated motion vector when the number of iterations reaches the preset number-of-times threshold; and obtain the first coding parameter of the coding block based on the updated motion vector.
  • the motion estimation unit 1102 is further configured to establish a second candidate block list of the coding block for the coding block of the image intra prediction type, wherein each reference block in the second candidate block list is spatially adjacent to the coding block and is encoded using the IBCAffine coding mode; traverse the second candidate block list and, according to the motion vectors of at least two control points of each reference block in the second candidate block list, calculate the motion vectors of the control points at the corresponding positions of the coding block; and obtain, from these motion vectors, a fourth coding parameter corresponding to the coding block, wherein the fourth coding parameter is used to indicate the set of coding parameters with the minimum rate-distortion cost among the motion vectors obtained for the coding block;
  • the prediction unit 1103 is further configured to perform predictive coding on the coding block based on the fourth coding parameter.
  • the “unit” may be a part of a circuit, a part of a processor, a part of a program or software, etc. Of course, it may also be a module or non-modular. Moreover, each component in this embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or software function modules.
  • if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of this embodiment in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor perform all or part of the steps of the method described in this embodiment.
  • the foregoing storage media include various media that can store program codes, such as a USB flash drive, a mobile hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
  • this embodiment provides a computer storage medium that stores an encoding prediction program, and when the encoding prediction program is executed by at least one processor, the steps of the method described in the foregoing technical solution shown in FIG. 6 are implemented.
  • FIG. 13 shows a specific hardware structure example of the coding prediction device 110 provided by the embodiment of the present application, which may include: a network interface 1301, a memory 1302, and a processor 1303; the components are coupled together via the bus system 1304. Understandably, the bus system 1304 is used to implement connection and communication between these components. In addition to the data bus, the bus system 1304 also includes a power bus, a control bus, and a status signal bus. However, for clarity, the various buses are all marked as the bus system 1304 in FIG. 13. The network interface 1301 is used to receive and send signals in the process of exchanging information with other external network elements;
  • the memory 1302 is used to store a computer program that can run on the processor 1303;
  • the processor 1303 is configured to execute:
  • the memory 1302 in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
  • the volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache.
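  • many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM).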
  • the processor 1303 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 1303 or an instruction in the form of software.
  • the processor 1303 may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA), or another programmable logic device, discrete gate or transistor logic device, or discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied and executed by a hardware decoding processor, or may be executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a mature storage medium in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, and registers.
  • the storage medium is located in the memory 1302, and the processor 1303 reads the information in the memory 1302 and completes the steps of the above method in combination with its hardware.
  • the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof.
  • the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units used to perform the functions described in this application, or a combination thereof.
  • the techniques described herein may be implemented through modules (eg, procedures, functions, etc.) that perform the functions described herein.
  • the software codes can be stored in memory and executed by the processor.
  • the memory may be implemented in the processor or external to the processor.
  • the processor 1303 is further configured to execute the steps of the method in the foregoing technical solution shown in FIG. 6 when the computer program is run.
  • for a coding block of an image intra prediction type, the motion vector prediction values of at least two control points associated with the coding block are first determined; then, motion estimation of the affine motion model is performed on the coding block based on the motion vector prediction values of the at least two control points to obtain a first coding parameter corresponding to the coding block, where the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation in a non-translational motion mode; finally, the coding block is predictively coded based on the first coding parameter. Because motion estimation of the affine motion model is added, non-translational motion such as scaling, rotation, and deformation of intra prediction type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby saves coding bit rate.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

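The embodiments of the present application disclose a coding prediction method, an apparatus, and a computer storage medium. The method includes: for a coding block of an image intra prediction type, determining motion vector prediction values of at least two control points associated with the coding block; performing motion estimation of an affine motion model on the coding block based on the motion vector prediction values of the at least two control points to obtain a first coding parameter of the coding block, where the first coding parameter is used to indicate a set of coding parameters with the minimum rate-distortion cost obtained by the coding block through motion estimation in a non-translational motion mode; and performing predictive coding on the coding block based on the first coding parameter.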

Description

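Coding Prediction Method, Apparatus, and Computer Storage Medium
Technical Field
The embodiments of the present application relate to the technical field of video coding and decoding, and in particular to a coding prediction method, an apparatus, and a computer storage medium.
Background
With the rapid development of science and technology, people's requirements for watching and processing video keep rising, and the demand for screen content video (Screen Content Video, SCV) in particular is growing. High Efficiency Video Coding (HEVC) was designed for high-resolution, high-frame-rate video and brings only limited efficiency gains for typical screen content coding. Therefore, in view of the high contrast, limited color data, and many repeated regions that characterize screen content, an extended coding standard, Screen Content Coding (SCC), has been proposed on the basis of HEVC.
In most coding standards, adaptive inter/intra prediction is used on a block basis. For example, in the SCC coding standard, the basic block unit used for video coding is called a coding unit (Coding Unit, CU), and the pixels in a CU share the same coding parameters to improve coding efficiency. A recently proposed coding technique is affine motion estimation and compensation, which can effectively track more complex motion such as rotation, scaling, and deformation of moving objects; it is currently applied mainly to screen content coding of the inter prediction type. For the intra prediction type, existing solutions generally use the intra block copy (Intra Block Copy, IBC) coding mode, which considers only a two-dimensional (Two-Dimensional, 2D) translational motion model, leaving room for further coding optimization.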
发明内容
有鉴于此,本申请实施例期望提供一种编码预测方法、装置及计算机存储介质,通过增加仿射运动模型可以进一步降低编码比特数,从而节省了编码码率。
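The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, an embodiment of the present application provides a coding prediction method, the method including:
for a coding block of an intra prediction type in a picture, determining motion vector predictors of at least two control points associated with the coding block;
performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block;
performing predictive coding on the coding block based on the first coding parameter.
In the above solution, the coding modes available to the coding block include at least an Intra Block Copy (IBC) coding mode and an affine-motion-model-based intra block copy (IBCAffine) coding mode, and before determining, for the coding block of the intra prediction type in the picture, the motion vector predictors of the at least two control points associated with the coding block, the method further includes:
obtaining a best block vector of the coding block;
when the best block vector is equal to 0, separately calculating a first motion estimation result of the coding block based on the IBCAffine coding mode and a second motion estimation result of the coding block based on the IBC coding mode;
if the first motion estimation result is not greater than a preset multiple of the second motion estimation result, selecting the IBCAffine coding mode for the coding block;
if the first motion estimation result is greater than the preset multiple of the second motion estimation result, selecting the IBC coding mode for the coding block.
In the above solution, obtaining the best block vector of the coding block includes:
performing an intra block copy search on the coding block, selecting at least one reference block whose hash key matches that of the coding block, and building a first candidate block list of the coding block based on the at least one reference block;
traversing the first candidate block list and calculating the block vector between the coding block and each reference block in the first candidate block list;
calculating, based on the block vectors, the rate-distortion cost corresponding to each block vector, and taking the block vector with the minimum rate-distortion cost as the best block vector of the coding block.
In the above solution, when the coding block selects the IBCAffine coding mode, performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain the first coding parameter of the coding block includes:
performing, based on the motion vector predictors of the at least two control points, the affine-motion-model motion estimation on the coding block in the IBCAffine coding mode to obtain the first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the affine-motion-model motion estimation of the coding block.
In the above solution, when the coding block selects the IBC coding mode, the method further includes:
performing translational-motion-model motion estimation on the coding block in the IBC coding mode to obtain a second coding parameter of the coding block, where the second coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the translational-motion-model motion estimation of the coding block;
performing predictive coding on the coding block based on the second coding parameter.
In the above solution, after obtaining the best block vector of the coding block, the method further includes:
if the best block vector is not equal to 0, directly taking the best block vector as a third coding parameter of the coding block;
performing predictive coding on the coding block based on the third coding parameter.
In the above solution, performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain the first coding parameter of the coding block includes:
calculating, based on the motion vector predictors of the at least two control points, the predicted value of at least one pixel of the coding block in the corresponding reference frame;
performing an iterative operation on the matching error of the at least one pixel of the coding block between the original frame and the reference frame and on the gradient matrix of the predicted value, and updating the motion vector according to the iterative operation;
when the number of iterations reaches a preset count threshold, obtaining the updated motion vector;
obtaining the first coding parameter of the coding block based on the updated motion vector.
In the above solution, for the coding block of the intra prediction type in the picture, the method further includes:
building a second candidate block list of the coding block, where each reference block in the second candidate block list is spatially adjacent to the coding block and is coded in the IBCAffine coding mode;
traversing the second candidate block list and, from the motion vectors of the at least two control points of each reference block in the second candidate block list, separately calculating the motion vectors of the control points at the corresponding positions of the coding block;
obtaining, from the motion vectors, a fourth coding parameter of the coding block, where the fourth coding parameter indicates the set of coding parameters with the minimum rate-distortion cost among the motion vectors obtained for the coding block;
performing predictive coding on the coding block based on the fourth coding parameter.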
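In a second aspect, an embodiment of the present application provides a coding prediction apparatus, the coding prediction apparatus including a determination unit, a motion estimation unit, and a prediction unit, where
the determination unit is configured to determine, for a coding block of an intra prediction type in a picture, motion vector predictors of at least two control points associated with the coding block;
the motion estimation unit is configured to perform affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block;
the prediction unit is configured to perform predictive coding on the coding block based on the first coding parameter.
In the above solution, the coding prediction apparatus further includes an obtaining unit and a judging unit, where
the obtaining unit is configured to obtain a best block vector of the coding block;
the judging unit is configured to: when the best block vector is equal to 0, separately calculate a first motion estimation result of the coding block based on the IBCAffine coding mode and a second motion estimation result of the coding block based on the IBC coding mode; if the first motion estimation result is not greater than a preset multiple of the second motion estimation result, select the IBCAffine coding mode for the coding block; and if the first motion estimation result is greater than the preset multiple of the second motion estimation result, select the IBC coding mode for the coding block.
In the above solution, the obtaining unit is specifically configured to: perform an intra block copy search on the coding block, select at least one reference block whose hash key matches that of the coding block, and build a first candidate block list of the coding block based on the at least one reference block; traverse the first candidate block list and calculate the block vector between the coding block and each reference block in the first candidate block list; calculate, based on the block vectors, the rate-distortion cost corresponding to each block vector; and take the block vector with the minimum rate-distortion cost as the best block vector of the coding block.
In the above solution, the motion estimation unit is specifically configured to perform, based on the motion vector predictors of the at least two control points, the affine-motion-model motion estimation on the coding block in the IBCAffine coding mode to obtain the first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the affine-motion-model motion estimation of the coding block.
In the above solution, the motion estimation unit is further configured to perform translational-motion-model motion estimation on the coding block in the IBC coding mode to obtain a second coding parameter of the coding block, where the second coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the translational-motion-model motion estimation of the coding block;
the prediction unit is further configured to perform predictive coding on the coding block based on the second coding parameter.
In the above solution, the motion estimation unit is further configured to, if the best block vector is not equal to 0, directly take the best block vector as a third coding parameter corresponding to the coding block;
the prediction unit is further configured to perform predictive coding on the coding block based on the third coding parameter.
In the above solution, the motion estimation unit is specifically configured to: calculate, based on the motion vector predictors of the at least two control points, the predicted value of at least one pixel of the coding block in the corresponding reference frame; perform an iterative operation on the matching error of the at least one pixel of the coding block between the original frame and the reference frame and on the gradient matrix of the predicted value, and update the motion vector according to the iterative operation; when the number of iterations reaches a preset count threshold, obtain the updated motion vector; and obtain the first coding parameter of the coding block based on the updated motion vector.
In the above solution, the motion estimation unit is further configured to: for the coding block of the intra prediction type in the picture, build a second candidate block list of the coding block, where each reference block in the second candidate block list is spatially adjacent to the coding block and is coded in the IBCAffine coding mode; traverse the second candidate block list and, from the motion vectors of the at least two control points of each reference block in the second candidate block list, separately calculate the motion vectors of the control points at the corresponding positions of the coding block; and obtain, from the motion vectors, a fourth coding parameter corresponding to the coding block, where the fourth coding parameter indicates the set of coding parameters with the minimum rate-distortion cost among the motion vectors obtained for the coding block;
the prediction unit is further configured to perform predictive coding on the coding block based on the fourth coding parameter.
In a third aspect, an embodiment of the present application provides a coding prediction apparatus, the coding prediction apparatus including a memory and a processor, where
the memory is configured to store a computer program executable on the processor;
the processor is configured to perform, when running the computer program, the steps of the method according to any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer storage medium storing a coding prediction program which, when executed by at least one processor, implements the steps of the method according to any implementation of the first aspect.
The embodiments of the present application provide a coding prediction method and apparatus, and a computer storage medium. For a coding block of an intra prediction type in a picture, motion vector predictors of at least two control points associated with the coding block are first determined; affine-motion-model motion estimation is then performed on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter corresponding to the coding block, the first coding parameter indicating the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block; finally, predictive coding is performed on the coding block based on the first coding parameter. Because affine-motion-model motion estimation is added, non-translational motion such as scaling, rotation, and deformation of intra-prediction-type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby saves coding bitrate.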
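Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an IBC coding mode provided by a related technical solution;
FIG. 2 is a schematic flowchart of building an MV candidate list provided by a related technical solution;
FIG. 3 is a schematic flowchart of the Non-Merge mode in the IBC coding mode provided by a related technical solution;
FIG. 4 is a schematic structural diagram of the neighboring-block configuration of the Merge mode in the IBC coding mode provided by a related technical solution;
FIG. 5 is a schematic flowchart of the Merge mode in the IBC coding mode provided by a related technical solution;
FIG. 6 is a schematic flowchart of a coding prediction method provided by an embodiment of the present application;
FIG. 7(a) and FIG. 7(b) are schematic structural diagrams of non-translational motion of a coding block provided by an embodiment of the present application;
FIG. 8(a) is a schematic diagram of the motion vectors of at least two control points associated with a coding block provided by an embodiment of the present application;
FIG. 8(b) is a schematic diagram of motion vector samples of each sub-block in a coding block provided by an embodiment of the present application;
FIG. 9(a) is a schematic structural diagram of building a candidate list based on five neighboring blocks provided by an embodiment of the present application;
FIG. 9(b) is a schematic structural diagram of deriving the motion vectors at the corresponding positions of a coding block based on neighboring block (A) provided by an embodiment of the present application;
FIG. 10 is a detailed schematic flowchart of a coding prediction method provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of the composition of a coding prediction apparatus provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of the composition of another coding prediction apparatus provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of a specific hardware structure of a coding prediction apparatus provided by an embodiment of the present application.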
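Detailed Description
For a more thorough understanding of the features and technical content of the embodiments of the present application, their implementation is described in detail below with reference to the accompanying drawings, which are for reference and illustration only and are not intended to limit the embodiments of the present application.
The Joint Video Exploration Team (JVET) is a working group established in October 2015 by the Video Coding Experts Group (VCEG) of ITU-T and the Moving Picture Experts Group (MPEG) of ISO/IEC, with the task of developing the next-generation video coding standard. The Joint Exploration Test Model (JEM) is the common reference software platform on which different coding tools are verified. In April 2018, JVET officially named the next-generation video coding standard Versatile Video Coding (VVC), with VTM as the corresponding test model. In July 2018, at its 11th meeting, JVET established the algorithm description and encoding method of VVC Working Draft 2 and VTM2. On top of the original HEVC Test Model (HM), JVET added many new tools, for example: a mixed tree structure (quadtree with nested multi-type tree, MTT, composed of Quad Tree (QT), Binary Tree (BT), and Ternary Tree (TT) structures), affine motion compensation, Sub-Block Based Temporal Motion Vector Prediction (SbTMVP), and Adaptive Motion Vector Resolution (AMVR).
The basic principle of video compression coding is to exploit the correlation in the spatial domain, in the temporal domain, and between codewords to remove as much redundancy as possible. The currently prevailing approach is the block-based hybrid video coding framework, which achieves compression through prediction (including intra prediction and inter prediction), transform, quantization, and entropy coding. This framework has proven very durable, and HEVC still uses it.
Based on the characteristics of screen content, the SCC coding standard was developed as an extension of HEVC, and its standardization was essentially completed in 2016. The SCC standard adds coding tools such as Intra Block Copy (IBC), Palette Mode (PLT), Adaptive Color Transform (ACT), and Adaptive Motion Vector Resolution (AMVR) to improve coding efficiency. When SCC performs intra prediction, it may use the IBC coding mode in addition to Conventional Intra Prediction (CIP). The IBC coding mode is a method similar to motion compensation: a reference block matching the current coding block is found within the current frame and is represented by a Block Vector (BV). The IBC coding mode is described in detail below with reference to FIG. 1.
Referring to FIG. 1, a schematic structural diagram of an IBC coding mode provided by a related technical solution is shown. As shown in FIG. 1, the slash-filled region is the search region (that is, the already coded region of the current picture), and the black shaded blocks are the current coding block (Current CU) and its best matching prediction block (Best Block Predictor). The displacement from the current coding block to its best matching prediction block is called the Block Vector (BV). In the existing HEVC extension, the HEVC-SCC coding standard, the Intra Block Copy (IBC) technique was proposed for screen content coding. Its basic idea is similar to conventional inter motion estimation: the already coded region of the current frame is searched for a reference block matching the block to be coded, the displacement between the two blocks (the block vector) is obtained, the prediction residual is then obtained based on the block vector, and finally the block to be coded is encoded.
In the scheme adopted by the latest VVC reference model BMS2.1, screen content is mainly coded with the IBC coding mode, which is also called the Current Picture Referencing (CPR) mode. Here, the IBC coding mode can be divided into a first sub-coding mode (for example the IBC Non-Merge mode) and a second sub-coding mode (for example the IBC Merge mode), and both sub-coding modes apply to coding blocks of size 16×16 or smaller. The IBC coding mode is described in detail below through these two sub-coding modes.
Taking the IBC Non-Merge mode as an example of the first sub-coding mode, this coding mode contains two search methods: the hash-table-based search (Hash-based Search) and the inter-style search (Pattern Search).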
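Before the search, a Motion Vector (MV) candidate list first needs to be built according to the inter Advanced Motion Vector Prediction (AMVP) technique; the process is shown in FIG. 2. Referring to FIG. 2, a schematic flowchart of building an MV candidate list provided by a related technical solution is shown. The flow may include:
S201: obtain the spatial candidate list;
S202: obtain the temporal candidate list;
S203: select 2 candidate MVs from the spatial candidate list;
S204: select 1 candidate MV from the temporal candidate list;
S205: merge identical candidate MVs among the selected candidate MVs;
S206: add the (0, 0) candidate MV to form a preliminary candidate MV list;
S207: keep the first two MVs in the preliminary candidate MV list;
S208: form the MV candidate list from the two retained MVs.
It should be noted that two Motion Vector Predictors (MVPs) can be obtained from the MV candidate list produced in step S208. Once the 2 MVPs are available, the search can begin.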
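Steps S203 to S208 can be sketched as follows. This is a minimal illustration, not the reference software: the function name `build_mv_candidate_list` and the tuple representation of an MV are assumptions, and the zero MV is padded in until two entries exist, which is one reading of S206.

```python
from typing import List, Tuple

MV = Tuple[int, int]  # assumed representation of a motion vector: (x, y)


def build_mv_candidate_list(spatial: List[MV], temporal: List[MV]) -> List[MV]:
    """Sketch of S203-S208: select, merge duplicates, pad with (0, 0), keep two."""
    picked = spatial[:2] + temporal[:1]   # S203 and S204
    merged: List[MV] = []
    for mv in picked:                     # S205: merge identical candidates
        if mv not in merged:
            merged.append(mv)
    while len(merged) < 2:                # S206: add the zero MV (assumed: pad to two)
        merged.append((0, 0))
    return merged[:2]                     # S207 and S208: keep the first two
```

The two returned entries play the role of the two MVPs used as search starting points in the steps that follow.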
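The search method used first is the Hash-based Search, which is intended to speed up the search over the whole picture. Hash key matching between the current coding block (Curblock) and a reference block (Refblock) is performed with 4×4 blocks as the basic unit; here, reference blocks may be selected from all allowed size ranges across the whole picture. Referring to FIG. 3, a schematic flowchart of the Non-Merge mode in the IBC coding mode provided by a related technical solution is shown. The flow may include:
S301: obtain 2 MVPs based on the MV candidate list;
S302: build a reference block candidate list based on the Hash-based Search, where each reference block in the reference block candidate list has a hash key match with the coding block;
S303: traverse the reference block candidate list and calculate the BV from the coding block to each reference block;
S304: using each of the 2 MVPs in turn as the search starting point, calculate the RDcost corresponding to each BV;
S305: judge whether the traversal of the reference block candidate list has finished;
S306: after the traversal has finished, return and retain the first parameter corresponding to the minimum RDcost, where the first parameter includes at least a first BV and a first MVP index;
S307: judge whether the first BV is equal to 0;
S308: when the first BV is not equal to 0, perform predictive coding on the coding block after motion compensation of the coding block.
It should be noted that "building a reference block candidate list" specifically means the following: first, over the whole current picture and with 4×4 blocks as the basic unit, a mapping table between blocks and hash keys is built, indexed by block position; hash key matching is then performed between the coding block and the reference blocks, and a reference block is considered to match the coding block if and only if the hash key value of the coding block is equal to that of the reference block; the reference block candidate list is built from these matching reference blocks.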
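The mapping table and the matching rule described above can be sketched as follows. This is a toy model under stated assumptions: the picture is a list of rows of integer samples, the "hash key" is simply the tuple of the block's samples (a stand-in for whatever hash the encoder really uses), and the check that a reference block lies in the already coded region is omitted. All function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Picture = List[List[int]]
Pos = Tuple[int, int]


def block_key(pic: Picture, x: int, y: int, size: int = 4) -> tuple:
    """Stand-in hash key: the block's samples in raster order (collision-free)."""
    return tuple(pic[y + j][x + i] for j in range(size) for i in range(size))


def build_hash_table(pic: Picture, size: int = 4) -> Dict[tuple, List[Pos]]:
    """Map each hash key to the positions of all size x size blocks that carry it."""
    table: Dict[tuple, List[Pos]] = defaultdict(list)
    for y in range(len(pic) - size + 1):
        for x in range(len(pic[0]) - size + 1):
            table[block_key(pic, x, y, size)].append((x, y))
    return table


def reference_candidates(pic: Picture, table: Dict[tuple, List[Pos]],
                         cur_x: int, cur_y: int, size: int = 4) -> List[Pos]:
    """Reference blocks whose key equals the current block's key (the block itself excluded)."""
    key = block_key(pic, cur_x, cur_y, size)
    return [pos for pos in table.get(key, []) if pos != (cur_x, cur_y)]
```

For a candidate at position `(rx, ry)` the block vector of S303 would then be `(rx - cur_x, ry - cur_y)`.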
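It should also be noted that the RDcost is calculated from the BV and the MVP; that is, the RDcost corresponds to a particular BV and MVP. When the reference block candidate list is traversed, there is one BV between the coding block and each reference block in the list; then, using each of the 2 MVPs in turn as the search starting point, the Rate-Distortion Cost (RDcost) corresponding to each BV is calculated, which yields a set of RDcost values. The minimum RDcost is selected from this set, and the coding parameters corresponding to that minimum are retained as the first parameter.
In the embodiments of the present application, for step S305, when the traversal has not finished, the flow returns to step S303 and continues; when the traversal has finished, step S306 is performed. Because the first parameter corresponding to the minimum RDcost is retained, after step S307 motion compensation can be performed on the coding block according to the first parameter to determine the prediction residual of the coding block, so that the coding block can be predictively coded. In the embodiments of the present application, the first parameter may include not only the first BV and the first MVP index but also a first result (cost); this is not specifically limited in the embodiments of the present application.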
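The selection in S303 to S306 amounts to a minimum search over (BV, MVP) pairs. The sketch below assumes a generic cost of the form distortion plus λ times rate; the rate proxy `mvd_bits`, the `distortion` callback, and the weight `lam` are hypothetical placeholders, since the text does not specify how the RDcost is computed.

```python
from typing import Callable, List, Optional, Tuple

Vec = Tuple[int, int]


def mvd_bits(bv: Vec, mvp: Vec) -> int:
    """Hypothetical rate proxy: size of the BV-MVP difference plus one bit for the MVP index."""
    return abs(bv[0] - mvp[0]) + abs(bv[1] - mvp[1]) + 1


def best_bv(bvs: List[Vec], mvps: List[Vec],
            distortion: Callable[[Vec], float],
            lam: float = 1.0) -> Optional[Tuple[float, Vec, int]]:
    """Return (cost, BV, MVP index) with the minimum cost, or None if there is no BV."""
    best = None
    for bv in bvs:                          # S303: one BV per reference block
        d = distortion(bv)
        for idx, mvp in enumerate(mvps):    # S304: each MVP as the starting point
            cost = d + lam * mvd_bits(bv, mvp)
            if best is None or cost < best[0]:
                best = (cost, bv, idx)      # S306: retain the minimum
    return best
```

The returned triple corresponds to the "first parameter" (first cost, first BV, first MVP index) described above.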
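Further, after step S306, if the first BV corresponding to the minimum RDcost is equal to 0, or if the Hash-based Search of step S302 finds no reference block matching the coding block, the Pattern Search method can additionally be performed. This search treats the current picture as a P frame, sets the prediction type to inter prediction, and places the current picture at the end of REF_PIC_LIST_0 as the reference frame for inter prediction. Still referring to FIG. 3, the search process may include:
S309: when the first BV is equal to 0, perform motion estimation based on the Pattern Search;
S310: build a BV candidate list of the coding block based on the neighboring blocks of the coding block;
S311: based on the BV candidate list, build a BV array of the coding block and a corresponding cost array;
S312: traverse the BV candidate list, calculate the SAD cost of each BV, and update the BV array and the cost array so that they are in ascending order of SAD cost;
S313: judge whether the traversal of the BV candidate list has finished;
S314: after the traversal has finished, select and return the second BV and the second cost corresponding to the minimum SAD cost;
S315: judge whether the second BV is equal to 0;
S316: when the second BV is not equal to 0, calculate the RDcost of the second BV using each MVP in turn as the search starting point, and return and retain the second parameter corresponding to the minimum RDcost, where the second parameter includes at least the second BV and a second MVP index;
S317: when the second BV is equal to 0, end the flow.
It should be noted that "building the BV candidate list of the coding block" specifically means the following: the neighboring blocks (Neiblock) in the AMVP spatial candidate list of the current coding block are visited in turn; if a neighboring block uses the IBC coding mode, its reference block (NeiRefblock) can be reused as a reference for the coding block; the BV between the coding block and NeiRefblock is then calculated and added to the BV candidate list of the coding block, which builds up the list.
It should also be noted that a BV array and a corresponding cost array can be built from the BV candidate list, both kept in ascending order of the Sum of Absolute Differences cost (SAD cost). While the BV candidate list is traversed, the ordering of the two arrays is updated on the fly according to the SAD cost of each newly calculated BV. After the traversal has finished, only element 0 of the two arrays is returned, namely the second cost and the second BV with the minimum SAD cost. The RDcost of the returned second BV is then calculated using each MVP in turn as the search starting point, and the second parameter corresponding to the minimum RDcost is saved. In this way, after step S316, motion compensation can likewise be performed on the coding block according to the second parameter to determine the prediction residual of the coding block, so that the coding block can be predictively coded. In the embodiments of the present application, the second parameter may include not only the second BV and the second MVP index but also the second cost; this is not specifically limited in the embodiments of the present application.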
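The two parallel arrays of S311 to S314 can be sketched with a sorted insertion, as below. The helper names and the `fetch_ref` callback (which returns the reference block addressed by a BV) are assumptions made for illustration; blocks are plain lists of rows.

```python
import bisect
from typing import Callable, List, Tuple

Block = List[List[int]]
Vec = Tuple[int, int]


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))


def rank_bvs_by_sad(cur: Block, fetch_ref: Callable[[Vec], Block],
                    bv_candidates: List[Vec]) -> Tuple[List[Vec], List[int]]:
    """Keep a BV array and a cost array in ascending SAD order while traversing (S312)."""
    bvs: List[Vec] = []
    costs: List[int] = []
    for bv in bv_candidates:
        c = sad(cur, fetch_ref(bv))
        pos = bisect.bisect_right(costs, c)   # insertion point that keeps costs sorted
        costs.insert(pos, c)
        bvs.insert(pos, bv)
    return bvs, costs
```

Element 0 of each returned array is then the "second BV" and "second cost" of S314.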
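Based on the best result obtained from the two search methods above, the prediction residual can be determined after motion compensation of the coding block, so that the coding block can be predictively coded. On the encoder side, the encoder only needs to transmit the corresponding MVP index and the difference between the BV and the MVP (Motion Vector Difference, MVD). On the decoder side, the decoder first constructs the MV candidate list in the same way and obtains the BV from the transmitted MVP index and MVD; it can then decode the predictively coded coding block to obtain the predicted pixel values of the coding block, and by adding the prediction residual transmitted by the encoder it obtains the reconstructed pixel value of each pixel in the coding block.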
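The signalling described here reduces to a difference on the encoder side and a sum on the decoder side. A minimal sketch, with hypothetical function names and MVs as integer tuples:

```python
from typing import List, Tuple

Vec = Tuple[int, int]


def encode_bv(bv: Vec, mvp_list: List[Vec], mvp_idx: int) -> Tuple[int, Vec]:
    """Encoder side: signal the MVP index and MVD = BV - MVP."""
    mvp = mvp_list[mvp_idx]
    return mvp_idx, (bv[0] - mvp[0], bv[1] - mvp[1])


def decode_bv(mvp_list: List[Vec], mvp_idx: int, mvd: Vec) -> Vec:
    """Decoder side: rebuild the same MVP list, then BV = MVP + MVD."""
    mvp = mvp_list[mvp_idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

The round trip only works because both sides build an identical candidate list, which is why the text stresses that the decoder constructs the list "in the same way".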
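Taking the IBC Merge mode as an example of the second sub-coding mode, this coding mode likewise treats the current picture as a P frame, sets the prediction type to inter prediction, and uses the current picture as the reference frame for inter prediction. Referring to FIG. 4, a schematic structural diagram of the neighboring-block configuration of the Merge mode in the IBC coding mode provided by a related technical solution is shown. As shown in FIG. 4, the MV candidate list of the coding block can be built following the priority order A1->B1->B0->A0->B2. Referring to FIG. 5, a schematic flowchart of the Merge mode in the IBC coding mode provided by a related technical solution is shown. The flow may include:
S501: build the spatial MV candidate list corresponding to the coding block;
S502: traverse the MV candidate list, take each MVP in the MV candidate list directly as the BV of the coding block, and calculate the RDcost corresponding to each BV;
S503: after the traversal has finished, retain the third parameter corresponding to the minimum of the calculated RDcost values, where the third parameter includes at least a third MVP index;
S504: based on the third parameter, perform predictive coding on the coding block after motion compensation of the coding block.
It should be noted that "building the spatial MV candidate list corresponding to the coding block" specifically means the following: with the configuration shown in FIG. 4 and in the order A1->B1->B0->A0->B2, at most 4 spatial candidate MVs are selected and put into the MV candidate list, which builds the MV candidate list corresponding to the coding block.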
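The list construction can be sketched as follows. The dictionary of neighbors, the use of `None` for an unavailable or non-IBC neighbor, and the pruning of duplicate MVs are assumptions made for illustration; the text itself only fixes the visiting order and the limit of four candidates.

```python
from typing import Dict, List, Optional, Tuple

Vec = Tuple[int, int]
MERGE_ORDER = ("A1", "B1", "B0", "A0", "B2")  # priority order from FIG. 4


def build_merge_list(neighbors: Dict[str, Optional[Vec]], max_cands: int = 4) -> List[Vec]:
    """Collect at most max_cands spatial candidate MVs in the A1->B1->B0->A0->B2 order."""
    cands: List[Vec] = []
    for name in MERGE_ORDER:
        mv = neighbors.get(name)
        if mv is None:            # neighbor unavailable or not IBC-coded (assumed convention)
            continue
        if mv not in cands:       # assumption: identical candidates are pruned
            cands.append(mv)
        if len(cands) == max_cands:
            break
    return cands
```

In S502 each entry of this list would be tried directly as the BV, and only the index of the cheapest entry is signalled.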
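In the second sub-coding mode, because the BV and the MVP are identical, the MVD described for the first sub-coding mode does not exist. Thus, on the encoder side, the encoder only needs to transmit the MVP index. On the decoder side, the decoder first constructs the MV candidate list in the same way and then obtains the BV directly from the transmitted MVP index, so that it can decode the predictively coded coding block.
In the IBC coding mode described above, the reference block matching the coding block is still obtained on the basis of a translational motion model, so this IBC coding mode is efficient only for blocks that recur in the screen content. In screen content coding scenarios, however, complex motion such as scaling, rotation, and deformation also occurs, so the coding still has room for further optimization.
The embodiments of the present application provide a coding prediction method that adds an affine motion model on top of the IBC coding mode; replacing the translational motion model with the affine motion model can further reduce the number of coding bits and thereby save coding bitrate. The coding prediction method provided by the embodiments of the present application can be applied to the intra prediction type not only on the encoder side but also on the decoder side; that is, the embodiments of the present application can be applied to both an encoding system and a decoding system, and this is not specifically limited.
Taking the intra prediction type on the encoder side as an example, the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to FIG. 6, an example flow of a coding prediction method provided by an embodiment of the present application is shown. The method may include:
S601: for a coding block of an intra prediction type in a picture, determine motion vector predictors of at least two control points associated with the coding block;
S602: perform affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block;
S603: perform predictive coding on the coding block based on the first coding parameter.
It should be noted that "performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points" can yield multiple sets of coding parameters; the rate-distortion cost is calculated for each set, the minimum is selected among these rate-distortion costs, and the set of coding parameters corresponding to the minimum rate-distortion cost is taken as the first coding parameter.
In the embodiments of the present application, the coding block is a block to be coded in the current picture and is set to the intra prediction type. Thus, motion vector predictors of at least two control points associated with the coding block are first determined; affine-motion-model motion estimation is then performed on the coding block based on these motion vector predictors to obtain the first coding parameter corresponding to the coding block, which indicates the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block; finally, predictive coding is performed on the coding block based on the first coding parameter. Because affine-motion-model motion estimation is added, complex non-translational motion such as scaling, rotation, and deformation of intra-prediction-type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby saves coding bitrate.
It should be noted that the affine motion model is a newer technique for the inter prediction type. In HEVC, inter prediction is motion compensation based on coding blocks and usually uses a translational motion model; that is, all pixels in a coding block are assumed to share the same Motion Vector (MV), which is generally represented by the vector of the top-left vertex of the coding block. In practice, however, coding blocks also exhibit non-translational motion such as scaling, rotation, and deformation; FIG. 7(a) and FIG. 7(b) show the rotational motion and the scaling motion of a coding block, respectively, where ρ is the scaling coefficient, whose value is set according to the actual situation and is not specifically limited in the embodiments of the present application.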
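Here, the affine motion model performs motion compensation by means of the MVs of control points. The control points can be represented in two ways, namely with two control points or with three control points; the coding block can use the control-point MVs to find its corresponding reference block (also called the mapped block) in the reference frame. Referring to FIG. 8(a), a schematic diagram of the motion vectors of at least two control points associated with a coding block provided by an embodiment of the present application is shown. When the coding block uses two control points, namely the motion vectors of the top-left and top-right vertices (the two vectors shown in FIG. 8(a); symbol images PCTCN2018124504-appb-000001 and PCTCN2018124504-appb-000002), the model is called the four-parameter affine model. When the coding block uses three control points, namely the motion vectors of the top-left, top-right, and bottom-left vertices (the three vectors shown in FIG. 8(a); symbol images PCTCN2018124504-appb-000003 and PCTCN2018124504-appb-000004), the model is called the six-parameter affine model.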
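If the four-parameter affine model is used, the motion vector of each 4×4 sub-block in the coding block can be derived by equation (1):
Figure PCTCN2018124504-appb-000005
If the six-parameter affine model is used, the motion vector of each 4×4 sub-block in the coding block can be derived by equation (2):
Figure PCTCN2018124504-appb-000006
Here, w and h denote the width and height of the coding block, respectively; the motion vector of the top-left vertex is given in image PCTCN2018124504-appb-000007, the motion vector of the top-right vertex in image PCTCN2018124504-appb-000008, the motion vector of the bottom-left vertex in image PCTCN2018124504-appb-000009, and the motion vector corresponding to each pixel (x, y) in the coding block in image PCTCN2018124504-appb-000010.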
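Because equations (1) and (2) survive only as image references in this copy, the following restates the form that the four-parameter and six-parameter affine models conventionally take in VVC-style affine motion compensation. It is offered as an assumption about what the images contain, not as a transcription: the symbols $(v_{0x}, v_{0y})$, $(v_{1x}, v_{1y})$, and $(v_{2x}, v_{2y})$ are assumed names for the top-left, top-right, and bottom-left control-point motion vectors, and $(v_x, v_y)$ for the motion vector at position $(x, y)$ measured from the top-left corner of the block.

```latex
% Four-parameter model (two control points); assumed counterpart of equation (1)
\begin{aligned}
v_x &= \frac{v_{1x}-v_{0x}}{w}\,x \;-\; \frac{v_{1y}-v_{0y}}{w}\,y \;+\; v_{0x},\\
v_y &= \frac{v_{1y}-v_{0y}}{w}\,x \;+\; \frac{v_{1x}-v_{0x}}{w}\,y \;+\; v_{0y}.
\end{aligned}

% Six-parameter model (three control points); assumed counterpart of equation (2)
\begin{aligned}
v_x &= \frac{v_{1x}-v_{0x}}{w}\,x \;+\; \frac{v_{2x}-v_{0x}}{h}\,y \;+\; v_{0x},\\
v_y &= \frac{v_{1y}-v_{0y}}{w}\,x \;+\; \frac{v_{2y}-v_{0y}}{h}\,y \;+\; v_{0y}.
\end{aligned}
```

In both forms, substituting $(x, y) = (0, 0)$ returns the top-left control-point vector and $(x, y) = (w, 0)$ returns the top-right one; the six-parameter form additionally returns the bottom-left vector at $(0, h)$.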
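Referring to FIG. 8(b), a schematic diagram of motion vector samples of each sub-block in a coding block provided by an embodiment of the present application is shown. For the motion vector sample of each sub-block shown in FIG. 8(b), a motion compensation interpolation filter is first applied, and then, combined with the motion vector derived by equation (1) or equation (2), coding prediction of each sub-block can be completed. Affine motion compensation can therefore describe complex motion better; the embodiments of the present application apply the affine motion model to coding blocks of the intra prediction type, which can further reduce the number of coding bits and thereby save coding bitrate.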
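The per-sub-block derivation can be sketched as below. This is an illustration under assumptions, not the patent's implementation: it uses the conventional four-parameter and six-parameter affine forms, evaluates the model at the center of each 4×4 sub-block (a common convention that the text does not state), and uses floating-point MVs with no rounding to a fixed MV precision. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

Vec = Tuple[float, float]


def affine_mv(x: float, y: float, w: int, h: int,
              v0: Vec, v1: Vec, v2: Optional[Vec] = None) -> Vec:
    """MV at (x, y), measured from the block's top-left corner.

    v0, v1, v2 are the top-left, top-right, and bottom-left control-point MVs;
    passing v2=None selects the four-parameter model.
    """
    ax = (v1[0] - v0[0]) / w
    ay = (v1[1] - v0[1]) / w
    if v2 is None:                 # four-parameter: rotation/zoom tied to the top edge
        bx, by = -ay, ax
    else:                          # six-parameter: independent vertical gradient
        bx = (v2[0] - v0[0]) / h
        by = (v2[1] - v0[1]) / h
    return (ax * x + bx * y + v0[0], ay * x + by * y + v0[1])


def subblock_mvs(w: int, h: int, v0: Vec, v1: Vec,
                 v2: Optional[Vec] = None, sub: int = 4) -> List[List[Vec]]:
    """One MV per sub x sub sub-block, sampled at the sub-block center (assumed)."""
    return [[affine_mv(sx + sub / 2, sy + sub / 2, w, h, v0, v1, v2)
             for sx in range(0, w, sub)]
            for sy in range(0, h, sub)]
```

When both control points carry the same vector, every sub-block receives that vector, so the translational IBC case falls out as a special case of the affine model.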
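It can be understood that the coding modes available to the coding block include at least the Intra Block Copy (IBC) coding mode and the affine-motion-model-based intra block copy (IBCAffine) coding mode; that is, the coding block may select the IBC coding mode for predictive coding or the IBCAffine coding mode for predictive coding, and this is not specifically limited in the embodiments of the present application.
In some embodiments, before determining, for the coding block of the intra prediction type in the picture, the motion vector predictors of the at least two control points associated with the coding block, the method further includes:
obtaining a best block vector of the coding block;
when the best block vector is equal to 0, separately calculating a first motion estimation result of the coding block based on the IBCAffine coding mode and a second motion estimation result of the coding block based on the IBC coding mode;
if the first motion estimation result is not greater than a preset multiple of the second motion estimation result, selecting the IBCAffine coding mode for the coding block;
if the first motion estimation result is greater than the preset multiple of the second motion estimation result, selecting the IBC coding mode for the coding block.
Further, in some embodiments, obtaining the best block vector of the coding block includes:
performing an intra block copy search on the coding block, selecting at least one reference block whose hash key matches that of the coding block, and building a first candidate block list of the coding block based on the at least one reference block;
traversing the first candidate block list and calculating the block vector between the coding block and each reference block in the first candidate block list;
calculating, based on the block vectors, the rate-distortion cost corresponding to each block vector, and taking the block vector with the minimum rate-distortion cost as the best block vector of the coding block.
It should be noted that the preset multiple is a predetermined decision value used to judge whether the coding block uses the IBCAffine coding mode. In the embodiments of the present application, the preset multiple may be set to 1.05; in practice, however, the preset multiple is set according to the actual situation and is not specifically limited in the embodiments of the present application.
It should also be noted that, for the two coding modes of the coding block (the IBC coding mode and the IBCAffine coding mode), once the optimal coding mode (that is, the coding mode currently selected by the coding block) has been determined, the Flag value of the optimal coding mode can be set to True; this indicates within the encoding system which coding mode the coding block uses, so that predictive coding is performed according to that mode.
Further, in some embodiments, when the coding block selects the IBCAffine coding mode, performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain the first coding parameter of the coding block includes:
performing, based on the motion vector predictors of the at least two control points, the affine-motion-model motion estimation on the coding block in the IBCAffine coding mode to obtain the first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the affine-motion-model motion estimation of the coding block.
Further, in some embodiments, when the coding block selects the IBC coding mode, the method further includes:
performing translational-motion-model motion estimation on the coding block in the IBC coding mode to obtain a second coding parameter of the coding block, where the second coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the translational-motion-model motion estimation of the coding block;
performing predictive coding on the coding block based on the second coding parameter.
Further, in some embodiments, after obtaining the best block vector of the coding block, the method further includes:
if the best block vector is not equal to 0, directly taking the best block vector as a third coding parameter of the coding block;
performing predictive coding on the coding block based on the third coding parameter.
It should be noted that, for these two coding modes, once the coding block has determined the optimal coding mode (that is, the coding mode currently selected by the coding block), the Flag value of the optimal coding mode can be set to True to indicate which coding mode the coding block uses.
It should also be noted that the IBC coding mode can be divided into the first sub-coding mode (for example the IBC Non-Merge mode) and the second sub-coding mode (for example the IBC Merge mode), and the IBCAffine coding mode can likewise be divided into a third sub-coding mode (for example the IBCAffine Non-Merge mode) and a fourth sub-coding mode (for example the IBCAffine Merge mode). The first and second sub-coding modes have been described above; the third and fourth sub-coding modes are described below.
In the third sub-coding mode, an intra block copy search is first performed on the coding block (the search is similar to that of the first sub-coding mode) to build the first candidate block list of the coding block, in which every reference block has a hash key match with the coding block. Using the two preset MVPs obtained as in FIG. 2, the first candidate block list is traversed and, with each of the two preset MVPs in turn as the search starting point, the rate-distortion cost of each block vector is calculated; the block vector with the minimum rate-distortion cost is taken as the best block vector of the coding block. When the best BV is not equal to 0, this indicates that the hash key values at every position within the coding block are all equal, and motion estimation in the IBCAffine coding mode is not performed; the best block vector is directly taken as the third coding parameter of the coding block, and the coding block is predictively coded according to the third coding parameter. When the best BV is equal to 0, on the one hand, motion estimation based on the IBCAffine coding mode is performed on the coding block (that is, affine-motion-model motion estimation), and the best result obtained (for example the result with the minimum rate-distortion cost) is called the first motion estimation result, denoted IBCAffineCost; on the other hand, motion estimation based on the IBC coding mode is also performed on the coding block (that is, translational-motion-model motion estimation), and the best result obtained (for example the result with the minimum rate-distortion cost) is called the second motion estimation result, denoted IBCCost. Assuming the preset multiple is 1.05: if IBCAffineCost ≤ 1.05 × IBCCost, the coding block selects the IBCAffine coding mode, and the first coding parameter with the minimum rate-distortion cost is obtained (for example a first BV and its corresponding first MVP index); if IBCAffineCost > 1.05 × IBCCost, the coding block selects the IBC coding mode, and the second coding parameter with the minimum rate-distortion cost is obtained (for example a second BV and its corresponding second MVP index). Motion compensation is then performed according to the first or second coding parameter to obtain the prediction residual of the coding block, so that the coding block can be predictively coded.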
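In some embodiments, performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain the first coding parameter of the coding block includes:
calculating, based on the motion vector predictors of the at least two control points, the predicted value of at least one pixel of the coding block in the corresponding reference frame;
performing an iterative operation on the matching error of the at least one pixel of the coding block between the original frame and the reference frame and on the gradient matrix of the predicted value, and updating the motion vector according to the iterative operation;
when the number of iterations reaches a preset count threshold, obtaining the updated motion vector;
obtaining the first coding parameter of the coding block based on the updated motion vector.
It should be noted that in the third sub-coding mode the current picture likewise needs to be treated as a P frame, with the prediction mode set to inter prediction and the current picture placed at the end of REF_PIC_LIST_0 as the reference frame for inter prediction. The coding block first obtains 2 candidate MVPs through the inter affine-motion-model-based advanced motion vector prediction technique (Affine Advanced Motion Vector Prediction, AffineAMVP), each candidate MVP containing at least two control points; an affine motion search is then performed with each candidate MVP as the search starting point.
Specifically, for any pixel $(x_i, y_i)$ in the current coding block, the predicted pixel value $s_i$ at the corresponding position in the reference frame is given by equation (3),
$s_i = s'(x_i + \Delta x_i,\; y_i + \Delta y_i)$      (3)
where $\Delta x_i$ and $\Delta y_i$ denote the motion vector of the coding block, which does not vary linearly but is determined by the four parameters ($a$, $b$, $\omega_0$, $\omega_1$) in equation (4) below,
Figure PCTCN2018124504-appb-000011
where $a$ and $b$ denote the translational components of the pixel coordinates in the reference frame after the coding block undergoes the rotation transform, and $\omega_0$ and $\omega_1$ denote the parameters of the rotation matrix after the coding block undergoes a scaling transform with coefficient ρ;
Figure PCTCN2018124504-appb-000012
denotes
Figure PCTCN2018124504-appb-000013
transposed from a row vector into a column vector.
Applying a first-order Taylor expansion to equation (3) above yields equation (5), as follows,
Figure PCTCN2018124504-appb-000014
where $G_{xi}$ and $G_{yi}$ denote gradient values obtained by Sobel gradient computation. According to equation (5), the prediction error of any pixel $(x_i, y_i)$ in the coding block can be calculated by equation (6) below,
Figure PCTCN2018124504-appb-000015
where term 0 denotes the prediction error of any pixel $(x_i, y_i)$ in the coding block, term 1 denotes the matching error of that pixel between the original picture and the reference frame, and term 2 denotes the gradient matrix of the predicted pixel value at the corresponding position in the reference frame.
In this way, an iterative operation is performed on the gradient of the predicted value and the matching error, and this iterative process updates the motion vector; when the number of iterations reaches the preset count threshold, the prediction error is at its minimum, and the motion vector obtained at that point is the final updated motion vector.
It should also be noted that the preset count threshold is a predetermined number of iterations required for the prediction error to be regarded as minimal. In the embodiments of the present application, the preset count threshold may be 5 for the four-parameter affine model and 4 for the six-parameter affine model; in practice, the preset count threshold can be set according to the actual situation and is not specifically limited in the embodiments of the present application.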
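The iteration described above has the shape of a Gauss-Newton update: linearize the prediction with its gradients, solve a small least-squares problem for the parameter increment, and repeat a fixed number of times. The sketch below illustrates that shape only. Several things in it are assumptions rather than the patent's method: the parameterization Δx = a + ω0·x + ω1·y, Δy = b − ω1·x + ω0·y (equation (4) is not legible in this copy), the synthetic smooth reference picture `ref`, and central differences in place of the Sobel operator.

```python
import math
from typing import List, Sequence, Tuple


def ref(x: float, y: float) -> float:
    """Synthetic smooth reference picture s'(x, y); a hypothetical test signal."""
    return math.sin(0.3 * x) + math.cos(0.25 * y) + 0.02 * x * y


def grad(x: float, y: float, h: float = 1e-3) -> Tuple[float, float]:
    """Central-difference gradients (stand-in for the Sobel gradients in the text)."""
    return ((ref(x + h, y) - ref(x - h, y)) / (2 * h),
            (ref(x, y + h) - ref(x, y - h)) / (2 * h))


def warp(p: Sequence[float], x: float, y: float) -> Tuple[float, float]:
    """Assumed four-parameter model p = (a, b, w0, w1): returns (x + dx, y + dy)."""
    a, b, w0, w1 = p
    return x + a + w0 * x + w1 * y, y + b - w1 * x + w0 * y


def sse(orig: Sequence[float], p: Sequence[float], pts) -> float:
    """Sum of squared matching errors between the original samples and the prediction."""
    return sum((o - ref(*warp(p, x, y))) ** 2 for o, (x, y) in zip(orig, pts))


def solve(A: List[List[float]], rhs: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        sol[r] = (M[r][n] - sum(M[r][k] * sol[k] for k in range(r + 1, n))) / M[r][r]
    return sol


def estimate(orig: Sequence[float], pts, iterations: int = 5) -> List[float]:
    """Fixed-count iterative update of (a, b, w0, w1), starting from zero motion."""
    p = [0.0] * 4
    for _ in range(iterations):
        JtJ = [[0.0] * 4 for _ in range(4)]
        Jte = [0.0] * 4
        for o, (x, y) in zip(orig, pts):
            wx, wy = warp(p, x, y)
            e = o - ref(wx, wy)                                # matching error
            gx, gy = grad(wx, wy)                              # gradient of the prediction
            row = (gx, gy, gx * x + gy * y, gx * y - gy * x)   # d(pred)/d(a, b, w0, w1)
            for r in range(4):
                Jte[r] += row[r] * e
                for c in range(4):
                    JtJ[r][c] += row[r] * row[c]
        p = [pi + di for pi, di in zip(p, solve(JtJ, Jte))]
    return p
```

The default of five iterations mirrors the preset count threshold the text gives for the four-parameter model; a real encoder would start from the candidate MVP rather than from zero motion and would work on integer samples with interpolation.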
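In some embodiments, for the coding block of the intra prediction type in the picture, the method further includes:
building a second candidate block list of the coding block, where each reference block in the second candidate block list is spatially adjacent to the coding block and is coded in the IBCAffine coding mode;
traversing the second candidate block list and, from the motion vectors of the at least two control points of each reference block in the second candidate block list, separately calculating the motion vectors of the control points at the corresponding positions of the coding block;
obtaining, from the motion vectors, a fourth coding parameter corresponding to the coding block, where the fourth coding parameter indicates the set of coding parameters with the minimum rate-distortion cost among the motion vectors obtained for the coding block;
performing predictive coding on the coding block based on the fourth coding parameter.
It should be noted that in the fourth sub-coding mode the second candidate block list of the coding block first needs to be built. Referring to FIG. 9(a), a schematic structural diagram of building a candidate list based on five neighboring blocks provided by an embodiment of the present application is shown. The five neighboring blocks A, B, C, D, and E shown in FIG. 9(a) are visited in the order A->B->C->D->E; when a neighboring block uses the IBCAffine coding mode and its reference frame is the last picture in REF_PIC_LIST_0 (that is, the current picture), that neighboring block is put into the second candidate block list as a reference block. The second candidate block list is then traversed, and a choice is made according to the number of control points: if there are 2 control points, equation (1) above is used to derive the MVs of the control points at the corresponding positions of the current coding block; if there are 3 control points, equation (2) above is used to derive them; see the example shown in FIG. 9(b). From the MVs obtained, the result with the minimum prediction residual (that is, the minimum rate-distortion cost) can be selected and returned; the returned result is the fourth coding parameter, according to which the coding block can be predictively coded. In other words, on the encoder side, the encoder needs to transmit the MVP index into the second candidate block list and the prediction residual to the decoder; on the decoder side, the decoder can build a candidate block list identical to that of the encoder and, from the MVP index transmitted in the bitstream, calculate the MVs of the current coding block by equation (1) or equation (2), so that it can decode the predictively coded coding block.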
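The derivation of the current block's control-point MVs from a neighbor can be sketched as evaluating the neighbor's affine field at the current block's corners. The sketch assumes the conventional four-parameter and six-parameter affine forms for equations (1) and (2); the dictionary layout (`x`, `y`, `w`, `h`, `v0`, `v1`, optional `v2`) and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

Vec = Tuple[float, float]


def eval_affine(nb: Dict, px: float, py: float) -> Vec:
    """Evaluate the neighbor's affine motion field at absolute position (px, py)."""
    x, y = px - nb["x"], py - nb["y"]            # position relative to the neighbor
    v0, v1, v2 = nb["v0"], nb["v1"], nb.get("v2")
    ax, ay = (v1[0] - v0[0]) / nb["w"], (v1[1] - v0[1]) / nb["w"]
    if v2 is None:                               # two control points: four-parameter form
        bx, by = -ay, ax
    else:                                        # three control points: six-parameter form
        bx, by = (v2[0] - v0[0]) / nb["h"], (v2[1] - v0[1]) / nb["h"]
    return (ax * x + bx * y + v0[0], ay * x + by * y + v0[1])


def inherit_cpmvs(cur: Dict, nb: Dict) -> List[Vec]:
    """Control-point MVs of the current block, inherited from one affine-coded neighbor."""
    corners = [(cur["x"], cur["y"]), (cur["x"] + cur["w"], cur["y"])]
    if nb.get("v2") is not None:                 # keep the neighbor's number of control points
        corners.append((cur["x"], cur["y"] + cur["h"]))
    return [eval_affine(nb, px, py) for px, py in corners]
```

Running `inherit_cpmvs` once per entry of the second candidate block list yields one candidate set of control-point MVs per neighbor; the encoder would then keep the candidate with the lowest rate-distortion cost and signal only its index.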
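The above embodiment provides a coding prediction method. For a coding block of an intra prediction type in a picture, motion vector predictors of at least two control points associated with the coding block are first determined; affine-motion-model motion estimation is then performed on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter of the coding block, the first coding parameter indicating the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block; finally, predictive coding is performed on the coding block based on the first coding parameter. Because affine-motion-model motion estimation is added, non-translational motion such as scaling, rotation, and deformation of intra-prediction-type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby saves coding bitrate.
Based on the same inventive concept as the foregoing embodiment, referring to FIG. 10, a detailed flow of a coding prediction method provided by an embodiment of the present application is shown. The detailed flow may include:
S1001: perform an IBC search on the coding block and build the first candidate block list of the coding block;
S1002: traverse the first candidate block list and calculate the block vector between the coding block and each reference block in the first candidate block list;
S1003: calculate, based on the block vectors, the rate-distortion cost corresponding to each block vector, and take the block vector with the minimum rate-distortion cost as the best block vector;
S1004: judge whether the best block vector is equal to 0;
S1005: when the best block vector is equal to 0, perform the affine-motion-model motion estimation on the coding block in the IBCAffine coding mode, and calculate the first motion estimation result IBCAffineCost with the minimum rate-distortion cost and the first coding parameter;
S1006: when the best block vector is equal to 0, perform the translational-motion-model motion estimation on the coding block in the IBC coding mode, and calculate the second motion estimation result IBCCost with the minimum rate-distortion cost and the second coding parameter;
S1007: judge whether IBCAffineCost ≤ 1.05 × IBCCost holds;
S1008: if IBCAffineCost ≤ 1.05 × IBCCost holds, the coding block selects the IBCAffine coding mode and the first coding parameter of the coding block is obtained;
S1009: if IBCAffineCost ≤ 1.05 × IBCCost does not hold, the coding block selects the IBC coding mode and the second coding parameter of the coding block is obtained;
S1010: if the best block vector is not equal to 0, directly take the best block vector as the third coding parameter of the coding block;
S1011: perform predictive coding on the coding block after motion compensation of the coding block.
It should be noted that if the coding block does not use the IBCAffine coding mode, it can be processed directly according to the flowchart shown in FIG. 3 above, which is not repeated here. In addition, S1004 judges whether the best block vector is equal to 0: when the best block vector is equal to 0, steps S1005 and S1006 are performed; when the best block vector is not equal to 0, step S1010 is performed. Step S1011 is performed in every case, whether it follows step S1008, step S1009, or step S1010.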
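The branching of S1004 to S1010 can be sketched as a small decision function. The two search routines are passed in as callbacks returning a `(cost, parameters)` pair; these callbacks, the mode labels, and the function name are hypothetical, while the 1.05 default is the preset multiple given in the text.

```python
from typing import Any, Callable, Tuple

Search = Callable[[], Tuple[float, Any]]


def choose_mode(best_bv: Tuple[int, int], affine_search: Search, ibc_search: Search,
                multiple: float = 1.05) -> Tuple[str, Any]:
    """Return (mode label, coding parameters) following S1004-S1010."""
    if best_bv != (0, 0):                            # S1010: hash search already succeeded
        return "BV", best_bv                         # third coding parameter
    affine_cost, affine_params = affine_search()     # S1005: IBCAffineCost
    ibc_cost, ibc_params = ibc_search()              # S1006: IBCCost
    if affine_cost <= multiple * ibc_cost:           # S1007
        return "IBCAffine", affine_params            # S1008: first coding parameter
    return "IBC", ibc_params                         # S1009: second coding parameter
```

Note that with a multiple above 1 the rule deliberately favors the affine mode: it is chosen even when its cost is up to 5% higher than the translational cost.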
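It should also be noted that the coding prediction method provided by the above embodiment can be applied to the intra prediction type not only on the encoder side but also on the decoder side. That is, the embodiments of the present application can be applied to both an encoding system and a decoding system, and this is not specifically limited.
The above embodiment elaborates on the specific implementation of the foregoing embodiments. It can be seen that, with the technical solutions of the foregoing embodiments, because affine-motion-model motion estimation is added, non-translational motion such as scaling, rotation, and deformation of intra-prediction-type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby saves coding bitrate.
Based on the same inventive concept as the foregoing embodiments, referring to FIG. 11, the composition of a coding prediction apparatus 110 provided by an embodiment of the present application is shown. It may include a determination unit 1101, a motion estimation unit 1102, and a prediction unit 1103, where
the determination unit 1101 is configured to determine, for a coding block of an intra prediction type in a picture, motion vector predictors of at least two control points associated with the coding block;
the motion estimation unit 1102 is configured to perform affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block;
the prediction unit 1103 is configured to perform predictive coding on the coding block based on the first coding parameter.
In the above solution, referring to FIG. 12, the coding prediction apparatus 110 further includes an obtaining unit 1104 and a judging unit 1105, where
the obtaining unit 1104 is configured to obtain a best block vector of the coding block;
the judging unit 1105 is configured to: when the best block vector is equal to 0, separately calculate a first motion estimation result of the coding block based on the IBCAffine coding mode and a second motion estimation result of the coding block based on the IBC coding mode; if the first motion estimation result is not greater than a preset multiple of the second motion estimation result, select the IBCAffine coding mode for the coding block; and if the first motion estimation result is greater than the preset multiple of the second motion estimation result, select the IBC coding mode for the coding block.
In the above solution, the obtaining unit 1104 is specifically configured to: perform an intra block copy search on the coding block, select at least one reference block whose hash key matches that of the coding block, and build a first candidate block list of the coding block based on the at least one reference block; traverse the first candidate block list and calculate the block vector between the coding block and each reference block in the first candidate block list; calculate, based on the block vectors, the rate-distortion cost corresponding to each block vector; and take the block vector with the minimum rate-distortion cost as the best block vector of the coding block.
In the above solution, the motion estimation unit 1102 is specifically configured to perform, based on the motion vector predictors of the at least two control points, the affine-motion-model motion estimation on the coding block in the IBCAffine coding mode to obtain the first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the affine-motion-model motion estimation of the coding block.
In the above solution, the motion estimation unit 1102 is further configured to perform translational-motion-model motion estimation on the coding block in the IBC coding mode to obtain a second coding parameter of the coding block, where the second coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the translational-motion-model motion estimation of the coding block;
the prediction unit 1103 is further configured to perform predictive coding on the coding block based on the second coding parameter.
In the above solution, the motion estimation unit 1102 is further configured to, if the best block vector is not equal to 0, directly take the best block vector as a third coding parameter corresponding to the coding block;
the prediction unit 1103 is further configured to perform predictive coding on the coding block based on the third coding parameter.
In the above solution, the motion estimation unit 1102 is specifically configured to: calculate, based on the motion vector predictors of the at least two control points, the predicted value of at least one pixel of the coding block in the corresponding reference frame; perform an iterative operation on the matching error of the at least one pixel of the coding block between the original frame and the reference frame and on the gradient matrix of the predicted value, and update the motion vector according to the iterative operation; when the number of iterations reaches a preset count threshold, obtain the updated motion vector; and obtain the first coding parameter of the coding block based on the updated motion vector.
In the above solution, the motion estimation unit 1102 is further configured to: for the coding block of the intra prediction type in the picture, build a second candidate block list of the coding block, where each reference block in the second candidate block list is spatially adjacent to the coding block and is coded in the IBCAffine coding mode; traverse the second candidate block list and, from the motion vectors of the at least two control points of each reference block in the second candidate block list, separately calculate the motion vectors of the control points at the corresponding positions of the coding block; and obtain, from the motion vectors, a fourth coding parameter corresponding to the coding block, where the fourth coding parameter indicates the set of coding parameters with the minimum rate-distortion cost among the motion vectors obtained for the coding block;
the prediction unit 1103 is further configured to perform predictive coding on the coding block based on the fourth coding parameter.
It can be understood that in this embodiment a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disc.
Therefore, this embodiment provides a computer storage medium storing a coding prediction program which, when executed by at least one processor, implements the steps of the method in the technical solution shown in FIG. 6 above.
Based on the composition of the coding prediction apparatus 110 and the computer storage medium described above, referring to FIG. 13, an example of the specific hardware structure of the coding prediction apparatus 110 provided by an embodiment of the present application is shown. It may include a network interface 1301, a memory 1302, and a processor 1303, with the components coupled together through a bus system 1304. It can be understood that the bus system 1304 is used to implement connection and communication between these components. In addition to a data bus, the bus system 1304 includes a power bus, a control bus, and a status signal bus; for clarity of illustration, however, the various buses are all labeled as the bus system 1304 in FIG. 13. The network interface 1301 is configured to receive and send signals in the process of exchanging information with other external network elements;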
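the memory 1302 is configured to store a computer program executable on the processor 1303;
the processor 1303 is configured to perform, when running the computer program:
for a coding block of an intra prediction type in a picture, determining motion vector predictors of at least two control points associated with the coding block;
performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter of the coding block, where the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block;
performing predictive coding on the coding block based on the first coding parameter.
It can be understood that the memory 1302 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically EPROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 1302 of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.
The processor 1303 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by an integrated logic circuit of hardware in the processor 1303 or by instructions in the form of software. The processor 1303 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1302; the processor 1303 reads the information in the memory 1302 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described herein can be implemented with hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), DSP Devices (DSPD), Programmable Logic Devices (PLD), Field-Programmable Gate Arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or a combination thereof.
For a software implementation, the techniques described herein can be implemented through modules (for example procedures and functions) that perform the functions described herein. The software code can be stored in a memory and executed by a processor. The memory can be implemented inside or outside the processor.
Optionally, as another embodiment, the processor 1303 is further configured to perform, when running the computer program, the steps of the method in the technical solution shown in FIG. 6 above.
It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "includes a ..." does not exclude the existence of other identical elements in the process, method, article, or apparatus that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not represent the merits of the embodiments.
From the description of the above implementations, a person skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described above, which are merely illustrative rather than restrictive. Inspired by the present application, a person of ordinary skill in the art can devise many further forms without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.
Industrial Applicability
In the embodiments of the present application, for a coding block of an intra prediction type in a picture, motion vector predictors of at least two control points associated with the coding block are first determined; affine-motion-model motion estimation is then performed on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter corresponding to the coding block, the first coding parameter indicating the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block; finally, predictive coding is performed on the coding block based on the first coding parameter. Because affine-motion-model motion estimation is added, non-translational motion such as scaling, rotation, and deformation of intra-prediction-type coding blocks in screen images can be handled, which further reduces the number of coding bits and thereby saves coding bitrate.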

Claims (18)

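  1. A coding prediction method, wherein the method comprises:
    for a coding block of an intra prediction type in a picture, determining motion vector predictors of at least two control points associated with the coding block;
    performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter of the coding block, wherein the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block;
    performing predictive coding on the coding block based on the first coding parameter.
  2. The method according to claim 1, wherein the coding modes available to the coding block comprise at least an intra block copy (IBC) coding mode and an affine-motion-model-based intra block copy (IBCAffine) coding mode, and before determining, for the coding block of the intra prediction type in the picture, the motion vector predictors of the at least two control points associated with the coding block, the method further comprises:
    obtaining a best block vector of the coding block;
    when the best block vector is equal to 0, separately calculating a first motion estimation result of the coding block based on the IBCAffine coding mode and a second motion estimation result of the coding block based on the IBC coding mode;
    if the first motion estimation result is not greater than a preset multiple of the second motion estimation result, selecting the IBCAffine coding mode for the coding block;
    if the first motion estimation result is greater than the preset multiple of the second motion estimation result, selecting the IBC coding mode for the coding block.
  3. The method according to claim 2, wherein obtaining the best block vector of the coding block comprises:
    performing an intra block copy search on the coding block, selecting at least one reference block whose hash key matches that of the coding block, and building a first candidate block list of the coding block based on the at least one reference block;
    traversing the first candidate block list and calculating the block vector between the coding block and each reference block in the first candidate block list;
    calculating, based on the block vectors, the rate-distortion cost corresponding to each block vector, and taking the block vector with the minimum rate-distortion cost as the best block vector of the coding block.
  4. The method according to claim 2, wherein, when the coding block selects the IBCAffine coding mode, performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain the first coding parameter of the coding block comprises:
    performing, based on the motion vector predictors of the at least two control points, the affine-motion-model motion estimation on the coding block in the IBCAffine coding mode to obtain the first coding parameter of the coding block, wherein the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the affine-motion-model motion estimation of the coding block.
  5. The method according to claim 2, wherein, when the coding block selects the IBC coding mode, the method further comprises:
    performing translational-motion-model motion estimation on the coding block in the IBC coding mode to obtain a second coding parameter of the coding block, wherein the second coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the translational-motion-model motion estimation of the coding block;
    performing predictive coding on the coding block based on the second coding parameter.
  6. The method according to claim 2, wherein, after obtaining the best block vector of the coding block, the method further comprises:
    if the best block vector is not equal to 0, directly taking the best block vector as a third coding parameter of the coding block;
    performing predictive coding on the coding block based on the third coding parameter.
  7. The method according to claim 1, wherein performing affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain the first coding parameter of the coding block comprises:
    calculating, based on the motion vector predictors of the at least two control points, the predicted value of at least one pixel of the coding block in the corresponding reference frame;
    performing an iterative operation on the matching error of the at least one pixel of the coding block between the original frame and the reference frame and on the gradient matrix of the predicted value, and updating the motion vector according to the iterative operation;
    when the number of iterations reaches a preset count threshold, obtaining the updated motion vector;
    obtaining the first coding parameter of the coding block based on the updated motion vector.
  8. The method according to claim 1, wherein, for the coding block of the intra prediction type in the picture, the method further comprises:
    building a second candidate block list of the coding block, wherein each reference block in the second candidate block list is spatially adjacent to the coding block and is coded in the IBCAffine coding mode;
    traversing the second candidate block list and, from the motion vectors of the at least two control points of each reference block in the second candidate block list, separately calculating the motion vectors of the control points at the corresponding positions of the coding block;
    obtaining, from the motion vectors, a fourth coding parameter corresponding to the coding block, wherein the fourth coding parameter indicates the set of coding parameters with the minimum rate-distortion cost among the motion vectors obtained for the coding block;
    performing predictive coding on the coding block based on the fourth coding parameter.
  9. A coding prediction apparatus, wherein the coding prediction apparatus comprises a determination unit, a motion estimation unit, and a prediction unit,
    the determination unit being configured to determine, for a coding block of an intra prediction type in a picture, motion vector predictors of at least two control points associated with the coding block;
    the motion estimation unit being configured to perform affine-motion-model motion estimation on the coding block based on the motion vector predictors of the at least two control points to obtain a first coding parameter of the coding block, wherein the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by non-translational motion estimation of the coding block;
    the prediction unit being configured to perform predictive coding on the coding block based on the first coding parameter.
  10. The coding prediction apparatus according to claim 9, wherein the coding prediction apparatus further comprises an obtaining unit and a judging unit,
    the obtaining unit being configured to obtain a best block vector of the coding block;
    the judging unit being configured to: when the best block vector is equal to 0, separately calculate a first motion estimation result of the coding block based on the IBCAffine coding mode and a second motion estimation result of the coding block based on the IBC coding mode; if the first motion estimation result is not greater than a preset multiple of the second motion estimation result, select the IBCAffine coding mode for the coding block; and if the first motion estimation result is greater than the preset multiple of the second motion estimation result, select the IBC coding mode for the coding block.
  11. The coding prediction apparatus according to claim 10, wherein the obtaining unit is specifically configured to: perform an intra block copy search on the coding block, select at least one reference block whose hash key matches that of the coding block, and build a first candidate block list of the coding block based on the at least one reference block; traverse the first candidate block list and calculate the block vector between the coding block and each reference block in the first candidate block list; calculate, based on the block vectors, the rate-distortion cost corresponding to each block vector; and take the block vector with the minimum rate-distortion cost as the best block vector of the coding block.
  12. The coding prediction apparatus according to claim 10, wherein the motion estimation unit is specifically configured to perform, based on the motion vector predictors of the at least two control points, the affine-motion-model motion estimation on the coding block in the IBCAffine coding mode to obtain the first coding parameter of the coding block, wherein the first coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the affine-motion-model motion estimation of the coding block.
  13. The coding prediction apparatus according to claim 10, wherein the motion estimation unit is further configured to perform translational-motion-model motion estimation on the coding block in the IBC coding mode to obtain a second coding parameter of the coding block, wherein the second coding parameter indicates the set of coding parameters with the minimum rate-distortion cost obtained by the translational-motion-model motion estimation of the coding block;
    the prediction unit being further configured to perform predictive coding on the coding block based on the second coding parameter.
  14. The coding prediction apparatus according to claim 10, wherein the motion estimation unit is further configured to, if the best block vector is not equal to 0, directly take the best block vector as a third coding parameter of the coding block;
    the prediction unit being further configured to perform predictive coding on the coding block based on the third coding parameter.
  15. The coding prediction apparatus according to claim 9, wherein the motion estimation unit is specifically configured to: calculate, based on the motion vector predictors of the at least two control points, the predicted value of at least one pixel of the coding block in the corresponding reference frame; perform an iterative operation on the matching error of the at least one pixel of the coding block between the original frame and the reference frame and on the gradient matrix of the predicted value, and update the motion vector according to the iterative operation; when the number of iterations reaches a preset count threshold, obtain the updated motion vector; and obtain the first coding parameter of the coding block based on the updated motion vector.
  16. The coding prediction apparatus according to claim 9, wherein the motion estimation unit is further configured to: for the coding block of the intra prediction type in the picture, build a second candidate block list of the coding block, wherein each reference block in the second candidate block list is spatially adjacent to the coding block and is coded in the IBCAffine coding mode; traverse the second candidate block list and, from the motion vectors of the at least two control points of each reference block in the second candidate block list, separately calculate the motion vectors of the control points at the corresponding positions of the coding block; and obtain, from the motion vectors, a fourth coding parameter corresponding to the coding block, wherein the fourth coding parameter indicates the set of coding parameters with the minimum rate-distortion cost among the motion vectors obtained for the coding block;
    the prediction unit being further configured to perform predictive coding on the coding block based on the fourth coding parameter.
  17. A coding prediction apparatus, wherein the coding prediction apparatus comprises a memory and a processor;
    the memory being configured to store a computer program executable on the processor;
    the processor being configured to perform, when running the computer program, the steps of the method according to any one of claims 1 to 8.
  18. A computer storage medium, wherein the computer storage medium stores a coding prediction program which, when executed by at least one processor, implements the steps of the method according to any one of claims 1 to 8.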
PCT/CN2018/124504 2018-12-27 2018-12-27 Coding prediction method and apparatus, and computer storage medium WO2020133115A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201880100324.9A CN113287309A (zh) 2018-12-27 2018-12-27 Coding prediction method and apparatus, and computer storage medium
EP18945171.9A EP3902257A4 (en) 2018-12-27 2018-12-27 CODING PREDICTION METHOD AND DEVICE AND COMPUTER STORAGE MEDIUM
PCT/CN2018/124504 WO2020133115A1 (zh) 2018-12-27 2018-12-27 Coding prediction method and apparatus, and computer storage medium
US17/357,621 US11632553B2 (en) 2018-12-27 2021-06-24 Coding prediction method and apparatus, and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/124504 WO2020133115A1 (zh) 2018-12-27 2018-12-27 Coding prediction method and apparatus, and computer storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/357,621 Continuation US11632553B2 (en) 2018-12-27 2021-06-24 Coding prediction method and apparatus, and computer storage medium

Publications (1)

Publication Number Publication Date
WO2020133115A1 true WO2020133115A1 (zh) 2020-07-02

Family

ID=71126695

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/124504 WO2020133115A1 (zh) 2018-12-27 2018-12-27 Coding prediction method and apparatus, and computer storage medium

Country Status (4)

Country Link
US (1) US11632553B2 (zh)
EP (1) EP3902257A4 (zh)
CN (1) CN113287309A (zh)
WO (1) WO2020133115A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112218092A (zh) * 2020-10-17 2021-01-12 浙江大华技术股份有限公司 Encoding method, device, and storage medium for string coding technique

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111050182B (zh) * 2019-12-27 2022-02-18 浙江大华技术股份有限公司 Motion vector prediction method, video coding method, and related equipment and devices
US11818384B2 (en) * 2020-09-24 2023-11-14 Ofinno, Llc Affine intra block copy refinement
WO2023171988A1 (ko) * 2022-03-11 2023-09-14 현대자동차주식회사 Image encoding/decoding method and apparatus, and recording medium storing a bitstream
CN116156174B (zh) * 2023-02-23 2024-02-13 格兰菲智能科技有限公司 Data encoding processing method and apparatus, computer device, and storage medium
CN116170594B (zh) * 2023-04-19 2023-07-14 中国科学技术大学 Coding method and apparatus based on rate-distortion cost prediction
CN116760986B (zh) * 2023-08-23 2023-11-14 腾讯科技(深圳)有限公司 Candidate motion vector generation method and apparatus, computer device, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105163116A (zh) * 2015-08-29 2015-12-16 华为技术有限公司 Image prediction method and device
WO2018047668A1 (ja) * 2016-09-12 2018-03-15 ソニー株式会社 Image processing device and image processing method
US20180316929A1 (en) * 2017-04-28 2018-11-01 Qualcomm Incorporated Gradient based matching for motion search and derivation
CN108886619A (zh) * 2016-01-07 2018-11-23 联发科技股份有限公司 Method and apparatus for affine merge mode prediction for video coding systems

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6614472B2 (ja) * 2013-09-30 2019-12-04 サン パテント トラスト 画像符号化方法、画像復号方法、画像符号化装置及び画像復号装置
US11012715B2 (en) * 2018-02-08 2021-05-18 Qualcomm Incorporated Intra block copy for video coding
JP7352625B2 (ja) * 2018-10-29 2023-09-28 華為技術有限公司 ビデオピクチャ予測方法及び装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3902257A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
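CN112218092A (zh) * 2020-10-17 2021-01-12 浙江大华技术股份有限公司 Encoding method, device, and storage medium for string coding technique
CN112218092B (zh) * 2020-10-17 2022-09-06 浙江大华技术股份有限公司 Encoding method, device, and storage medium for string coding technique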

Also Published As

Publication number Publication date
US11632553B2 (en) 2023-04-18
CN113287309A (zh) 2021-08-20
EP3902257A1 (en) 2021-10-27
EP3902257A4 (en) 2022-01-05
US20210321112A1 (en) 2021-10-14

Similar Documents

Publication Publication Date Title
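WO2020133115A1 (zh) Coding prediction method and apparatus, and computer storage medium
US10856006B2 (en) Method and system using overlapped search space for bi-predictive motion vector refinement
CN101529918B (zh) Predicted reference information generation method, moving picture encoding and decoding methods, and apparatuses therefor
US12137246B2 (en) Method and apparatus for low-complexity bi-directional intra prediction in video encoding and decoding
WO2019143841A1 (en) Affine motion compensation in video coding
US11917159B2 (en) Prediction method, encoder, decoder and computer storage medium
JPWO2007077989A1 (ja) Video encoding method and decoding method, apparatuses therefor, programs therefor, and storage medium recording the programs
CN108449599B (zh) Video encoding and decoding method based on surface transmission transformation
CN113545073A (zh) Use of extended samples during decoder-side motion refinement search
CN114845102A (zh) Early termination of optical flow refinement
WO2022227622A1 (zh) Method and apparatus for combined inter-intra prediction encoding and decoding with configurable weights
WO2019242686A1 (en) Method and apparatus of motion vector buffer management for video coding system
CN117561714A (zh) Method, device, and medium for video processing
CN117616754A (zh) Method, device, and medium for video processing
TW202139707A (zh) Inter prediction method, encoder, decoder, and storage medium
JP6019797B2 (ja) Moving picture encoding apparatus, moving picture encoding method, and program
CN117581538A (zh) Method, device, and medium for video processing
CN110958452B (zh) Video decoding method and video decoder
WO2020007093A1 (zh) Image prediction method and apparatus
JP2024542802A (ja) Method, apparatus, and medium for video processing
CN117501688A (zh) Method, device, and medium for video processing
CN117678223A (zh) Method, apparatus, and medium for video processing
CN117501690A (zh) Method, device, and medium for video processing
CN118525515A (zh) Video processing method, device, and medium
CN113647108A (zh) History-based motion vector prediction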

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18945171

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018945171

Country of ref document: EP

Effective date: 20210723