
US10536692B2 - Picture prediction method and related apparatus - Google Patents

Picture prediction method and related apparatus

Info

Publication number
US10536692B2
Authority
US
United States
Prior art keywords
template
pixel
templates
pixel area
picture block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US15/454,356
Other versions
US20170180727A1 (en)
Inventor
Xin Huang
Hong Zhang
Haitao Yang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Assigned to HUAWEI TECHNOLOGIES CO., LTD. Assignment of assignors' interest (see document for details). Assignors: HUANG, XIN; YANG, HAITAO; ZHANG, HONG
Publication of US20170180727A1
Application granted
Publication of US10536692B2
Legal status: Active
Adjusted expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/523 Motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/567 Motion estimation based on rate distortion criteria

Definitions

  • the present application relates to the field of picture processing technologies, and specifically, to a picture prediction method and a related apparatus.
  • HEVC High Efficiency Video Coding
  • a basic principle of video compression coding is to use correlation between a space domain, a time domain, and a code word to remove redundancy as much as possible.
  • a prevalent practice is to use a block-based hybrid video coding framework to implement video compression coding by performing steps of prediction (including intra-frame prediction and inter-frame prediction), transform, quantization, entropy coding, and the like.
  • In an intra-frame prediction technology, redundancy information of a current picture block is removed by using spatial pixel information of the current picture block, to obtain a residual.
  • In an inter-frame prediction technology, redundancy information of a current picture block is removed by using pixel information of a coded or decoded picture adjacent to the current picture block, to obtain a residual.
  • This coding framework shows high viability, and therefore, HEVC still uses this block-based hybrid video coding framework.
  • a method for predicting a pixel value of a current picture block based on a non-local means filtering technology is provided. After all templates matching a current template are obtained by searching reference pictures, a predicted pixel value of the current picture block is obtained by averaging the pixel values of the picture blocks corresponding to all the matching templates.
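The unweighted averaging baseline described above can be sketched as follows. This is a minimal illustration only, assuming each matched picture block is supplied as an equally sized 2-D list of pixel values; the function name `average_prediction` is hypothetical and does not appear in the application.

```python
def average_prediction(blocks):
    """Predict a block as the plain (unweighted) average of candidate blocks.

    blocks: list of equally sized 2-D lists, one per matched template.
    This layout is an assumption made for illustration.
    """
    if not blocks:
        raise ValueError("need at least one candidate block")
    rows, cols = len(blocks[0]), len(blocks[0][0])
    # Every candidate contributes equally to every pixel position.
    return [
        [sum(b[i][j] for b in blocks) / len(blocks) for j in range(cols)]
        for i in range(rows)
    ]
```

Because every block and every pixel position receives the same weight, a poorly matched template influences the result as much as a well matched one, which is the limitation the weighted method addresses.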
  • Embodiments of the present application provide a picture prediction method and a related apparatus, so as to improve picture prediction accuracy.
  • a first aspect of the present application provides a picture prediction method, including:
  • the determining a weight of a pixel area in a picture block corresponding to each template in the N templates includes:
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block includes: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
  • the determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template includes: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
  • $E(T_x, T_m)$ represents distortion between the current template $T_x$ and a template $T_m$ in the N templates
  • $S$ represents a quantity of pixel areas in the current template $T_x$
  • $\sigma$ represents a template scaling factor
  • $a$ and $\sigma$ are real numbers greater than 0
  • $w_m$ represents a weight of the template $T_m$
  • the template $T_m$ is any template in the N templates.
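The definitions above suggest a template weight that shrinks as the distortion per pixel area, scaled by the template scaling factor, grows. A minimal sketch under that reading is given here. The exponential form `a ** (-E / (S * sigma))`, the default base `a = 2.0`, and the function name `template_weight` are all assumptions made for illustration and are not taken from the application.

```python
def template_weight(distortion, num_pixel_areas, sigma, a=2.0):
    """Illustrative template weight.

    ASSUMED form: w_m = a ** (-E / (S * sigma)), where
      distortion      = E(T_x, T_m), distortion between the two templates
      num_pixel_areas = S, quantity of pixel areas in the current template
      sigma           = template scaling factor (> 0)
      a               = base (> 0); a > 1 makes the weight decrease
                        as the distortion increases
    """
    if num_pixel_areas <= 0 or sigma <= 0 or a <= 0:
        raise ValueError("S, sigma and a must all be greater than 0")
    return a ** (-distortion / (num_pixel_areas * sigma))
```

With these assumptions a perfectly matching template (zero distortion) receives weight 1, and the weight halves each time the normalized distortion grows by 1 when `a` is 2.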
  • $w_m(i,j)$ represents a weight of a pixel area whose coordinates are $(i,j)$ and that is in a picture block $m$ corresponding to the template $T_m$ in the N templates
  • $R(i,j)$ represents a parameter about a similarity between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and the template $T_m$
  • $\phi(R(i,j))$ represents a pixel area scaling factor corresponding to $R(i,j)$.
  • the non-linear relationship between $\phi(R(i,j))$ and $R(i,j)$ is: a value of $\phi(R(i,j))$ is determined based on a distance interval within which $R(i,j)$ falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • the non-linear relationship between $\phi(R(i,j))$ and $R(i,j)$ is:
  • $$\phi(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • $a_1$ is less than $a_2$, $a_2$ is less than $a_3$, and $b_1$ is less than $b_2$; and $a_1$, $a_2$, $a_3$, $b_1$, and $b_2$ are real numbers greater than 0; or
  • $$\phi(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • $a_4$ is less than $a_5$, $a_5$ is less than $a_6$, $a_6$ is less than $a_7$, $b_3$ is less than $b_4$, and $b_4$ is less than $b_5$; and $a_4$, $a_5$, $a_6$, $a_7$, $b_3$, $b_4$, and $b_5$ are real numbers greater than 0; or
  • $$\phi(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • $a_8$ is less than $a_9$, and $a_8$, $a_9$, and $b_6$ are real numbers greater than 0.
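The interval-based scaling factor can be sketched as a step function over sorted interval boundaries. This is an illustration only: the function name `pixel_area_scaling_factor`, the argument layout, and the choice that each interval is closed on its upper boundary are assumptions, and the numeric values in the usage note are made up.

```python
from bisect import bisect_left


def pixel_area_scaling_factor(r, boundaries, factors):
    """Return the scaling factor for the interval that r falls in.

    boundaries: increasing list [b1, b2, ...] of interval boundaries.
    factors:    list [a1, a2, ...] with one more entry than boundaries.
    Intervals are (-inf, b1], (b1, b2], ..., (bn, +inf); closing each
    interval on the right is an ASSUMPTION made for this sketch.
    """
    if len(factors) != len(boundaries) + 1:
        raise ValueError("need exactly one more factor than boundaries")
    if any(x >= y for x, y in zip(boundaries, boundaries[1:])):
        raise ValueError("boundaries must be strictly increasing")
    # bisect_left counts the boundaries strictly below r, which is the
    # index of the interval that contains r.
    return factors[bisect_left(boundaries, r)]
```

For example, with boundaries `[2, 4]` and factors `[1.0, 2.0, 3.0]`, distances up to 2 map to 1.0, distances above 2 and up to 4 map to 2.0, and larger distances map to 3.0. The same function covers the two-interval and four-interval variants by changing the list lengths.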
  • $R(i,j)$ is equal to $d(i,j)$ or $e(i,j)$, where $d(i,j)$ represents a distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and a determined pixel area in the template $T_m$, and
  • $e(i,j)$ represents a ratio of a pixel value of the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ to an average pixel value or a weighted average pixel value of the template $T_m$, or an absolute difference between a pixel value of the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and an average pixel value or a weighted average pixel value of the template $T_m$.
  • $d(i,j)$ represents a distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and an upper left pixel in the template $T_m$; or
  • $d(i,j)$ represents a distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and a pixel $y$ in the template $T_m$, where
  • the distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and the pixel $y$ in the template $T_m$ is less than or equal to a distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and any pixel in the template $T_m$ except the pixel $y$; that is, the pixel $y$ is the template pixel nearest to the pixel area.
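The two distance options above can be sketched as follows. The text does not fix a distance metric or a coordinate convention, so Euclidean distance and (row, column) coordinates are assumptions here, as are the function names and the representation of a template as a list of pixel coordinates.

```python
import math


def distance_to_upper_left(i, j, template_pixels):
    """Distance from pixel (i, j) to the upper left pixel of the template.

    template_pixels: list of (row, col) coordinates of the template.
    The upper left pixel is taken as the one with the smallest row and,
    among those, the smallest column (an assumption for this sketch).
    """
    r0, c0 = min(template_pixels)
    return math.hypot(i - r0, j - c0)


def distance_to_nearest(i, j, template_pixels):
    """Distance from pixel (i, j) to the closest template pixel (pixel y)."""
    return min(math.hypot(i - r, j - c) for r, c in template_pixels)
```

For a 4×4 block whose upper left pixel is at (0, 0) and a one-pixel-thick L-shaped template above and to the left of it, pixels on the top row and left column of the block are at distance 1 from their nearest template pixel, and the distance grows toward the lower right corner of the block.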
  • the calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • $w_m(i,j)$ represents the weight of the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ corresponding to the template $T_m$ in the N templates
  • $p_m(i,j)$ represents the pixel value of the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$
  • $\mathrm{pre}(i,j)$ represents a predicted pixel value of a pixel area whose coordinates are $(i,j)$ and that is in the current picture block.
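A prediction built from per-pixel-area weights and pixel values can be sketched as a weighted average over the N picture blocks. The normalized form used here, in which the weighted sum at each position is divided by the sum of the weights at that position, is an assumption consistent with the symbol definitions above rather than a formula quoted from the application; the function name and the nested-list layout are also hypothetical.

```python
def weighted_prediction(blocks, weights):
    """Weighted per-pixel prediction over N candidate blocks.

    blocks[m][i][j]  = p_m(i, j), pixel value in picture block m
    weights[m][i][j] = w_m(i, j), weight of that pixel area
    ASSUMED form: pre(i, j) = sum_m w_m(i,j) * p_m(i,j) / sum_m w_m(i,j)
    """
    if not blocks or len(blocks) != len(weights):
        raise ValueError("need one weight array per candidate block")
    rows, cols = len(blocks[0]), len(blocks[0][0])
    prediction = []
    for i in range(rows):
        row = []
        for j in range(cols):
            total_weight = sum(w[i][j] for w in weights)
            if total_weight <= 0:
                raise ValueError("weights at each position must sum to > 0")
            weighted_sum = sum(w[i][j] * b[i][j] for w, b in zip(weights, blocks))
            row.append(weighted_sum / total_weight)
        prediction.append(row)
    return prediction
```

Because the weights are indexed by position as well as by block, two pixel areas of the same picture block can contribute with different strength, which is the property the method relies on.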
  • the determining N templates whose degrees of matching with a current template meet a preset condition includes:
  • the determining, from the M templates, N templates that meet the preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
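The selection of N templates from M candidates by an average-distortion threshold can be sketched as follows. The sum of absolute differences as the distortion measure, the multiplicative `adjustment` factor used to model the "adjustment value", and both function names are assumptions made for illustration.

```python
def sad(template_a, template_b):
    """Sum of absolute differences between two flat lists of pixel values.

    Using SAD as the distortion measure is an ASSUMPTION of this sketch.
    """
    return sum(abs(a - b) for a, b in zip(template_a, template_b))


def select_templates(current, candidates, adjustment=1.0):
    """Return indices of candidates whose distortion is <= the threshold.

    The threshold is the average distortion over all M candidates,
    multiplied by a hypothetical adjustment factor.
    """
    if not candidates:
        raise ValueError("need at least one candidate template")
    distortions = [sad(current, c) for c in candidates]
    threshold = adjustment * sum(distortions) / len(distortions)
    return [k for k, d in enumerate(distortions) if d <= threshold]
```

With candidate distortions of 0, 2, and 30 the average is about 10.7, so the first two candidates are kept and the outlier is dropped; lowering `adjustment` tightens the selection.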
  • the picture prediction method is applied in a video coding process, or the picture prediction method is applied in a video decoding process.
  • a second aspect of the present application provides a picture prediction apparatus, including:
  • a first determining unit configured to determine N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block;
  • a second determining unit configured to determine a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different;
  • a predicting unit configured to calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
  • the second determining unit includes:
  • a first determining subunit configured to determine a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template
  • a second determining subunit configured to determine, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block includes: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
  • the first determining subunit is specifically configured to determine the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
  • $E(T_x, T_m)$ represents distortion between the current template $T_x$ and a template $T_m$ in the N templates
  • $S$ represents a quantity of pixel areas in the current template $T_x$
  • $\sigma$ represents a template scaling factor
  • $a$ and $\sigma$ are real numbers greater than 0
  • $w_m$ represents a weight of the template $T_m$
  • the template $T_m$ is any template in the N templates.
  • $w_m(i,j)$ represents a weight of a pixel area whose coordinates are $(i,j)$ and that is in a picture block $m$ corresponding to the template $T_m$ in the N templates
  • $R(i,j)$ represents a parameter about a similarity between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and the template $T_m$
  • $\phi(R(i,j))$ represents a pixel area scaling factor corresponding to $R(i,j)$.
  • the non-linear relationship between $\phi(R(i,j))$ and $R(i,j)$ is: a value of $\phi(R(i,j))$ is determined based on a distance interval within which $R(i,j)$ falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • the non-linear relationship between $\phi(R(i,j))$ and $R(i,j)$ is:
  • $$\phi(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • $a_1$ is less than $a_2$, $a_2$ is less than $a_3$, and $b_1$ is less than $b_2$; and $a_1$, $a_2$, $a_3$, $b_1$, and $b_2$ are real numbers greater than 0; or
  • $$\phi(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • $a_4$ is less than $a_5$, $a_5$ is less than $a_6$, $a_6$ is less than $a_7$, $b_3$ is less than $b_4$, and $b_4$ is less than $b_5$; and $a_4$, $a_5$, $a_6$, $a_7$, $b_3$, $b_4$, and $b_5$ are real numbers greater than 0; or
  • $$\phi(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • $a_8$ is less than $a_9$, and $a_8$, $a_9$, and $b_6$ are real numbers greater than 0.
  • $R(i,j)$ is equal to $d(i,j)$ or $e(i,j)$, where $d(i,j)$ represents a distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and a determined pixel area in the template $T_m$, and
  • $e(i,j)$ represents a ratio of a pixel value of the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ to an average pixel value or a weighted average pixel value of the template $T_m$, or an absolute difference between a pixel value of the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and an average pixel value or a weighted average pixel value of the template $T_m$.
  • $d(i,j)$ represents a distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and an upper left pixel in the template $T_m$, or $d(i,j)$ represents a distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and a pixel $y$ in the template $T_m$; and the distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and the pixel $y$ in the template $T_m$ is less than or equal to a distance between the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ and any pixel in the template $T_m$ except the pixel $y$.
  • the predicting unit is specifically configured to calculate the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • $w_m(i,j)$ represents the weight of the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$ corresponding to the template $T_m$ in the N templates
  • $p_m(i,j)$ represents the pixel value of the pixel area whose coordinates are $(i,j)$ and that is in the picture block $m$
  • $\mathrm{pre}(i,j)$ represents a predicted pixel value of a pixel area whose coordinates are $(i,j)$ and that is in the current picture block.
  • the first determining unit is specifically configured to: determine M templates with a highest degree of matching with the current template; and determine, from the M templates, N templates that meet the preset condition, where N is less than M.
  • the first determining unit is specifically configured to: determine the M templates with the highest degree of matching with the current template; and determine N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
  • the picture prediction apparatus is applied to a video coding apparatus, or the picture prediction apparatus is applied to a video decoding apparatus.
  • a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different.
  • FIG. 1-a is a schematic diagram of a prediction unit division manner corresponding to intra-frame prediction according to an embodiment of the present application;
  • FIG. 1-b is a schematic diagram of several prediction unit division manners corresponding to inter-frame prediction according to an embodiment of the present application;
  • FIG. 1-c is a schematic flowchart of a picture prediction method according to an embodiment of the present application;
  • FIG. 1-d is a schematic diagram of a location relationship between a picture block and a template according to an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of another picture prediction method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another picture prediction method according to an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of another picture prediction method according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a picture prediction apparatus according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of another picture prediction apparatus according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of another picture prediction apparatus according to an embodiment of the present application.
  • Embodiments of the present application provide a picture prediction method and a related apparatus, so as to improve picture prediction accuracy.
  • a video sequence includes a series of pictures, the pictures are further divided into slices, and the slices are further divided into blocks.
  • Video coding is to perform coding processing from left to right and from top to bottom row by row starting from an upper left corner position of a picture by using a block as a unit.
  • a concept of a block is further extended.
  • MB macroblock
  • CU coding unit
  • PU prediction unit
  • TU transform unit
  • the CU may be divided into smaller CUs according to a quadtree, and the smaller CU may be further divided to form a quadtree structure.
  • the PU and the TU also have similar tree structures. Regardless of whether a unit is a CU, a PU, or a TU, the unit belongs to the concept of a block in essence.
  • the CU is similar to a macroblock MB or a coding block, and is a basic unit for partitioning and coding a picture.
  • the PU may correspond to a prediction block, and is a basic unit for predictive coding.
  • the CU is further divided into multiple PUs according to a division mode.
  • the TU may correspond to a transform block, and is a basic unit for transforming a prediction residual.
  • a size of the coding unit may include four levels: 64×64, 32×32, 16×16, and 8×8. Coding units at each level may be divided into prediction units of different sizes according to intra-frame prediction and inter-frame prediction. For example, FIG. 1-a shows a prediction unit division manner corresponding to intra-frame prediction, and FIG. 1-b shows several prediction unit division manners corresponding to inter-frame prediction.
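The quadtree division of a coding unit across the four size levels can be sketched as a short recursion. This is an illustration only: the function name `split_cu`, the `should_split` callback standing in for the encoder's division decision, and the (x, y, size) tuple layout are assumptions.

```python
def split_cu(x, y, size, should_split, min_size=8):
    """Recursively divide a square coding unit along a quadtree.

    (x, y) is the upper left corner and size is the side length.
    should_split(x, y, size) is a hypothetical callback that stands in
    for the encoder's decision; splitting stops at min_size.
    Returns the leaf coding units as (x, y, size) tuples.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four quadrants: upper left, upper right,
        # lower left, lower right.
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(split_cu(x + dx, y + dy, half, should_split, min_size))
        return leaves
    return [(x, y, size)]
```

Calling `split_cu(0, 0, 64, lambda x, y, s: s == 64)` divides a 64×64 unit once into four 32×32 units, while a callback that always returns true produces sixty-four 8×8 units.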
  • a skip mode and a direct mode become effective tools for improving coding efficiency.
  • At low bit rates, blocks coded in these two modes can account for more than half of an entire coded sequence.
  • When the skip mode is used, only a skip mode flag needs to be added to the bitstream: a motion vector of a current picture block is derived from nearby motion vectors, and a value of a reference block is directly copied according to the motion vector as a reconstructed value of the current picture block.
  • When the direct mode is used, an encoder may derive the motion vector of the current picture block by using the adjacent motion vectors, directly copy the value of the reference block according to the motion vector as a predicted value of the current picture block, and perform predictive coding on the current picture block by using the predicted value.
  • some new coding tools are introduced to further improve video coding efficiency.
  • a merge coding mode and an advanced motion vector prediction (AMVP) mode are two important inter-frame prediction tools.
  • In the merge coding mode, a candidate motion information set is constructed by using motion information (including a prediction direction, a motion vector, and a reference picture index) of an adjacent coded block of a current coding block. By means of comparison, the candidate motion information that yields the highest coding efficiency may be selected as motion information of the current coding block; a predicted value of the current coding block is then found in reference pictures, predictive coding is performed on the current coding block, and an index value that indicates the adjacent coded block whose motion information is selected is written into a bitstream.
  • In the AMVP mode, a motion vector of an adjacent coded block is used as a motion vector predicted value of a current coding block; the motion vector that yields the highest coding efficiency may be selected to predict a motion vector of the current coding block, and an index value that indicates the adjacent coded block whose motion vector is selected may be written into a video bitstream.
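The candidate comparison shared by the merge and AMVP modes can be sketched as choosing the candidate with the lowest cost and signalling its index. The function name `select_candidate`, the dictionary layout of a candidate, and the `cost` callback (a stand-in for whatever coding efficiency measure an encoder uses) are assumptions made for illustration.

```python
def select_candidate(candidates, cost):
    """Pick the candidate with the lowest cost and return (index, candidate).

    candidates: list of motion information entries taken from adjacent
                coded blocks (layout is hypothetical).
    cost:       hypothetical callback returning a number; lower means
                higher coding efficiency.
    The returned index is what would be written into the bitstream.
    """
    if not candidates:
        raise ValueError("candidate set is empty")
    best_index = min(range(len(candidates)), key=lambda k: cost(candidates[k]))
    return best_index, candidates[best_index]
```

Only the index needs to be transmitted because a decoder can rebuild the same candidate set from blocks it has already decoded.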
  • a picture prediction method provided in the embodiments of the present application is first described in the following.
  • the picture prediction method provided in the embodiments of the present application is executed by a video coding apparatus or a video decoding apparatus.
  • the video coding apparatus or the video decoding apparatus may be any apparatus that needs to output or store a video, for example, a device such as a laptop computer, a tablet computer, a personal computer, a mobile phone, or a video server.
  • the picture prediction method includes: determining N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block; determining a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
  • FIG. 1-c is a schematic flowchart of a picture prediction method according to an embodiment of the present application. As shown in FIG. 1-c, the picture prediction method provided in this embodiment of the present application may include the following steps.
  • the current template is a template corresponding to a current picture block.
  • the N templates are obtained by searching reference pictures of the current picture block. N is a positive integer.
  • N may be equal to 1, 2, 3, 4, 5, 7, 9, 12, or another value.
  • Weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different.
  • the pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
  • a template may be L-shaped or in another shape.
  • For example, as shown in FIG. 1-d, a template corresponding to a picture block may be an L-shaped pixel area adjacent to the picture block.
  • an L-shaped pixel area T x adjacent to a picture block x is a template T x corresponding to the picture block x
  • an L-shaped pixel area T m1 adjacent to a picture block m 1 is a template T m1 corresponding to the picture block m 1
  • an L-shaped pixel area T m2 adjacent to a picture block m 2 is a template T m2 corresponding to the picture block m 2 .
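Collecting the pixel values of an L-shaped template for a given block position can be sketched as follows. The picture is assumed to be a 2-D list indexed as `picture[row][col]`, and the function name `l_template`, the square block shape, and the `thickness` parameter are hypothetical choices made for illustration.

```python
def l_template(picture, top, left, size, thickness=1):
    """Pixel values of the L-shaped area above and to the left of a block.

    The block is size x size with its upper left pixel at (top, left).
    The template is `thickness` pixels wide and includes the corner
    region above and to the left of the block (layout ASSUMED).
    """
    if top < thickness or left < thickness:
        raise ValueError("template would fall outside the picture")
    values = []
    for r in range(top - thickness, top + size):
        for c in range(left - thickness, left + size):
            # Keep only pixels above the block or to the left of it.
            if r < top or c < left:
                values.append(picture[r][c])
    return values
```

Applying the same function at the position of picture block x and at candidate positions in a reference picture yields flat lists that can be compared directly when searching for matching templates.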
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • a parameter about a similarity between two objects may be used to represent the similarity between the two objects.
  • a parameter about a similarity between a template T m in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template T m and the pixel area w in the picture block m.
  • the picture block m is a picture block corresponding to the template T m .
  • the template T m may be any template in the N templates.
  • the pixel area w is any pixel area in the picture block m.
  • the parameter about the similarity between the template T m and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template T m , or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template T m to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template T m and a pixel value of the pixel area w.
  • a template corresponding to a picture block is some of adjacent reconstructed pixels of the picture block, and has a correspondence with the picture block.
  • the template corresponding to the picture block may be used to represent the picture block for searching and matching.
  • a picture block corresponding to a matching template may be found for an operation such as prediction by using a correspondence between the template and the picture block.
  • the determined pixel area in the foregoing template T m may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template T m .
  • the upper left pixel area in the template T m is the upper left pixel in the template T m or a pixel block that is in the template T m and that includes the upper left pixel in the template T m ;
  • the lower left pixel area in the template T m is the lower left pixel in the template T m or a pixel block that is in the template T m and that includes the lower left pixel in the template T m ;
  • the upper right pixel area in the template T m is the upper right pixel in the template T m or a pixel block that is in the template T m and that includes the upper right pixel in the template T m ;
  • the center pixel area in the template T m is the center pixel in the template T m or a pixel block that is in the template T m and that includes the center pixel in the template T m .
  • the distance between the pixel area w and the determined pixel area in the template T m may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template T m , where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template T m , or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template T m (where the determined pixel in the template T m is,
  • a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different.
  • the determining N templates whose degrees of matching with a current template meet a preset condition includes: determining N templates with a highest degree of matching with the current template; or determining M templates with a highest degree of matching with the current template, and determining, from the M templates, N templates that meet the preset condition, where N is less than M.
  • the determining, from the M templates, N templates that meet the preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
  • the threshold may be another preset value. It can be understood that the screening threshold helps screen out templates with a relatively low matching degree so that they are not involved in the operation, which improves prediction accuracy and reduces operation complexity.
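  • The screening step can be sketched as follows, under stated assumptions: distortion is measured here as a sum of absolute differences (the text does not fix a particular distortion measure), and the "adjustment value" of the average distortion is modeled as a multiplicative factor `adjustment`. Both choices, and the names `sad` and `screen_templates`, are hypothetical.

```python
def sad(template_a, template_b):
    """Sum of absolute differences between two equally sized templates."""
    return sum(abs(a - b) for a, b in zip(template_a, template_b))


def screen_templates(current, candidates, adjustment=1.0):
    """Keep the candidates whose distortion does not exceed the threshold.

    The threshold is the average distortion over all M candidates, scaled by
    `adjustment` (an assumed way of modeling the adjustment value).
    Returns a list of (template, distortion) pairs.
    """
    distortions = [sad(current, t) for t in candidates]
    threshold = adjustment * sum(distortions) / len(distortions)
    return [(t, e) for t, e in zip(candidates, distortions) if e <= threshold]
```

  • With three candidates whose distortions are 1, 2, and 40, the average is about 14.3, so only the first two candidates are kept.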
  • the determining a weight of a pixel area in a picture block corresponding to each template in the N templates may include: determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
  • the determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template may include: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
  • E (T x ,T m ) represents distortion between the current template T x and a template T m in the N templates
  • S represents a quantity of pixel areas in the current template T x
  • σ represents a template scaling factor
  • a and σ are real numbers greater than 0
  • w m represents a weight of the template T m
  • the template T m is any template in the N templates.
  • when a value of a is an integer such as 2, 3, or 4, operation complexity is reduced.
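  • As a minimal sketch only, the code below assumes that the template weight decays exponentially with the distortion per pixel area, that is, a raised to the power of minus E(T x ,T m ) divided by the product of S and the template scaling factor. This exact form, the function name `template_weight`, and the parameter name `sigma` for the template scaling factor are assumptions built from the symbol definitions above, not a confirmed reproduction of the formula.

```python
def template_weight(distortion, num_areas, a=2.0, sigma=1.0):
    """Weight of one template under an ASSUMED exponential-decay form.

    distortion : E(T_x, T_m), distortion between current and candidate template
    num_areas  : S, quantity of pixel areas in the current template
    a, sigma   : real numbers greater than 0 (base and template scaling factor)
    """
    if a <= 0 or sigma <= 0:
        raise ValueError("a and sigma must be greater than 0")
    return a ** (-distortion / (num_areas * sigma))
```

  • Under this assumption a template that matches perfectly receives weight 1, and the weight shrinks as the distortion grows; with a = 2 the power can be computed with shifts in a fixed-point implementation, which is one reading of why an integer base simplifies the operation.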
  • w m (i,j) represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template T m in the N templates
  • R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template T m
  • φ(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
  • the non-linear relationship between φ(R(i,j)) and R(i,j) is: a value of φ(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • for example, the non-linear relationship between φ(R(i,j)) and R(i,j) may be:
  • $$\varphi(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
  • for another example, the non-linear relationship between φ(R(i,j)) and R(i,j) may be:
  • $$\varphi(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
  • for another example, the non-linear relationship between φ(R(i,j)) and R(i,j) may be:
  • $$\varphi(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
  • for example, b6 = 5.
  • the non-linear relationship between φ(R(i,j)) and R(i,j) may be a piecewise function relationship.
  • a piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
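  • The piecewise relationship can be sketched with a single helper that handles any number of segments. The name `scaling_factor` and the sample breakpoints and values used in the usage note are illustrative; treating each interval as closed on its upper bound is an assumption about the boundary handling.

```python
def scaling_factor(d, breakpoints, values):
    """Piecewise-constant pixel area scaling factor.

    breakpoints : ascending thresholds, e.g. [b1, b2]
    values      : ascending factors, one more than breakpoints, e.g. [a1, a2, a3]
    Returns values[k] for the first interval whose upper bound is >= d,
    and the last value when d exceeds every breakpoint.
    """
    if len(values) != len(breakpoints) + 1:
        raise ValueError("need exactly one more value than breakpoints")
    for bound, value in zip(breakpoints, values):
        if d <= bound:
            return value
    return values[-1]
```

  • For instance, with hypothetical breakpoints [2, 4] and values [1.0, 2.0, 3.0], a distance of 3 falls in the middle interval and maps to 2.0, while a distance of 5 maps to 3.0.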
  • R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template T m , and
  • e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template T m or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template T m .
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template T m ; or
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template T m , where
  • the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template T m is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template T m except the pixel y, that is, the pixel y is a pixel in the template T m that is closest to the pixel area.
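  • The candidate definitions of R(i,j) can be sketched as follows, assuming single-pixel pixel areas and Euclidean distance (the text does not fix a distance metric). The helper names and the use of an unweighted template mean for e(i,j) are illustrative assumptions.

```python
import math


def nearest_template_distance(pixel, template_coords):
    """d(i,j) as the distance to the closest template pixel (the pixel y)."""
    r, c = pixel
    return min(math.hypot(r - tr, c - tc) for tr, tc in template_coords)


def upper_left_distance(pixel, template_coords):
    """d(i,j) as the distance to the upper left pixel of the template."""
    # Smallest row first, then smallest column: the upper left template pixel.
    tr, tc = min(template_coords)
    return math.hypot(pixel[0] - tr, pixel[1] - tc)


def abs_difference_to_template_mean(pixel_value, template_values):
    """e(i,j) as |pixel value - average pixel value of the template|."""
    mean = sum(template_values) / len(template_values)
    return abs(pixel_value - mean)
```

  • Either distance can then be passed to a pixel area scaling factor; the nearest-pixel variant gives small values along the upper and left edges of the block, while the upper-left variant grows toward the lower right corner.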
  • the calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • w m (i,j) represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template T m in the N templates
  • p m (i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m
  • pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
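  • A minimal sketch of the final combination step is given here, assuming that pre(i,j) is the weight-normalized sum of the co-located pixel values p m (i,j) over the N picture blocks. The normalization by the sum of the weights and the function name `predict_pixel` are assumptions, since only the symbol definitions are available in the text.

```python
def predict_pixel(weights, pixel_values):
    """Predicted value of one pixel area from N co-located candidates.

    weights      : w_m(i,j) for m = 1..N
    pixel_values : p_m(i,j) for m = 1..N
    Returns the weighted average (ASSUMED normalization by the weight sum).
    """
    if len(weights) != len(pixel_values):
        raise ValueError("one weight per pixel value is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, pixel_values)) / total
```

  • For two candidate blocks with weights 0.75 and 0.25 and pixel values 100 and 120, the sketch returns 105.0, which lies closer to the better-matching candidate.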
  • the picture prediction method provided in this embodiment may be applied in a video coding process, or may be applied in a video decoding process.
  • FIG. 2 is a schematic flowchart of another picture prediction method according to another embodiment of the present application.
  • the picture prediction method provided in this other embodiment of the present application may include the following steps.
  • M may be a specified value, that is, the M templates with the highest degree of matching with the current template may be selected from several candidate templates. For example, five templates with a highest degree of matching with the current template may be determined from ten candidate templates. Certainly, degrees of matching between the five templates and the current template are not necessarily equal, but a degree of matching between any template in the five templates and the current template is greater than or equal to a degree of matching between any candidate template except the five templates and the current template.
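  • Selecting the M best-matching templates can be sketched as a simple ranking, assuming that a higher degree of matching corresponds to a lower distortion value. The function name `top_m_templates` and the distortion values in the usage note are hypothetical.

```python
def top_m_templates(distortions, m):
    """Return the indices of the m candidates with the lowest distortion.

    Ties are broken by candidate order, because Python's sort is stable.
    """
    order = sorted(range(len(distortions)), key=lambda k: distortions[k])
    return order[:m]
```

  • With ten candidates whose distortions are [9, 3, 7, 1, 8, 2, 6, 4, 10, 5], choosing M = 5 returns the indices [3, 5, 1, 7, 9], that is, the candidates with distortions 1 through 5.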
  • the current template is a template corresponding to a current picture block.
  • the M templates are obtained by searching reference pictures of the current picture block. M is a positive integer.
  • M may be equal to 2, 3, 4, 5, 7, 9, 12, or another value.
  • the determining, from the M templates, N templates that meet a preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
  • the threshold may be another preset value. It can be understood that the screening threshold helps screen out templates with a relatively low matching degree so that they are not involved in the operation, which improves prediction accuracy and reduces operation complexity.
  • the weight of each template in the N templates may be determined according to a degree of matching between each template in the N templates and the current template.
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be determined based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • Weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different.
  • the pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
  • a template may be L-shaped or in another shape.
  • a template corresponding to a picture block may be an L-shaped pixel area adjacent to the picture block.
  • for example, as shown in FIG. 1-d, an L-shaped pixel area T x adjacent to a picture block x is a template T x corresponding to the picture block x, an L-shaped pixel area T m1 adjacent to a picture block m 1 is a template T m1 corresponding to the picture block m 1, and an L-shaped pixel area T m2 adjacent to a picture block m 2 is a template T m2 corresponding to the picture block m 2 .
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • a parameter about a similarity between two objects may be used to represent the similarity between the two objects.
  • a parameter about a similarity between a template T m in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template T m and the pixel area w in the picture block m.
  • the picture block m is a picture block corresponding to the template T m .
  • the template T m may be any template in the N templates.
  • the pixel area w is any pixel area in the picture block m.
  • the parameter about the similarity between the template T m and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template T m , or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template T m to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template T m and a pixel value of the pixel area w.
  • the determined pixel area in the foregoing template T m may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template T m .
  • the upper left pixel area in the template T m is the upper left pixel in the template T m or a pixel block that is in the template T m and that includes the upper left pixel in the template T m ;
  • the lower left pixel area in the template T m is the lower left pixel in the template T m or a pixel block that is in the template T m and that includes the lower left pixel in the template T m ;
  • the upper right pixel area in the template T m is the upper right pixel in the template T m or a pixel block that is in the template T m and that includes the upper right pixel in the template T m ;
  • the center pixel area in the template T m is the center pixel in the template T m or a pixel block that is in the template T m and that includes the center pixel in the template T m .
  • the distance between the pixel area w and the determined pixel area in the template T m may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template T m , where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template T m , or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template T m (where the determined pixel in the template T m is,
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
  • the determining the weight of each template in the N templates according to the degree of matching between each template in the N templates and the current template may include: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
  • E (T x ,T m ) represents distortion between the current template T x and a template T m in the N templates
  • S represents a quantity of pixel areas in the current template T x
  • σ represents a template scaling factor
  • a and σ are real numbers greater than 0
  • the template T m is any template in the N templates.
  • w m (i,j) represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template T m in the N templates
  • R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template T m
  • φ(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
  • the non-linear relationship between φ(R(i,j)) and R(i,j) is: a value of φ(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • for example, the non-linear relationship between φ(R(i,j)) and R(i,j) may be:
  • $$\varphi(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
  • for another example, the non-linear relationship between φ(R(i,j)) and R(i,j) may be:
  • $$\varphi(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
  • for another example, the non-linear relationship between φ(R(i,j)) and R(i,j) may be:
  • $$\varphi(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
  • the non-linear relationship between φ(R(i,j)) and R(i,j) may be a piecewise function relationship.
  • a piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
  • R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template T m , and
  • e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template T m or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template T m .
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template T m ; or
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template T m , where
  • the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template T m is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template T m except the pixel y, that is, the pixel y is a pixel in the template T m that is closest to the pixel area.
  • the calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • w m (i,j) represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template T m in the N templates
  • p m (i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m
  • pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
  • the picture prediction method provided in this embodiment may be applied in a video coding process, or may be applied in a video decoding process.
  • N templates whose degrees of matching with the current template meet a preset condition are further determined from the M templates, a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the pixel area includes at least one pixel.
  • Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
  • FIG. 3 is a schematic flowchart of another picture prediction method according to another embodiment of the present application.
  • the video coding method provided in this other embodiment of the present application may include the following steps.
  • a video coding apparatus determines M templates with a highest degree of matching with a current template.
  • M may be a specified value, that is, the M templates with the highest degree of matching with the current template may be selected from several candidate templates. For example, five templates with a highest degree of matching with the current template may be determined from ten candidate templates. Certainly, degrees of matching between the five templates and the current template are not necessarily equal, but a degree of matching between any template in the five templates and the current template is greater than or equal to a degree of matching between any candidate template except the five templates and the current template.
  • the current template is a template corresponding to a current picture block.
  • the M templates are obtained by searching reference pictures of the current picture block. M is a positive integer.
  • M may be equal to 2, 3, 4, 5, 7, 9, 12, or another value.
  • the video coding apparatus determines, from the M templates, N templates that meet a preset condition.
  • the determining, from the M templates, N templates that meet the preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
  • the threshold may be another preset value. It can be understood that the screening threshold helps screen out templates with a relatively low matching degree so that they are not involved in the operation, which improves prediction accuracy and reduces operation complexity.
  • the video coding apparatus determines a weight of each template in the N templates.
  • the weight of each template in the N templates may be determined according to a degree of matching between each template in the N templates and the current template.
  • the video coding apparatus determines, based on the weight of each template in the N templates, a weight of a pixel area in a picture block corresponding to each template in the N templates.
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be determined based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • Weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different.
  • the pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
  • a template may be L-shaped or in another shape.
  • a template corresponding to a picture block may be an L-shaped pixel area adjacent to the picture block.
  • for example, as shown in FIG. 1-d, an L-shaped pixel area T x adjacent to a picture block x is a template T x corresponding to the picture block x, an L-shaped pixel area T m1 adjacent to a picture block m 1 is a template T m1 corresponding to the picture block m 1, and an L-shaped pixel area T m2 adjacent to a picture block m 2 is a template T m2 corresponding to the picture block m 2 .
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • a parameter about a similarity between two objects may be used to represent the similarity between the two objects.
  • a parameter about a similarity between a template T m in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template T m and the pixel area w in the picture block m.
  • the picture block m is a picture block corresponding to the template T m .
  • the template T m may be any template in the N templates.
  • the pixel area w is any pixel area in the picture block m.
  • the parameter about the similarity between the template T m and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template T m , or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template T m to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template T m and a pixel value of the pixel area w.
  • the determined pixel area in the foregoing template T m may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template T m .
  • the upper left pixel area in the template T m is the upper left pixel in the template T m or a pixel block that is in the template T m and that includes the upper left pixel in the template T m ;
  • the lower left pixel area in the template T m is the lower left pixel in the template T m or a pixel block that is in the template T m and that includes the lower left pixel in the template T m ;
  • the upper right pixel area in the template T m is the upper right pixel in the template T m or a pixel block that is in the template T m and that includes the upper right pixel in the template T m ;
  • the center pixel area in the template T m is the center pixel in the template T m or a pixel block that is in the template T m and that includes the center pixel in the template T m .
  • the distance between the pixel area w and the determined pixel area in the template T m may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template T m , where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template T m or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template T m (where the determined pixel in the template T m is, for example
  • the video coding apparatus calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
  • that the weight of each template in the N templates is determined according to the degree of matching between each template in the N templates and the current template may include: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
  • E (T x ,T m ) represents distortion between the current template T x and a template T m in the N templates
  • S represents a quantity of pixel areas in the current template T x
  • σ represents a template scaling factor
  • a and σ are real numbers greater than 0
  • w m represents a weight of the template T m
  • the template T m is any template in the N templates.
  • when a value of a is, for example, an integer such as 2, 3, or 4, operation complexity is simplified.
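  • the formula itself is not reproduced in the text above. As a minimal sketch only, the definitions could be combined as follows, assuming an exponential form in which the weight equals a raised to the power −E/(S × template scaling factor). This form, the function name `template_weight`, and the parameter names are assumptions made for illustration and are not taken from the text.

```python
def template_weight(distortion, num_pixel_areas, scaling_factor, a=2.0):
    """Hypothetical template weight: a ** (-E / (S * scaling_factor)).

    distortion      -- E(Tx, Tm), distortion between the current template and Tm
    num_pixel_areas -- S, quantity of pixel areas in the current template
    scaling_factor  -- template scaling factor, must be greater than 0
    a               -- base of the power, must be greater than 0
    """
    if num_pixel_areas <= 0 or scaling_factor <= 0 or a <= 0:
        raise ValueError("S, the scaling factor and a must be greater than 0")
    return a ** (-distortion / (num_pixel_areas * scaling_factor))


# A perfectly matching template gets weight 1; larger distortion gives less weight.
w_exact = template_weight(0, 16, 1.0)
w_worse = template_weight(16, 16, 1.0)
```

  • under this assumed form, a template with lower distortion relative to the current template receives a larger weight, which is the behavior that a weight determined according to the degree of matching would be expected to show.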
  • w m (i,j) represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template T m in the N templates
  • R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template T m
  • δ(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
  • the non-linear relationship between δ(R(i,j)) and R(i,j) is: a value of δ(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • for example, the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
  • alternatively, the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
  • alternatively, the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be a piecewise function relationship.
  • a piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
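  • the piecewise relationship described above can be sketched in Python as a lookup over sorted breakpoints. The function name, the argument names, and the example values are hypothetical. Treating each interval as closed on its upper bound (for example, b1 < d ≤ b2) is an assumption, because the comparison operators are not legible in the source formulas.

```python
from bisect import bisect_left


def pixel_area_scaling_factor(r, breakpoints, factors):
    """Return the scaling factor of the interval that r falls in.

    breakpoints -- increasing thresholds, e.g. [b1, b2]
    factors     -- one factor per interval, e.g. [a1, a2, a3], where
                   r <= b1 -> a1, b1 < r <= b2 -> a2, r > b2 -> a3
    """
    if len(factors) != len(breakpoints) + 1:
        raise ValueError("need exactly one more factor than breakpoints")
    if any(x >= y for x, y in zip(breakpoints, breakpoints[1:])):
        raise ValueError("breakpoints must be strictly increasing")
    # bisect_left counts the breakpoints strictly smaller than r, which is
    # the index of the interval under the closed-upper-bound convention.
    return factors[bisect_left(breakpoints, r)]


# Hypothetical three-piece example: a1 < a2 < a3 and b1 < b2.
example = [pixel_area_scaling_factor(d, [2, 4], [1.0, 1.5, 2.0]) for d in (1, 2, 3, 4, 5)]
```

  • the same function covers two, three, four, or more pieces, because the number of pieces is set only by the lengths of the two lists.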
  • R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template T m
  • e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template T m or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template T m .
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template T m
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template T m
  • the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template T m is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template T m except the pixel y.
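  • the two choices of d(i,j) described above (distance to the upper left template pixel, and distance to the closest template pixel y) can be sketched in Python as follows. The text does not state the distance metric, so Euclidean distance is an assumption, as are the (row, column) coordinate convention, the template thickness of one pixel, and all names used here.

```python
import math


def nearest_template_distance(i, j, template_coords):
    """Distance from the pixel area at (i, j) to the closest template pixel."""
    return min(math.hypot(i - r, j - c) for r, c in template_coords)


# Hypothetical L-shaped template of thickness 1 around a 4x4 picture block
# whose upper left pixel is at (0, 0): one row above and one column to the left.
block_size = 4
template = [(r, c)
            for r in range(-1, block_size)
            for c in range(-1, block_size)
            if r < 0 or c < 0]

# d(1, 2) measured to the closest template pixel, which is (-1, 2).
d_nearest = nearest_template_distance(1, 2, template)
# d(1, 2) measured to the upper left template pixel at (-1, -1).
d_upper_left = math.hypot(1 - (-1), 2 - (-1))
```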
  • that the predicted pixel value of the pixel area in the current picture block is calculated based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • w m (i,j) represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template T m in the N templates
  • p m (i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m
  • pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
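  • the prediction formula is not reproduced in the text above. A minimal Python sketch, assuming that the predicted value is the weighted average of the co-located pixel values of the N picture blocks, normalized by the sum of the weights (the normalization and all names are assumptions for illustration):

```python
def predict_pixel_area(weights, pixel_values):
    """Hypothetical pre(i, j): weighted average over the N picture blocks.

    weights      -- per-block weights of the pixel area at (i, j), one per template
    pixel_values -- per-block pixel values p_m(i, j) at the same coordinates
    """
    if not weights or len(weights) != len(pixel_values):
        raise ValueError("need one weight per pixel value")
    total = sum(weights)
    if total <= 0:
        raise ValueError("the sum of the weights must be greater than 0")
    return sum(w * p for w, p in zip(weights, pixel_values)) / total


# Two picture blocks: the better-matching one (weight 3) dominates the result.
pre = predict_pixel_area([3.0, 1.0], [100, 120])
```

  • with all weights equal to 1, the sketch reduces to a plain average over the N picture blocks, which corresponds to the equal-weight case that the text contrasts against.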
  • the video coding apparatus obtains a prediction residual of the current picture block by using an original pixel value of the pixel area in the current picture block and the predicted pixel value of the pixel area in the current picture block.
  • the video coding apparatus writes the prediction residual of the current picture block into a video bitstream.
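  • the residual step on the coding side can be sketched in Python as a per-pixel difference. Transform, quantization, and entropy coding, which a real encoder would normally apply before writing the bitstream, are omitted; the function name and the list-of-rows block layout are assumptions.

```python
def prediction_residual(original, predicted):
    """Per-pixel difference between original and predicted pixel values."""
    return [[o - p for o, p in zip(orig_row, pred_row)]
            for orig_row, pred_row in zip(original, predicted)]


# Hypothetical 2x2 current picture block and its prediction.
residual = prediction_residual([[10, 12], [8, 9]], [[9, 12], [10, 7]])
```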
  • the picture prediction method provided in this embodiment may be applied in a video coding process, or may be applied in a video decoding process.
  • N templates whose degrees of matching with the current template meet a preset condition are further determined from the M templates, a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the pixel area includes at least one pixel.
  • Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
  • FIG. 4 is a schematic flowchart of another picture prediction method according to another embodiment of the present application.
  • the video decoding method provided in this other embodiment of the present application may include the following steps.
  • a video decoding apparatus determines M templates with a highest degree of matching with a current template.
  • M may be a specified value, that is, the M templates with the highest degree of matching with the current template may be selected from several candidate templates. For example, five templates with a highest degree of matching with the current template may be determined from ten candidate templates. Certainly, degrees of matching between the five templates and the current template are not necessarily equal, but a degree of matching between any template in the five templates and the current template is greater than or equal to a degree of matching between any candidate template except the five templates and the current template.
  • the current template is a template corresponding to a current picture block.
  • the M templates are obtained by searching reference pictures of the current picture block. M is a positive integer.
  • M may be equal to 2, 3, 4, 5, 7, 9, 12, or another value.
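  • selecting the M best-matching templates from a set of candidates can be sketched in Python as follows. The text does not fix the matching measure, so the sum of absolute differences (SAD) is used here as an assumed distortion measure, with lower distortion meaning a higher degree of matching; the function names and the flat-list template layout are also assumptions.

```python
def sad(template_a, template_b):
    """Sum of absolute differences between two templates of equal length."""
    return sum(abs(x - y) for x, y in zip(template_a, template_b))


def select_best_templates(current_template, candidates, m):
    """Return (distortion, candidate index) pairs of the m best matches."""
    scored = sorted((sad(current_template, cand), idx)
                    for idx, cand in enumerate(candidates))
    return scored[:m]


# Hypothetical current template and four candidates found in reference pictures.
current = [10, 10, 10]
candidates = [[10, 10, 10], [0, 0, 0], [11, 10, 9], [20, 20, 20]]
best_two = select_best_templates(current, candidates, 2)
```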
  • the video decoding apparatus determines, from the M templates, N templates that meet a preset condition.
  • that the N templates that meet the preset condition are determined from the M templates includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
  • the threshold may be another preset value. It can be understood that the screening threshold is introduced to screen out templates with a relatively low matching degree, so that they are not involved in the operation. In this way, prediction accuracy is improved and operation complexity is reduced.
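  • the screening step can be sketched in Python as follows. The threshold is the average distortion of the M templates, optionally scaled; modeling the "adjustment value" as a multiplicative factor is an assumption, as are the names.

```python
def screen_templates(distortions, adjustment=1.0):
    """Keep indices of templates whose distortion is at most the threshold.

    distortions -- distortion of each of the M templates against the current template
    adjustment  -- hypothetical multiplicative adjustment of the average distortion
    """
    if not distortions:
        return []
    threshold = adjustment * sum(distortions) / len(distortions)
    return [idx for idx, e in enumerate(distortions) if e <= threshold]


# Four templates with average distortion 8: the outlier (20) is screened out.
kept = screen_templates([2, 4, 6, 20])
kept_strict = screen_templates([2, 4, 6, 20], adjustment=0.5)
```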
  • the video decoding apparatus determines a weight of each template in the N templates.
  • the weight of each template in the N templates may be determined according to a degree of matching between each template in the N templates and the current template.
  • the video decoding apparatus determines, based on the weight of each template in the N templates, a weight of a pixel area in a picture block corresponding to each template in the N templates.
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be determined based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • Weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different.
  • the pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
  • a template may be L-shaped or in another shape.
  • as shown in FIG. 1-d, a template corresponding to a picture block may be an L-shaped pixel area adjacent to the picture block.
  • an L-shaped pixel area T x adjacent to a picture block x is a template T x corresponding to the picture block x
  • an L-shaped pixel area T m1 adjacent to a picture block m 1 is a template T m1 corresponding to the picture block m 1
  • an L-shaped pixel area T m2 adjacent to a picture block m 2 is a template T m2 corresponding to the picture block m 2 .
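  • extracting such an L-shaped template from a picture can be sketched in Python as follows. The template thickness, the list-of-rows picture layout, and the function name are assumptions for illustration; the text only states that the template is an L-shaped pixel area adjacent to the picture block.

```python
def l_shaped_template(picture, top, left, block_size, thickness=1):
    """Return the pixel values of the L-shaped area above and left of a block.

    picture      -- 2D list of pixel values, indexed as picture[row][col]
    top, left    -- coordinates of the upper left pixel of the picture block
    block_size   -- width and height of the (square) picture block
    thickness    -- hypothetical width of the template strips, in pixels
    """
    if top < thickness or left < thickness:
        raise ValueError("the template would fall outside the picture")
    values = []
    for r in range(top - thickness, top + block_size):
        for c in range(left - thickness, left + block_size):
            inside_block = r >= top and c >= left
            if not inside_block:
                values.append(picture[r][c])
    return values


# Hypothetical 4x4 picture with values 0..15 and a 2x2 block starting at (1, 1).
picture = [[row * 4 + col for col in range(4)] for row in range(4)]
template_values = l_shaped_template(picture, top=1, left=1, block_size=2)
```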
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • a parameter about a similarity between two objects may be used to represent the similarity between the two objects.
  • a parameter about a similarity between a template T m in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template T m and the pixel area w in the picture block m.
  • the picture block m is a picture block corresponding to the template T m .
  • the template T m may be any template in the N templates.
  • the pixel area w is any pixel area in the picture block m.
  • the parameter about the similarity between the template T m and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template T m or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template T m to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template T m and a pixel value of the pixel area w.
  • the determined pixel area in the foregoing template T m may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template T m .
  • the upper left pixel area in the template T m is the upper left pixel in the template T m or a pixel block that is in the template T m and that includes the upper left pixel in the template T m ;
  • the lower left pixel area in the template T m is the lower left pixel in the template T m or a pixel block that is in the template T m and that includes the lower left pixel in the template T m ;
  • the upper right pixel area in the template T m is the upper right pixel in the template T m or a pixel block that is in the template T m and that includes the upper right pixel in the template T m ;
  • the center pixel area in the template T m is the center pixel in the template T m or a pixel block that is in the template T m and that includes the center pixel in the template T m .
  • the distance between the pixel area w and the determined pixel area in the template T m may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template T m , where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template T m or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template T m (where the determined pixel in the template T m is, for example
  • the video decoding apparatus calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
  • that the weight of each template in the N templates is determined according to the degree of matching between each template in the N templates and the current template may include: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
  • E (T x ,T m ) represents distortion between the current template T x and a template T m in the N templates
  • S represents a quantity of pixel areas in the current template T x
  • σ represents a template scaling factor
  • a and σ are real numbers greater than 0
  • w m represents a weight of the template T m
  • the template T m is any template in the N templates.
  • when a value of a is, for example, an integer such as 2, 3, or 4, operation complexity is simplified.
  • w m (i,j) represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template T m in the N templates
  • R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template T m
  • δ(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
  • the non-linear relationship between δ(R(i,j)) and R(i,j) is: a value of δ(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • for example, the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
  • alternatively, the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
  • alternatively, the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be a piecewise function relationship.
  • a piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
  • R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template T m
  • e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template T m or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template T m .
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template T m
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template T m
  • the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template T m is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template T m except the pixel y.
  • that the predicted pixel value of the pixel area in the current picture block is calculated based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • w m (i,j) represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template T m in the N templates
  • p m (i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m
  • pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
  • the video decoding apparatus decodes a video bitstream to obtain a prediction residual of the current picture block.
  • the video decoding apparatus reconstructs the current picture block by using the predicted pixel value of the current picture block and the prediction residual of the current picture block.
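  • the reconstruction step on the decoding side can be sketched in Python as adding the decoded residual to the prediction. Clipping the result to the valid sample range for a given bit depth is an assumption that is common in video codecs but is not stated in the text; the names are hypothetical.

```python
def reconstruct_block(predicted, residual, bit_depth=8):
    """Add the residual to the prediction and clip to [0, 2**bit_depth - 1]."""
    max_value = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(predicted, residual)]


# Hypothetical 2x2 block: the last sample would overflow 8 bits and is clipped.
reconstructed = reconstruct_block([[9, 12], [10, 250]], [[1, 0], [-2, 10]])
```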
  • N templates whose degrees of matching with the current template meet a preset condition are further determined from the M templates, a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the pixel area includes at least one pixel.
  • Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
  • coding efficiency for each color component is improved as follows: Y by 8.3%, U by 7.8%, and V by 7.5%.
  • under a condition 2, that is, in an application scenario in which N templates that meet a condition are directly determined from candidate templates, and there is a piecewise function relationship between δ(R(i,j)) and R(i,j), coding efficiency for each color component is improved as follows: Y by 8.1%, U by 7.4%, and V by 7.5%.
  • Rate-distortion performance improvement effects obtained by means of testing based on a ClassF sequence are shown in the following table.
  | Sequence | Condition 1 | Condition 2 | Condition 3 |
  |---|---|---|---|
  | BasketballDrillText | −2.9% | −3.3% | −3.3% |
  | ChinaSpeed | −4.0% | −3.6% | −4.2% |
  | SlideEditing | −19.2% | −18.5% | −19.4% |
  | SlideShow | −7.0% | −6.9% | −7.1% |
  | ClassF sequence average | −8.3% | −8.1% | −8.5% |
  • an embodiment of the present application further provides a picture prediction apparatus 500 , and the apparatus 500 may include:
  • a first determining unit 510 configured to determine N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, the N templates are obtained by searching reference pictures of the current picture block, and N is a positive integer;
  • a second determining unit 520 configured to determine a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different;
  • a predicting unit 530 configured to calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
  • the second determining unit 520 includes:
  • a first determining subunit configured to determine a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template
  • a second determining subunit configured to determine, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block includes: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
  • the first determining subunit is specifically configured to determine the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
  • E (T x ,T m ) represents distortion between the current template T x and a template T m in the N templates
  • S represents a quantity of pixel areas in the current template T x
  • σ represents a template scaling factor
  • a and σ are real numbers greater than 0
  • w m represents a weight of the template T m
  • the template T m is any template in the N templates.
  • w m (i,j) represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template T m in the N templates
  • R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template T m
  • δ(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
  • the non-linear relationship between δ(R(i,j)) and R(i,j) is: a value of δ(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) is:
  • $$\delta(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0; or
  • $$\delta(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0; or
  • $$\delta(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
  • R(i,j) is equal to d(i,j) or e(i,j), d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template T m , and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template T m or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template T m .
  • d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template T m
  • d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template T m
  • the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template T m is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template T m except the pixel y.
  • the predicting unit 530 is specifically configured to calculate the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • w m (i,j) represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template T m in the N templates
  • p m (i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m
  • pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
  • the first determining unit 510 is specifically configured to: determine M templates with a highest degree of matching with the current template; and determine, from the M templates, N templates that meet the preset condition, where N is less than M.
  • the first determining unit 510 is specifically configured to: determine the M templates with the highest degree of matching with the current template; and determine N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
  • the picture prediction apparatus is applied to a video coding apparatus, or the picture prediction apparatus is applied to a video decoding apparatus.
  • the picture prediction apparatus 500 may be any apparatus that needs to output or play a video, for example, a device such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.
  • the picture prediction apparatus 500 determines a weight of a pixel area in a picture block corresponding to each template in the N templates, and calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different.
  • FIG. 6 is a schematic diagram of a picture prediction apparatus 600 according to an embodiment of the present application.
  • the picture prediction apparatus 600 may include at least one bus 601 , at least one processor 602 connected to the bus 601 , and at least one memory 603 connected to the bus 601 .
  • the processor 602 invokes, by using the bus 601 , code stored in the memory 603 , so as to determine N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block; determine a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
  • N may be equal to 1, 2, 3, 4, 5, 7, 9, 12, or another value.
  • the pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
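To make the notion of a pixel area concrete, the sketch below partitions a picture block into equally sized pixel areas and records which pixels each area covers. The function name and the requirement that the block size be a multiple of the area size are assumptions made for this illustration; the text does not impose that restriction.

```python
def split_into_pixel_areas(block_h, block_w, area_h, area_w):
    """Map each pixel area index (ai, aj) to the pixel coordinates it covers."""
    if block_h % area_h or block_w % area_w:
        # Simplifying assumption of this sketch, not a requirement of the text.
        raise ValueError("block size must be a multiple of the pixel area size")
    areas = {}
    for i in range(block_h):
        for j in range(block_w):
            areas.setdefault((i // area_h, j // area_w), []).append((i, j))
    return areas
```

With a 1×1 area size every pixel is its own pixel area, which corresponds to the case in which the pixel area is a single pixel.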
  • the templates in the N templates correspond to different picture blocks, and therefore, the N templates correspond to N picture blocks.
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • a parameter about a similarity between two objects may be used to represent the similarity between the two objects.
  • a parameter about a similarity between a template T m in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template T m and the pixel area w in the picture block m.
  • the picture block m is a picture block corresponding to the template T m .
  • the template T m may be any template in the N templates.
  • the pixel area w is any pixel area in the picture block m.
  • the parameter about the similarity between the template T m and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template T m or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template T m to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template T m and a pixel value of the pixel area w.
  • the determined pixel area in the foregoing template T m may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template T m .
  • the upper left pixel area in the template T m is the upper left pixel in the template T m or a pixel block that is in the template T m and that includes the upper left pixel in the template T m ;
  • the lower left pixel area in the template T m is the lower left pixel in the template T m or a pixel block that is in the template T m and that includes the lower left pixel in the template T m ;
  • the upper right pixel area in the template T m is the upper right pixel in the template T m or a pixel block that is in the template T m and that includes the upper right pixel in the template T m ;
  • the center pixel area in the template T m is the center pixel in the template T m or a pixel block that is in the template T m and that includes the center pixel in the template T m .
  • the distance between the pixel area w and the determined pixel area in the template T m may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template T m , where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template T m , or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template T m (where the determined pixel in the template T m is,
  • determining the N templates whose degrees of matching with the current template meet the preset condition includes: determining N templates with a highest degree of matching with the current template; or determining M templates with a highest degree of matching with the current template, and determining, from the M templates, N templates that meet the preset condition, where N is less than M.
  • the processor 602 is configured to determine N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
  • the threshold may be another preset value. It can be understood that a screening threshold is introduced to screen out templates with a relatively low matching degree, so that they are not involved in the operation; this helps improve prediction accuracy and reduce operation complexity.
  • that the processor 602 is configured to determine the weight of the pixel area in the picture block corresponding to each template in the N templates may include: determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
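The pixel-value-based similarity parameters listed above (a ratio or an absolute difference between a template average and a pixel area value) might be computed as in the following sketch. The function names and the `mode` switch are hypothetical, and the ratio is taken in the direction stated in the preceding paragraph (template average divided by pixel area value).

```python
def template_area_mean(values, weights=None):
    """Plain average, or weighted average when per-pixel weights are given."""
    if weights is None:
        return sum(values) / len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def similarity_parameter(template_area_values, pixel_area_value,
                         mode="abs_diff", weights=None):
    """Similarity parameter between a determined template area and a pixel area."""
    mean = template_area_mean(template_area_values, weights)
    if mode == "ratio":
        # Undefined when pixel_area_value is 0; a real codec would need a guard.
        return mean / pixel_area_value
    if mode == "abs_diff":
        return abs(mean - pixel_area_value)
    raise ValueError("mode must be 'ratio' or 'abs_diff'")
```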
  • the processor 602 may be configured to determine the weight of each template in the N templates based on the following formulas and according to the degree of matching between each template in the N templates and the current template:
  • E (T x ,T m ) represents distortion between the current template T x and a template T m in the N templates
  • S represents a quantity of pixel areas in the current template T x
  • σ represents a template scaling factor
  • a and σ are real numbers greater than
  • w m represents a weight of the template T m
  • the template T m is any template in the N templates.
  • when a value of a is an integer, for example, 2, 3, or 4, operation complexity is reduced.
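The formula for the template weight is not reproduced in this text, only its symbols. One form consistent with those symbols is an exponential decay of the per-area distortion, w m = a^(−E(T x, T m)/(S·σ)). The sketch below uses that form purely as an assumption; the function name and the default values of `sigma` and `a` are hypothetical.

```python
def template_weight(distortion, num_pixel_areas, sigma=1.0, a=2.0):
    """Assumed form: w_m = a ** (-E(T_x, T_m) / (S * sigma)).

    distortion:      E(T_x, T_m), distortion between current and candidate template.
    num_pixel_areas: S, quantity of pixel areas in the template.
    sigma:           template scaling factor (hypothetical default).
    a:               base of the exponential (hypothetical default).
    """
    if num_pixel_areas <= 0 or sigma <= 0 or a <= 0:
        raise ValueError("S, sigma and a must be positive in this sketch")
    return a ** (-distortion / (num_pixel_areas * sigma))
```

Under this assumed form, a perfectly matching template (zero distortion) gets weight 1, and the weight shrinks as distortion grows whenever a is greater than 1.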
  • w m (i,j) represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template T m in the N templates
  • R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template T m
  • δ(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
  • the non-linear relationship between δ(R(i,j)) and R(i,j) is: a value of δ(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
  • b6 may be, for example, equal to 5.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be a piecewise function relationship.
  • a piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
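A piecewise-constant scaling factor with any number of segments can be expressed as a table lookup. The sketch below is one way to do so; the function name is hypothetical, the numeric values used with it are illustrative only, and the intervals are taken as closed on the right (a value equal to a breakpoint belongs to the lower segment), which is an assumption about the boundary cases.

```python
import bisect


def scaling_factor(r, breakpoints, values):
    """Piecewise-constant pixel area scaling factor.

    breakpoints: increasing thresholds [b1, b2, ...].
    values:      one factor per interval, len(values) == len(breakpoints) + 1.
    r <= b1 selects values[0]; b1 < r <= b2 selects values[1]; and so on.
    """
    if len(values) != len(breakpoints) + 1:
        raise ValueError("need exactly one value per interval")
    # bisect_left counts the breakpoints that are strictly less than r.
    return values[bisect.bisect_left(breakpoints, r)]
```

Adding a segment only requires one more breakpoint and one more value, which matches the remark that the number of pieces is not limited to 2, 3, or 4.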
  • R(i,j) may be, for example, equal to d(i,j) or e(i,j); d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template T m
  • e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template T m , or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template T m .
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template T m
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template T m
  • the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template T m is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template T m except the pixel y.
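The "pixel y" variant of d(i,j), in which the distance is measured to the closest pixel of the template, can be sketched as below. Two things are assumptions of this illustration: the template is taken to be an L-shaped strip above and to the left of the block, and the distance is Euclidean. Neither the template shape nor the distance metric is fixed by this section. Block pixels use coordinates starting at (0, 0), so template pixels have at least one negative coordinate.

```python
import math


def l_shaped_template(block_size, thickness):
    """Coordinates of an assumed L-shaped template above and left of the block."""
    return [(i, j)
            for i in range(-thickness, block_size)
            for j in range(-thickness, block_size)
            if i < 0 or j < 0]


def nearest_template_distance(i, j, template_coords):
    """d(i,j): distance from block pixel (i, j) to its closest template pixel."""
    return min(math.hypot(i - ti, j - tj) for ti, tj in template_coords)
```

With this definition, pixels next to the template get a small d(i,j) and pixels in the lower right corner of the block get the largest one.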
  • the processor 602 is configured to calculate the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • w m (i,j) represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template T m in the N templates
  • p m (i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m
  • pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
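The prediction formula itself is missing from this text. A natural reading of the symbol definitions is a per-position weighted average over the N picture blocks, pre(i,j) = Σ w m (i,j)·p m (i,j) / Σ w m (i,j). The sketch below implements that normalized form as an assumption; the function name is hypothetical.

```python
def predict_block(weights, blocks):
    """Assumed normalized weighted average of N candidate picture blocks.

    weights[m][i][j]: weight of pixel area (i, j) in picture block m.
    blocks[m][i][j]:  pixel value of pixel area (i, j) in picture block m.
    """
    rows, cols = len(blocks[0]), len(blocks[0][0])
    pre = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = sum(w[i][j] for w in weights)
            acc = sum(w[i][j] * p[i][j] for w, p in zip(weights, blocks))
            pre[i][j] = acc / total
    return pre
```

Because the weights are indexed by (i,j), two positions in the same picture block can contribute with different strength, which is the property the embodiments emphasize.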
  • the picture prediction apparatus 600 may be any apparatus that needs to output or play a video, for example, a device such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.
  • the picture prediction apparatus 600 determines a weight of a pixel area in a picture block corresponding to each template in the N templates, and calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different.
  • FIG. 7 is a structural block diagram of a picture prediction apparatus 700 according to another embodiment of the present application.
  • the picture prediction apparatus 700 may include at least one processor 701 , at least one memory 705 , and at least one communications bus 702 .
  • the communications bus 702 is configured to implement a connection and communication between these components.
  • the picture prediction apparatus 700 may include at least one network interface 704 and/or at least one user interface 703 , and the user interface 703 may include a display (for example, a touchscreen, an LCD, a holographic imaging device, a CRT, a projector), a click device (for example, a mouse, a trackball, a touchpad, a touchscreen), a camera, and/or a pickup apparatus.
  • the memory 705 may include a read-only memory and a random access memory, and provide an instruction and data for the processor 701 .
  • a part of the memory 705 may further include a non-volatile random access memory.
  • the memory 705 stores the following elements: an executable module or a data structure, or a subset of an executable module and a data structure, or an extended set of an executable module and a data structure:
  • an operating system 7051 including various system programs, and configured to: implement various basic services and process hardware-based tasks;
  • an application program module 7052 including various application programs, and configured to implement various application services.
  • by invoking a program or an instruction stored in the memory 705 , the processor 701 is configured to: determine N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block; determine a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
  • N may be equal to 1, 2, 3, 4, 5, 7, 9, 12, or another value.
  • the pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
  • the templates in the N templates correspond to different picture blocks, and therefore, the N templates correspond to N picture blocks.
  • the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
  • a parameter about a similarity between two objects may be used to represent the similarity between the two objects.
  • a parameter about a similarity between a template T m in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template T m and the pixel area w in the picture block m.
  • the picture block m is a picture block corresponding to the template T m .
  • the template T m may be any template in the N templates.
  • the pixel area w is any pixel area in the picture block m.
  • the parameter about the similarity between the template T m and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template T m , or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template T m to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template T m and a pixel value of the pixel area w.
  • the determined pixel area in the foregoing template T m may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template T m .
  • the upper left pixel area in the template T m is the upper left pixel in the template T m or a pixel block that is in the template T m and that includes the upper left pixel in the template T m ;
  • the lower left pixel area in the template T m is the lower left pixel in the template T m or a pixel block that is in the template T m and that includes the lower left pixel in the template T m ;
  • the upper right pixel area in the template T m is the upper right pixel in the template T m or a pixel block that is in the template T m and that includes the upper right pixel in the template T m ;
  • the center pixel area in the template T m is the center pixel in the template T m or a pixel block that is in the template T m and that includes the center pixel in the template T m .
  • the distance between the pixel area w and the determined pixel area in the template T m may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template T m , where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template T m , or the distance between the pixel area w and the determined pixel area in the template T m may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template T m (where the determined pixel in the template T m is,
  • determining the N templates whose degrees of matching with the current template meet the preset condition includes: determining N templates with a highest degree of matching with the current template; or determining M templates with a highest degree of matching with the current template, and determining, from the M templates, N templates that meet the preset condition, where N is less than M.
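Finding the M templates with the highest degree of matching amounts to a template-matching search over a reference picture. The sketch below does an exhaustive search, uses the sum of absolute differences (SAD) as the distortion, and assumes an L-shaped template above and to the left of each candidate block. All three choices and all function names are assumptions for illustration; the text only requires that the templates be obtained by searching reference pictures.

```python
def template_pixels(picture, top, left, block_size, thickness):
    """Pixel values of the assumed L-shaped template of the block at (top, left)."""
    return [picture[i][j]
            for i in range(top - thickness, top + block_size)
            for j in range(left - thickness, left + block_size)
            if i < top or j < left]


def sad(a, b):
    """Sum of absolute differences, used here as the distortion measure."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_templates(reference, current_template, block_size, thickness, m):
    """Return the m best candidates as (distortion, top, left), best first."""
    h, w = len(reference), len(reference[0])
    candidates = []
    for top in range(thickness, h - block_size + 1):
        for left in range(thickness, w - block_size + 1):
            cand = template_pixels(reference, top, left, block_size, thickness)
            candidates.append((sad(current_template, cand), top, left))
    candidates.sort()
    return candidates[:m]
```

A practical encoder would restrict the search to a window and use a faster search strategy; the exhaustive loop is kept here only for clarity.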
  • that the processor 701 determines, from the M templates, the N templates that meet the preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
  • the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
  • the threshold may be another preset value. It can be understood that a screening threshold is introduced to screen out templates with a relatively low matching degree, so that they are not involved in the operation; this helps improve prediction accuracy and reduce operation complexity.
  • that the processor 701 determines the weight of the pixel area in the picture block corresponding to each template in the N templates may include: determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
  • the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
  • the processor 701 may determine the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
  • E (T x ,T m ) represents distortion between the current template T x and a template T m in the N templates
  • S represents a quantity of pixel areas in the current template T x
  • σ represents a template scaling factor
  • a and σ are real numbers greater than
  • w m represents a weight of the template T m
  • the template T m is any template in the N templates.
  • when a value of a is an integer, for example, 2, 3, or 4, operation complexity is reduced.
  • w m (i,j) represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template T m in the N templates
  • R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template T m
  • δ(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
  • the non-linear relationship between δ(R(i,j)) and R(i,j) is: a value of δ(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$ where
  • a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$ where
  • a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be:
  • $$\delta(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$ where
  • a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
  • b6 may be, for example, equal to 5.
  • the non-linear relationship between δ(R(i,j)) and R(i,j) may be a piecewise function relationship.
  • a piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
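The formula that combines the template-level quantities with the pixel area scaling factor is not reproduced in this text. One combination consistent with the listed symbols is to let the scaling factor stretch the exponent, w m (i,j) = a^(−E(T x, T m)/(S·σ·δ(R(i,j)))). The sketch below uses that form strictly as an assumption to show how weights can vary inside one picture block; the function names, default parameters, and example values are all hypothetical.

```python
import bisect


def pixel_area_weight(distortion, num_areas, r, breakpoints, factors,
                      sigma=1.0, a=2.0):
    """Assumed form: a ** (-E / (S * sigma * delta(R(i,j))))."""
    # Piecewise-constant scaling factor, intervals closed on the right.
    delta = factors[bisect.bisect_left(breakpoints, r)]
    return a ** (-distortion / (num_areas * sigma * delta))


def weight_map(distortion, num_areas, distances, breakpoints, factors):
    """Weight of every pixel area of one picture block, given its d(i,j) map."""
    return [[pixel_area_weight(distortion, num_areas, d, breakpoints, factors)
             for d in row]
            for row in distances]
```

Under this assumed form, a larger scaling factor far from the template pulls the weights of different templates closer together, so distant pixel areas are predicted by something nearer to a plain average.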
  • R(i,j) may be, for example, equal to d(i,j) or e(i,j), d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template T m , and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template T m , or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template T m .
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template T m
  • d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template T m
  • the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template T m is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template T m except the pixel y.
  • the processor 701 calculates the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
  • w m (i,j) represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template T m in the N templates
  • p m (i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m
  • pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
  • the picture prediction apparatus 700 may be any apparatus that needs to output or play a video, for example, a device such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.
  • the picture prediction apparatus 700 determines a weight of a pixel area in a picture block corresponding to each template in the N templates, and calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
  • the pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different.
  • the disclosed apparatus may be implemented in other manners.
  • the described apparatus embodiment is merely an example.
  • the unit division is merely logical function division and may be other division in actual implementation.
  • a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces.
  • the indirect couplings or communication connections between the apparatuses or units may be implemented in electronic or other forms.
  • the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions in the embodiments.
  • functional units in the embodiments of the present application may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
  • the integrated unit may be implemented in a form of hardware, or may be implemented in a form of a software functional unit.
  • when the foregoing integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer-readable storage medium.
  • the software product is stored in a storage medium and includes instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or some of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

A picture prediction method and a related apparatus are disclosed. The method includes: determining N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, and the N templates are obtained by searching reference pictures of the current picture block; determining a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of International Application No. PCT/CN2015/077272, filed on Apr. 23, 2015, which claims priority to Chinese Patent Application No. 201410606914.2, filed on Oct. 31, 2014. The disclosures of the aforementioned applications are hereby incorporated by reference in their entireties.
TECHNICAL FIELD
The present application relates to the field of picture processing technologies, and specifically, to a picture prediction method and a related apparatus.
BACKGROUND
With development of photoelectric acquisition technologies and continuous increase of requirements for high-definition digital videos, an amount of video data is increasingly large. Due to limited heterogeneous transmission bandwidths and diversified video applications, higher requirements are continuously imposed on video coding efficiency. A task of developing a High Efficiency Video Coding (HEVC) standard is initiated according to the requirements.
A basic principle of video compression coding is to use correlation between a space domain, a time domain, and a code word to remove redundancy as much as possible. Currently, a prevalent practice is to use a block-based hybrid video coding framework to implement video compression coding by performing steps of prediction (including intra-frame prediction and inter-frame prediction), transform, quantization, entropy coding, and the like. In an intra-frame prediction technology, redundancy information of a current picture block is removed by using spatial pixel information of the current picture block, to obtain a residual; in an inter-frame prediction technology, redundancy information of a current picture block is removed by using pixel information of a coded or decoded picture adjacent to a current picture block, to obtain a residual. This coding framework shows high viability, and therefore, HEVC still uses this block-based hybrid video coding framework.
In the prior art, a method for predicting a pixel value of a current picture block based on a non-local means filtering technology is provided. After all templates matching a current template are obtained by searching reference pictures, a predicted pixel value of the current picture block is obtained by averaging the pixel values of the picture blocks corresponding to all the templates. However, testing and practice show that the prediction accuracy of this existing technology is sometimes relatively low, which is likely to affect video coding and decoding quality.
SUMMARY
Embodiments of the present application provide a picture prediction method and a related apparatus, so as to improve picture prediction accuracy.
A first aspect of the present application provides a picture prediction method, including:
    • determining N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block;
    • determining a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and
    • calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the determining a weight of a pixel area in a picture block corresponding to each template in the N templates includes:
    • determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and
    • determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block includes: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
With reference to the first possible implementation manner of the first aspect, or the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, the determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template includes: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x,T_m)/S\right)\times\sigma},$$
where
E(Tx,Tm) represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates.
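The template weight can be illustrated with a minimal Python sketch. It reads the formula as w_m = a^(−(E(Tx,Tm)/S)×σ); that exponent reading, the function name `template_weight`, and the default values a = 2 and σ = 0.5 are assumptions made for illustration and are not taken from the application.

```python
def template_weight(distortion, num_pixel_areas, a=2.0, sigma=0.5):
    """Weight of one template, read as w_m = a ** (-(E / S) * sigma).

    distortion      -- E(Tx, Tm), distortion between the current template and Tm
    num_pixel_areas -- S, quantity of pixel areas in the current template
    a, sigma        -- positive real numbers (illustrative defaults, assumed)
    """
    if a <= 0 or sigma <= 0:
        raise ValueError("a and sigma must be greater than 0")
    if num_pixel_areas <= 0:
        raise ValueError("the template must contain at least one pixel area")
    return a ** (-(distortion / num_pixel_areas) * sigma)


if __name__ == "__main__":
    # With a > 1, a template with lower distortion receives a larger weight.
    for e in (0, 32, 64):
        print(e, template_weight(e, 16))
```

Under these assumed values, a template with zero distortion gets weight 1, and the weight decays as the per-pixel-area distortion grows.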
With reference to any one of the first to the third possible implementation manners of the first aspect, in a fourth possible implementation manner of the first aspect, the determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates includes: determining, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = w_m^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
With reference to the fourth possible implementation manner of the first aspect, in a fifth possible implementation manner of the first aspect, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
With reference to the fifth possible implementation manner of the first aspect, in a sixth possible implementation manner of the first aspect, the linear relationship between ∂(R(i,j)) and R(i,j) is: ∂(R(i,j))=β*R(i,j), where β is a scaling coefficient, and β is a real number greater than 0.
With reference to the fifth possible implementation manner of the first aspect, in a seventh possible implementation manner of the first aspect, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
With reference to the seventh possible implementation manner of the first aspect, in an eighth possible implementation manner of the first aspect, the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases},$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0; or
the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases},$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0; or
the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases},$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
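The linear and interval-based relationships above can be sketched in Python as follows. The function names and all numeric thresholds and factors are hypothetical; the application only requires that the thresholds and factors are positive and strictly increasing.

```python
def linear_scaling(r, beta=1.0):
    """Linear relationship: scaling factor = beta * R(i,j), with beta > 0."""
    if beta <= 0:
        raise ValueError("beta must be greater than 0")
    return beta * r


def stepped_scaling(d, thresholds=(4.0, 8.0), factors=(0.5, 1.0, 2.0)):
    """Interval-based relationship: pick the factor of the interval d falls in.

    thresholds -- ascending interval bounds (b1, b2, ...), assumed values
    factors    -- ascending factors (a1, a2, ...), one more than thresholds
    """
    if len(factors) != len(thresholds) + 1:
        raise ValueError("need exactly one more factor than thresholds")
    for bound, factor in zip(thresholds, factors):
        if d <= bound:          # d <= b_k selects a_k
            return factor
    return factors[-1]          # d is greater than the last bound


if __name__ == "__main__":
    for d in (1.0, 4.0, 6.0, 8.0, 12.0):
        print(d, stepped_scaling(d))
```

Passing one threshold and two factors gives the two-interval variant, and three thresholds with four factors gives the four-interval variant.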
With reference to any one of the fourth to the eighth possible implementation manners of the first aspect, in a ninth possible implementation manner of the first aspect, R(i,j) is equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm, or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
With reference to the ninth possible implementation manner of the first aspect, in a tenth possible implementation manner of the first aspect, d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
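The two choices of d(i,j) can be compared with a short Python sketch. It assumes an L-shaped template lying above and to the left of the picture block, Euclidean distance, and (i,j) interpreted as (row, column) with the block's upper left pixel at (0,0); none of these details is fixed by the text above, and all function names are hypothetical.

```python
import math


def l_shaped_template(block_size, thickness):
    """Pixel coordinates (row, col) of an assumed L-shaped template that
    borders a block_size x block_size block on its top and left sides."""
    return [(r, c)
            for r in range(-thickness, block_size)
            for c in range(-thickness, block_size)
            if r < 0 or c < 0]


def distance_to_upper_left(i, j, thickness):
    """d(i,j) measured to the upper left pixel of the template."""
    return math.hypot(i + thickness, j + thickness)


def distance_to_nearest(i, j, template):
    """d(i,j) measured to the template pixel y closest to (i,j)."""
    return min(math.hypot(i - r, j - c) for r, c in template)


if __name__ == "__main__":
    tmpl = l_shaped_template(block_size=4, thickness=2)
    for i, j in ((0, 0), (1, 3), (3, 3)):
        print((i, j), distance_to_upper_left(i, j, 2), distance_to_nearest(i, j, tmpl))
```

The nearest-pixel variant grows only with the distance to the closer template edge, whereas the upper-left variant grows toward the lower right corner of the block.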
With reference to any one of the first aspect, or the first to the tenth possible implementation manners of the first aspect, in an eleventh possible implementation manner of the first aspect, the calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N} w_m(i,j)\,p_m(i,j)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, pm(i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
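The weighted prediction, that is, the sum over the N templates of weight times pixel value divided by the sum of the weights, can be sketched in plain Python. The nested-list layout `values[m][i][j]` and the function name `predict_block` are assumptions chosen for readability.

```python
def predict_block(values, weights):
    """Per-pixel-area weighted average over N picture blocks.

    values[m][i][j]  -- pixel value p_m(i,j) of picture block m
    weights[m][i][j] -- weight w_m(i,j) of the same pixel area
    Returns pre[i][j], the predicted value for the current picture block.
    """
    n = len(values)
    rows, cols = len(values[0]), len(values[0][0])
    pre = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            numerator = sum(weights[m][i][j] * values[m][i][j] for m in range(n))
            denominator = sum(weights[m][i][j] for m in range(n))
            pre[i][j] = numerator / denominator
    return pre


if __name__ == "__main__":
    values = [[[100, 100]], [[200, 200]]]      # two 1x2 picture blocks
    weights = [[[1.0, 3.0]], [[1.0, 1.0]]]     # weights differ inside block 0
    print(predict_block(values, weights))
```

Because the weights are normalized per pixel area, only their ratios across the N picture blocks matter at each position.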
With reference to any one of the first aspect, or the first to the eleventh possible implementation manners of the first aspect, in a twelfth possible implementation manner of the first aspect, the determining N templates whose degrees of matching with a current template meet a preset condition includes:
determining M templates with a highest degree of matching with the current template; and determining, from the M templates, N templates that meet the preset condition, where N is less than M.
With reference to the twelfth possible implementation manner of the first aspect, in a thirteenth possible implementation manner of the first aspect, the determining, from the M templates, N templates that meet the preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
With reference to the thirteenth possible implementation manner of the first aspect, in a fourteenth possible implementation manner of the first aspect, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
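The two-stage selection (first the M best-matching templates, then the N of them whose distortion does not exceed a threshold derived from the average distortion) can be sketched as follows. Treating the "adjustment value" as a multiplicative factor is an assumption, as are the function and parameter names.

```python
def select_templates(distortions, m, adjustment=1.0):
    """Return indices of the N templates kept out of the M best candidates.

    distortions -- distortion of each candidate template against the
                   current template (lower means a better match)
    m           -- number of best-matching templates to consider
    adjustment  -- assumed multiplicative adjustment of the average distortion
    """
    if m <= 0 or not distortions:
        raise ValueError("need at least one candidate and m > 0")
    ranked = sorted(range(len(distortions)), key=lambda k: distortions[k])
    best_m = ranked[:m]
    threshold = adjustment * sum(distortions[k] for k in best_m) / len(best_m)
    return [k for k in best_m if distortions[k] <= threshold]


if __name__ == "__main__":
    print(select_templates([10, 50, 20, 90, 30], m=4))
```

Note that if all M distortions are equal, this sketch keeps all M templates, so a caller that needs N strictly less than M would have to handle that case separately.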
With reference to any one of the first aspect, or the first to the fourteenth possible implementation manners of the first aspect, in a fifteenth possible implementation manner of the first aspect, the picture prediction method is applied in a video coding process, or the picture prediction method is applied in a video decoding process.
A second aspect of the present application provides a picture prediction apparatus, including:
a first determining unit, configured to determine N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block;
a second determining unit, configured to determine a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and
a predicting unit, configured to calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the second determining unit includes:
a first determining subunit, configured to determine a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and
a second determining subunit, configured to determine, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block includes: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
With reference to the first possible implementation manner of the second aspect, or the second possible implementation manner of the second aspect, in a third possible implementation manner of the second aspect, the first determining subunit is specifically configured to determine the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x,T_m)/S\right)\times\sigma},$$
where
E(Tx,Tm) represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates.
With reference to any one of the first to the third possible implementation manners of the second aspect, in a fourth possible implementation manner of the second aspect, the second determining subunit is specifically configured to determine, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = w_m^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
With reference to the fourth possible implementation manner of the second aspect, in a fifth possible implementation manner of the second aspect, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
With reference to the fifth possible implementation manner of the second aspect, in a sixth possible implementation manner of the second aspect, the linear relationship between ∂(R(i,j)) and R(i,j) is: ∂(R(i,j))=β*R(i,j), where β is a scaling coefficient, and β is a real number greater than 0.
With reference to the fifth possible implementation manner of the second aspect, in a seventh possible implementation manner of the second aspect, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
With reference to the seventh possible implementation manner of the second aspect, in an eighth possible implementation manner of the second aspect, the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases},$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0; or
the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases},$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0; or
the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases},$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
With reference to any one of the fourth to the eighth possible implementation manners of the second aspect, in a ninth possible implementation manner of the second aspect, R(i,j) is equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm, or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
With reference to the ninth possible implementation manner of the second aspect, in a tenth possible implementation manner of the second aspect, d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
With reference to any one of the second aspect, or the first to the tenth possible implementation manners of the second aspect, in an eleventh possible implementation manner of the second aspect, the predicting unit is specifically configured to calculate the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N} w_m(i,j)\,p_m(i,j)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, pm(i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
With reference to any one of the second aspect, or the first to the eleventh possible implementation manners of the second aspect, in a twelfth possible implementation manner of the second aspect, the first determining unit is specifically configured to: determine M templates with a highest degree of matching with the current template; and determine, from the M templates, N templates that meet the preset condition, where N is less than M.
With reference to the twelfth possible implementation manner of the second aspect, in a thirteenth possible implementation manner of the second aspect, the first determining unit is specifically configured to: determine the M templates with the highest degree of matching with the current template; and determine N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
With reference to the thirteenth possible implementation manner of the second aspect, in a fourteenth possible implementation manner of the second aspect, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
With reference to any one of the second aspect, or the first to the fourteenth possible implementation manners of the second aspect, in a fifteenth possible implementation manner of the second aspect, the picture prediction apparatus is applied to a video coding apparatus, or the picture prediction apparatus is applied to a video decoding apparatus.
It can be learned that, in the solutions in the embodiments of the present application, after N templates whose degrees of matching with a current template meet a preset condition are determined, a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates. The pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
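To make the contrast with the conventional solution explicit, the following worked equation sets every weight to 1 in the weighted prediction formula (the sum of weight times pixel value over the N templates, divided by the sum of the weights). Under that substitution the prediction reduces to a plain average of the N picture blocks:

```latex
\mathrm{pre}(i,j)
  = \frac{\sum_{m=1}^{N} 1 \cdot p_m(i,j)}{\sum_{m=1}^{N} 1}
  = \frac{1}{N} \sum_{m=1}^{N} p_m(i,j)
```

The solutions described here differ from this special case in that the weights may vary across pixel areas within a picture block.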
BRIEF DESCRIPTION OF DRAWINGS
To describe the technical solutions in the embodiments of the present application more clearly, the following briefly describes the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show merely some embodiments of the present application, and persons of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
FIG. 1-a is a schematic diagram of a prediction unit division manner corresponding to intra-frame prediction according to an embodiment of the present application;
FIG. 1-b is a schematic diagram of several prediction unit division manners corresponding to inter-frame prediction according to an embodiment of the present application;
FIG. 1-c is a schematic flowchart of a picture prediction method according to an embodiment of the present application;
FIG. 1-d is a schematic diagram of a location relationship between a picture block and a template according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of another picture prediction method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of another picture prediction method according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of another picture prediction method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a picture prediction apparatus according to an embodiment of the present application;
FIG. 6 is a schematic diagram of another picture prediction apparatus according to an embodiment of the present application; and
FIG. 7 is a schematic diagram of another picture prediction apparatus according to an embodiment of the present application.
DESCRIPTION OF EMBODIMENTS
Embodiments of the present application provide a picture prediction method and a related apparatus, so as to improve picture prediction accuracy.
To make persons skilled in the art understand the technical solutions in the present application better, the following clearly describes the technical solutions in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application. Apparently, the described embodiments are merely some rather than all of the embodiments of the present application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present application without creative efforts shall fall within the protection scope of the present application.
Details are separately described in the following.
In the specification, claims, and accompanying drawings of the present application, terms “first”, “second”, “third”, “fourth”, and the like are intended to distinguish between different objects but do not indicate a particular order. In addition, terms “include”, “have”, and any other variant thereof are intended to cover a non-exclusive inclusion. For example, a process, a method, a system, a product, or a device that includes a series of steps or units is not limited to the listed steps or units, but optionally further includes an unlisted step or unit, or optionally further includes another inherent step or unit of the process, the method, the product, or the device.
The following first describes some concepts that may be involved in the embodiments of the present application.
In most coding frameworks, a video sequence includes a series of pictures, the pictures are further divided into slices, and the slices are further divided into blocks. Video coding is to perform coding processing from left to right and from top to bottom row by row starting from an upper left corner position of a picture by using a block as a unit. In some new video coding standards, a concept of a block is further extended. There is a macroblock (MB) in the H.264 standard, and the MB may be further divided into multiple prediction blocks that may be used for predictive coding. In the HEVC standard, basic concepts such as a coding unit (CU), a prediction unit (PU), and a transform unit (TU) are used, and multiple units are classified according to functions, and a completely new tree-based structure is used for description. For example, the CU may be divided into smaller CUs according to a quadtree, and the smaller CU may be further divided to form a quadtree structure. The PU and the TU also have similar tree structures. Regardless of whether a unit is a CU, a PU, or a TU, the unit belongs to the concept of a block in essence. The CU is similar to a macroblock MB or a coding block, and is a basic unit for partitioning and coding a picture. The PU may correspond to a prediction block, and is a basic unit for predictive coding. The CU is further divided into multiple PUs according to a division mode. The TU may correspond to a transform block, and is a basic unit for transforming a prediction residual.
In the HEVC standard, a size of the coding unit may include four levels: 64×64, 32×32, 16×16, and 8×8. Coding units at each level may be divided into prediction units of different sizes according to intra-frame prediction and inter-frame prediction. For example, as shown in FIG. 1-a and FIG. 1-b, FIG. 1-a shows a prediction unit division manner corresponding to intra-frame prediction. FIG. 1-b shows several prediction unit division manners corresponding to inter-frame prediction.
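The quadtree division of a coding unit across the four size levels can be sketched in Python. The split decision is supplied by a caller-provided predicate `should_split`, which is a hypothetical stand-in for whatever criterion an encoder actually uses; the function name and return format are likewise assumptions.

```python
def split_cu(x, y, size, should_split, min_size=8):
    """Recursively divide a coding unit into four equal sub-units.

    Returns a list of leaf units as (x, y, size) tuples. A unit is split
    only while it is larger than min_size and should_split says so.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):            # top row of sub-units first
            for dx in (0, half):        # left sub-unit before right
                leaves.extend(split_cu(x + dx, y + dy, half, should_split, min_size))
        return leaves
    return [(x, y, size)]


if __name__ == "__main__":
    # Split only the 64x64 unit once, giving four 32x32 units.
    print(split_cu(0, 0, 64, lambda x, y, s: s == 64))
```

Splitting at every level from 64×64 down to 8×8 yields 64 leaf units, which matches the four size levels listed above.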
In the development and evolution of video coding technology, video coding experts have devised various methods that use temporal and spatial correlation between adjacent coding/decoding blocks to improve coding efficiency. In the H.264/Advanced Video Coding (AVC) standard, a skip mode and a direct mode are effective tools for improving coding efficiency. When a bit rate is low, blocks coded in these two modes can occupy more than half of an entire coding sequence. When the skip mode is used, only a skip mode flag needs to be added to a bitstream: a motion vector of a current picture block can be derived by using nearby motion vectors, and a value of a reference block is directly copied according to the motion vector as a reconstructed value of the current picture block. In addition, when the direct mode is used, an encoder may derive the motion vector of the current picture block by using the adjacent motion vectors, directly copy the value of the reference block according to the motion vector as a predicted value of the current picture block, and perform predictive coding on the current picture block by using the predicted value. In the evolved HEVC standard, some new coding tools are introduced to further improve video coding efficiency. A merge coding mode and an advanced motion vector prediction (AMVP) mode are two important inter-frame prediction tools.
During merge coding, a candidate motion information set is constructed by using motion information (including a prediction direction, a motion vector, and a reference picture index) of an adjacent coded block of a current coding block, candidate motion information that enables coding efficiency to be the highest may be selected as motion information of the current coding block by means of comparison, a predicted value of the current coding block is found in reference pictures, predictive coding is performed on the current coding block, and an index value that indicates an adjacent coded block whose motion information is selected is written into a bitstream. When the advanced motion vector prediction mode is used, a motion vector of an adjacent coded block is used as a motion vector predicted value of a current coding block, a motion vector that enables coding efficiency to be the highest may be selected to predict a motion vector of the current coding block, and an index value that indicates an adjacent coded block whose motion vector is selected may be written into a video bitstream.
The technical solutions in the embodiments of the present application are further discussed in the following.
A picture prediction method provided in the embodiments of the present application is first described in the following. The picture prediction method provided in the embodiments of the present application is executed by a video coding apparatus or a video decoding apparatus. The video coding apparatus or the video decoding apparatus may be any apparatus that needs to output or store a video, for example, a device such as a laptop computer, a tablet computer, a personal computer, a mobile phone, or a video server.
In an embodiment of a picture prediction method of the present application, the picture prediction method includes: determining N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block; determining a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
Referring to FIG. 1-c, FIG. 1-c is a schematic flowchart of a picture prediction method according to an embodiment of the present application. As shown in FIG. 1-c, the picture prediction method provided in this embodiment of the present application may include the following steps.
101. Determine N templates whose degrees of matching with a current template meet a preset condition.
The current template is a template corresponding to a current picture block. The N templates are obtained by searching reference pictures of the current picture block. N is a positive integer.
For example, N may be equal to 1, 2, 3, 4, 5, 7, 9, 12, or another value.
102. Determine a weight of a pixel area in a picture block corresponding to each template in the N templates.
Weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different.
The pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
The templates in the N templates correspond to different picture blocks, and therefore, the N templates correspond to N picture blocks. A template may be L-shaped or in another shape. For example, as shown in FIG. 1-d, a template corresponding to a picture block may be an L-shaped pixel area adjacent to the picture block. For example, in FIG. 1-d, an L-shaped pixel area Tx adjacent to a picture block x is a template Tx corresponding to the picture block x, an L-shaped pixel area Tm1 adjacent to a picture block m1 is a template Tm1 corresponding to the picture block m1, and an L-shaped pixel area Tm2 adjacent to a picture block m2 is a template Tm2 corresponding to the picture block m2. There may be no overlap area between a picture block and a template corresponding to the picture block.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block. A parameter about a similarity between two objects may be used to represent the similarity between the two objects. For example, a parameter about a similarity between a template Tm in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template Tm and the pixel area w in the picture block m. The picture block m is a picture block corresponding to the template Tm. The template Tm may be any template in the N templates. The pixel area w is any pixel area in the picture block m.
In a specific example, the parameter about the similarity between the template Tm and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template Tm, or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template Tm to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template Tm and a pixel value of the pixel area w. That is, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm and the distance between the pixel area w and the determined pixel area in the template Tm. Alternatively, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm in the N templates and the ratio of the average or the weighted average of the pixel values of the determined pixel area in the template Tm to the pixel value of the pixel area w or the absolute difference between the average or the weighted average of the pixel values of the determined pixel area in the template Tm and the pixel value of the pixel area w.
A template corresponding to a picture block is some of adjacent reconstructed pixels of the picture block, and has a correspondence with the picture block. The template corresponding to the picture block may be used to represent the picture block for searching and matching. A picture block corresponding to a matching template may be found for an operation such as prediction by using a correspondence between the template and the picture block.
Optionally, in some possible implementation manners of the present application, the determined pixel area in the foregoing template Tm may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template Tm.
The upper left pixel area in the template Tm is the upper left pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper left pixel in the template Tm; the lower left pixel area in the template Tm is the lower left pixel in the template Tm or a pixel block that is in the template Tm and that includes the lower left pixel in the template Tm; the upper right pixel area in the template Tm is the upper right pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper right pixel in the template Tm; the center pixel area in the template Tm is the center pixel in the template Tm or a pixel block that is in the template Tm and that includes the center pixel in the template Tm.
The distance between the pixel area w and the determined pixel area in the template Tm may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template Tm, where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template Tm, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template Tm (where the determined pixel in the template Tm is, for example, the upper left pixel, the lower left pixel, the upper right pixel, the lower right pixel, or the center pixel in the template Tm). Certainly, the distance between the pixel area w and the determined pixel area in the template Tm may be calculated in another manner.
103. Calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
It can be learned that, in the picture prediction solution in this embodiment, after N templates whose degrees of matching with a current template meet a preset condition are determined, a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates. The pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
Optionally, in some possible implementation manners of the present application, the determining N templates whose degrees of matching with a current template meet a preset condition includes: determining N templates with a highest degree of matching with the current template; or determining M templates with a highest degree of matching with the current template, and determining, from the M templates, N templates that meet the preset condition, where N is less than M.
For example, the determining, from the M templates, N templates that meet the preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold. For example, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates. Certainly, the threshold may be another preset value. It can be understood that the screening threshold is introduced to screen out templates with a relatively low matching degree so that they are not involved in the operation, which improves prediction accuracy and reduces operation complexity.
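The screening step can be illustrated with a minimal Python sketch. It assumes that the threshold is the plain average distortion over the M templates (one of the options named above) and that the distortions have already been computed. The function name `screen_templates` and the sample distortion values are hypothetical and are used only for illustration.

```python
def screen_templates(distortions):
    """Return indices of the templates whose distortion relative to the
    current template is less than or equal to the average distortion.

    `distortions` holds one value per candidate among the M templates.
    The average-distortion threshold is one option named in the text;
    any other preset threshold could be substituted here.
    """
    if not distortions:
        raise ValueError("at least one template distortion is required")
    threshold = sum(distortions) / len(distortions)
    return [k for k, e in enumerate(distortions) if e <= threshold]


# Hypothetical distortions for M = 5 templates; the average is 161.0.
selected = screen_templates([120.0, 80.0, 200.0, 95.0, 310.0])
print(selected)  # [0, 1, 3], so N = 3
```

If all M distortions are equal, this sketch keeps every template, so an implementation that requires N to be less than M would need an additional tie rule.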
There may be diverse manners of determining the weight of the pixel area in the picture block corresponding to each template in the N templates.
For example, the determining a weight of a pixel area in a picture block corresponding to each template in the N templates may include: determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
Optionally, in some possible implementation manners of the present application, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
Optionally, in some possible implementation manners of the present application, the determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template may include: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x,T_m)/S\right)\times\sigma},$$
where
$E(T_x,T_m)$ represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates. When a value of a is, for example, an integer such as 2, 3, or 4, operation complexity is simplified.
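As a hedged illustration, the template weight can be computed as follows. The sketch reads the formula above as an exponential with base a and exponent −(E(Tx,Tm)/S)×σ. The function name `template_weight` and the default values a = 2 and σ = 0.5 are assumptions chosen for the example and are not values given in this application.

```python
def template_weight(distortion, num_pixel_areas, a=2.0, sigma=0.5):
    """Compute w_m = a ** (-(E(Tx, Tm) / S) * sigma).

    distortion      -- E(Tx, Tm), distortion between current template and Tm
    num_pixel_areas -- S, quantity of pixel areas in the current template
    a, sigma        -- real numbers greater than 0 (defaults are illustrative)
    """
    if a <= 0 or sigma <= 0:
        raise ValueError("a and sigma must be greater than 0")
    if num_pixel_areas <= 0:
        raise ValueError("S must be a positive quantity")
    return a ** (-(distortion / num_pixel_areas) * sigma)


print(template_weight(0, 16))   # 1.0: a perfectly matching template
print(template_weight(32, 16))  # 0.5: 2 ** (-(32 / 16) * 0.5)
```

With a greater than 1, a larger per-area distortion gives a smaller weight, so better-matching templates contribute more.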
Optionally, in some possible implementation manners of the present application, the determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates may include: determining, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = w_m^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
Optionally, in some possible implementation manners of the present application, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
For example, the linear relationship between ∂(R(i,j)) and R(i,j) is:
∂(R(i,j))=β*R(i,j),
where β is a scaling coefficient, and β is a real number greater than 0.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases}$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
For example, a1=0.3, a2=0.5, and a3=1. For example, b1=4, and b2=8.
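The piecewise relationship can be sketched in Python as a lookup over interval boundaries. The sketch assumes that each upper boundary is inclusive (d(i,j) ≤ b1 and b1 < d(i,j) ≤ b2) and uses the example values a1 = 0.3, a2 = 0.5, a3 = 1, b1 = 4, and b2 = 8 as defaults. The function name `scaling_factor` is hypothetical.

```python
from bisect import bisect_left


def scaling_factor(d, breakpoints=(4, 8), values=(0.3, 0.5, 1.0)):
    """Return the pixel area scaling factor for a distance d.

    breakpoints -- increasing interval boundaries (b1, b2, ...)
    values      -- one factor per interval (a1, a2, a3, ...), so
                   len(values) must equal len(breakpoints) + 1
    Upper boundaries are treated as inclusive (an assumption).
    """
    if len(values) != len(breakpoints) + 1:
        raise ValueError("need exactly one more value than breakpoints")
    # bisect_left gives the first interval whose upper boundary is >= d.
    return values[bisect_left(breakpoints, d)]


print(scaling_factor(4))  # 0.3
print(scaling_factor(8))  # 0.5
print(scaling_factor(9))  # 1.0
```

The same function covers piecewise functions with other numbers of intervals by passing other boundary and value tuples, for example `scaling_factor(6, (4, 5, 6), (0.3, 0.5, 0.8, 1.0))`.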
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases}$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
For example, a4=0.3, a5=0.5, a6=0.8, and a7=1. For example, b3=4, b4=5, and b5=6.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases}$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
For example, a8=0.5, and a9=1. For example, b6=5.
The foregoing examples show that the non-linear relationship between ∂(R(i,j)) and R(i,j) may be a piecewise function relationship. A piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
Optionally, in some possible implementation manners of the present application, R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
For example, d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
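The second option, in which d(i,j) is the distance to the closest template pixel y, can be sketched as follows. The sketch assumes an L-shaped template of a given thickness located above and to the left of a square picture block, (row, column) coordinates whose origin is the upper left pixel of the block, and Euclidean distance. None of these choices is fixed by the text, and the function names are hypothetical.

```python
import math


def l_shaped_template(block_size, thickness):
    """List (row, col) positions of an L-shaped template adjacent to a
    block_size x block_size block whose upper left pixel is at (0, 0).
    Template pixels have a negative row or a negative column."""
    return [
        (r, c)
        for r in range(-thickness, block_size)
        for c in range(-thickness, block_size)
        if r < 0 or c < 0
    ]


def nearest_template_distance(i, j, template):
    """d(i, j): Euclidean distance from pixel (i, j) in the block to the
    closest pixel of the template (the metric is an assumption)."""
    return min(math.hypot(i - r, j - c) for r, c in template)


template = l_shaped_template(block_size=4, thickness=2)
print(nearest_template_distance(0, 0, template))  # 1.0
print(nearest_template_distance(3, 3, template))  # 4.0
```

Pixels near the upper left corner of the block are closest to the template, and the distance grows toward the lower right corner.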
Optionally, in some possible implementation manners of the present application, the calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$pre(i,j) = \frac{\sum_{m=1}^{N} w_m(i,j)\, p_m(i,j)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, $p_m(i,j)$ represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
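The prediction formula is a weighted average over the N picture blocks, normalized by the sum of the weights at each position. A minimal Python sketch follows. It assumes that the per-position weights have already been determined and that each picture block is given as a list of rows. The function names and sample values are hypothetical.

```python
def predict_pixel(weights, pixel_values):
    """pre(i, j) for one position: weighted average of the N pixel
    values p_m(i, j) using the N weights w_m(i, j)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * p for w, p in zip(weights, pixel_values)) / total


def predict_block(weight_maps, blocks):
    """Apply predict_pixel at every position of the current block.

    weight_maps -- N weight maps, each with the same shape as a block
    blocks      -- N picture blocks corresponding to the N templates
    """
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [
        [
            predict_pixel(
                [w[i][j] for w in weight_maps],
                [b[i][j] for b in blocks],
            )
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Two 1x2 picture blocks (N = 2) with position-dependent weights.
weight_maps = [[[1.0, 3.0]], [[1.0, 1.0]]]
blocks = [[[10, 20]], [[30, 40]]]
print(predict_block(weight_maps, blocks))  # [[20.0, 25.0]]
```

When all weights at a position are equal, the result reduces to the plain average of the N pixel values, which corresponds to the conventional solution in which all weights are equal to 1.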
The picture prediction method provided in this embodiment may be applied in a video coding process, or may be applied in a video decoding process.
To better understand and implement the foregoing solutions in this embodiment of the present application, further descriptions are provided in the following with reference to a more specific application scenario.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of another picture prediction method according to another embodiment of the present application. As shown in FIG. 2, the picture prediction method provided in this other embodiment of the present application may include the following steps.
201. Determine M templates with a highest degree of matching with a current template.
M may be a specified value, that is, the M templates with the highest degree of matching with the current template may be selected from several candidate templates. For example, five templates with a highest degree of matching with the current template may be determined from ten candidate templates. Certainly, degrees of matching between the five templates and the current template are not necessarily equal, but a degree of matching between any template in the five templates and the current template is greater than or equal to a degree of matching between any candidate template except the five templates and the current template.
The current template is a template corresponding to a current picture block. The M templates are obtained by searching reference pictures of the current picture block. M is a positive integer.
For example, M may be equal to 2, 3, 4, 5, 7, 9, 12, or another value.
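Step 201 can be sketched as ranking candidate templates by their distortion relative to the current template and keeping the M best candidates. The sketch assumes that the degree of matching is measured by the sum of absolute differences (SAD) between template pixel values, where a lower value means a better match, and that each template is given as a flat list of pixel values. The text does not fix the measure, and the function names are hypothetical.

```python
def sad(template_a, template_b):
    """Sum of absolute differences between two templates given as
    equal-length flat lists of pixel values (an assumed measure)."""
    return sum(abs(x - y) for x, y in zip(template_a, template_b))


def best_matches(current, candidates, m):
    """Return the indices of the m candidate templates with the lowest
    distortion relative to the current template, best match first."""
    ranked = sorted(
        range(len(candidates)),
        key=lambda k: sad(current, candidates[k]),
    )
    return ranked[:m]


current = [10, 20, 30]
candidates = [[10, 20, 30], [0, 0, 0], [11, 19, 30], [50, 50, 50]]
print(best_matches(current, candidates, m=2))  # [0, 2]
```

Because Python's sort is stable, candidates with equal distortion keep their original search order.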
202. Determine, from the M templates, N templates that meet a preset condition.
For example, the determining, from the M templates, N templates that meet a preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold. For example, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates. Certainly, the threshold may be another preset value. It can be understood that the screening threshold is introduced to screen out templates with a relatively low matching degree so that they are not involved in the operation, which improves prediction accuracy and reduces operation complexity.
203. Determine a weight of each template in the N templates.
For example, the weight of each template in the N templates may be determined according to a degree of matching between each template in the N templates and the current template.
204. Determine, based on the weight of each template in the N templates, a weight of a pixel area in a picture block corresponding to each template in the N templates.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be determined based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
Weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different.
The pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
The templates in the N templates correspond to different picture blocks, and therefore, the N templates correspond to N picture blocks. A template may be L-shaped or in another shape. For example, as shown in FIG. 1-d, a template corresponding to a picture block may be an L-shaped pixel area adjacent to the picture block. For example, in FIG. 1-d, an L-shaped pixel area Tx adjacent to a picture block x is a template Tx corresponding to the picture block x, an L-shaped pixel area Tm1 adjacent to a picture block m1 is a template Tm1 corresponding to the picture block m1, and an L-shaped pixel area Tm2 adjacent to a picture block m2 is a template Tm2 corresponding to the picture block m2. There may be no overlap area between a picture block and a template corresponding to the picture block.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block. A parameter about a similarity between two objects may be used to represent the similarity between the two objects. For example, a parameter about a similarity between a template Tm in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template Tm and the pixel area w in the picture block m. The picture block m is a picture block corresponding to the template Tm. The template Tm may be any template in the N templates. The pixel area w is any pixel area in the picture block m.
In a specific example, the parameter about the similarity between the template Tm and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template Tm, or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template Tm to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template Tm and a pixel value of the pixel area w. That is, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm and the distance between the pixel area w and the determined pixel area in the template Tm. Alternatively, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm in the N templates and the ratio of the average or the weighted average of the pixel values of the determined pixel area in the template Tm to the pixel value of the pixel area w or the absolute difference between the average or the weighted average of the pixel values of the determined pixel area in the template Tm and the pixel value of the pixel area w.
Optionally, in some possible implementation manners of the present application, the determined pixel area in the foregoing template Tm may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template Tm.
The upper left pixel area in the template Tm is the upper left pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper left pixel in the template Tm; the lower left pixel area in the template Tm is the lower left pixel in the template Tm or a pixel block that is in the template Tm and that includes the lower left pixel in the template Tm; the upper right pixel area in the template Tm is the upper right pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper right pixel in the template Tm; the center pixel area in the template Tm is the center pixel in the template Tm or a pixel block that is in the template Tm and that includes the center pixel in the template Tm.
The distance between the pixel area w and the determined pixel area in the template Tm may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template Tm, where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template Tm, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template Tm (where the determined pixel in the template Tm is, for example, the upper left pixel, the lower left pixel, the upper right pixel, the lower right pixel, or the center pixel in the template Tm). Certainly, the distance between the pixel area w and the determined pixel area in the template Tm may be calculated in another manner.
205. Calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
Optionally, in some possible implementation manners of the present application, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
Optionally, in some possible implementation manners of the present application, that the weight of each template in the N templates is determined according to the degree of matching between each template in the N templates and the current template may include: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x,T_m)/S\right)\times\sigma},$$
where
$E(T_x,T_m)$ represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates. When a value of a is, for example, an integer such as 2, 3, or 4, operation complexity is simplified.
Optionally, in some possible implementation manners of the present application, that the weight of the pixel area in the picture block corresponding to each template in the N templates is determined based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: determining, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = w_m^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
Optionally, in some possible implementation manners of the present application, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
For example, the linear relationship between ∂(R(i,j)) and R(i,j) is:
∂(R(i,j))=β*R(i,j),
where β is a scaling coefficient, and β is a real number greater than 0.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases},$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases},$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases},$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
The foregoing examples show that the non-linear relationship between ∂(R(i,j)) and R(i,j) may be a piecewise function relationship. A piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
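The linear and piecewise relationships above can be sketched in Python as follows. This is only an illustration: the function names, the list-based parameter layout, and the numeric values in the usage lines are hypothetical, while the interval convention (closed on the right) follows the piecewise examples.

```python
from bisect import bisect_left


def linear_scaling_factor(r, beta):
    """Linear relationship: scaling factor = beta * r, with beta > 0."""
    if beta <= 0:
        raise ValueError("beta must be greater than 0")
    return beta * r


def piecewise_scaling_factor(r, bounds, factors):
    """Piecewise-constant scaling factor for a similarity parameter r.

    bounds  -- increasing interval boundaries, for example [b1, b2]
    factors -- increasing factors, one more than bounds, e.g. [a1, a2, a3]

    r <= b1 maps to a1, b1 < r <= b2 maps to a2, and r > b2 maps to a3.
    """
    if len(factors) != len(bounds) + 1:
        raise ValueError("need exactly one more factor than boundaries")
    # bisect_left counts the boundaries that are strictly less than r.
    return factors[bisect_left(bounds, r)]


if __name__ == "__main__":
    bounds, factors = [2, 4], [1.0, 1.5, 2.0]   # hypothetical b and a values
    for r in (1, 2, 3, 4, 5):
        print(r, piecewise_scaling_factor(r, bounds, factors))
```

Because the boundaries and factors are passed as lists, the same function covers two, three, four, or more intervals without modification.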
Optionally, in some possible implementation manners of the present application, R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm, or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
For example, d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
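The second option above, in which d(i,j) is the distance to the closest template pixel y, can be sketched as follows. The sketch rests on assumptions that the source does not fix: the template is L-shaped with a configurable thickness, the upper left pixel of the picture block is at (0, 0), i is read as the horizontal coordinate and j as the vertical coordinate, and the distance is Euclidean. The function names are hypothetical.

```python
import math


def l_shaped_template(block_w, block_h, thickness=1):
    """Coordinates (x, y) of an L-shaped template above and to the left of
    a block whose upper left pixel is at (0, 0) (assumed layout)."""
    coords = set()
    for y in range(-thickness, 0):            # rows above, with the corner
        for x in range(-thickness, block_w):
            coords.add((x, y))
    for x in range(-thickness, 0):            # columns to the left
        for y in range(0, block_h):
            coords.add((x, y))
    return coords


def nearest_template_distance(i, j, template_coords):
    """Distance from block pixel (i, j) to its closest template pixel."""
    return min(math.hypot(i - x, j - y) for x, y in template_coords)


if __name__ == "__main__":
    template = l_shaped_template(4, 4)
    for j in range(4):
        print([nearest_template_distance(i, j, template) for i in range(4)])
```

Under these assumptions, pixels next to the upper or left edge of the block are at distance 1 and the lower right pixel of a 4×4 block is at distance 4.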
Optionally, in some possible implementation manners of the present application, the calculating a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N} \left(w_m(i,j)\,p_m(i,j)\right)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, pm(i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
It can be understood that, for a pixel area whose coordinates are (i,j), when the pixel area includes multiple pixels, i and j in (i,j) fall within specific value ranges instead of being fixed values.
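The weighted prediction can be illustrated with a short Python sketch that evaluates the normalized weighted sum for every pixel position. It assumes, for illustration only, that each pixel area is a single pixel and that the N picture blocks and their per-pixel weights are given as nested lists indexed as [m][j][i]; the function name is hypothetical.

```python
def predict_block(weights, blocks):
    """Predicted value per pixel: sum_m(w_m * p_m) / sum_m(w_m).

    weights -- weights[m][j][i], weight of pixel (i, j) in picture block m
    blocks  -- blocks[m][j][i], pixel value of pixel (i, j) in block m
    """
    n = len(blocks)
    if n == 0 or len(weights) != n:
        raise ValueError("need one weight map per picture block")
    height, width = len(blocks[0]), len(blocks[0][0])
    pred = []
    for j in range(height):
        row = []
        for i in range(width):
            num = sum(weights[m][j][i] * blocks[m][j][i] for m in range(n))
            den = sum(weights[m][j][i] for m in range(n))
            row.append(num / den)
        pred.append(row)
    return pred


if __name__ == "__main__":
    blocks = [[[100, 100]], [[200, 200]]]        # two 1x2 picture blocks
    weights = [[[1.0, 3.0]], [[1.0, 1.0]]]       # weights differ per pixel
    print(predict_block(weights, blocks))
```

In the usage lines, the first block has different weights at its two pixels, so the two predicted values differ even though both picture blocks are flat.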
The picture prediction method provided in this embodiment may be applied in a video coding process, or may be applied in a video decoding process.
It can be learned that, in the picture prediction solution in this embodiment, after M templates with a highest degree of matching with a current template are determined, N templates whose degrees of matching with the current template meet a preset condition are further determined from the M templates, a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates. The pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of another picture prediction method according to another embodiment of the present application. As shown in FIG. 3, the video coding method provided in this other embodiment of the present application may include the following steps.
301. A video coding apparatus determines M templates with a highest degree of matching with a current template.
M may be a specified value, that is, the M templates with the highest degree of matching with the current template may be selected from several candidate templates. For example, five templates with a highest degree of matching with the current template may be determined from ten candidate templates. Certainly, degrees of matching between the five templates and the current template are not necessarily equal, but a degree of matching between any template in the five templates and the current template is greater than or equal to a degree of matching between any candidate template except the five templates and the current template.
The current template is a template corresponding to a current picture block. The M templates are obtained by searching reference pictures of the current picture block. M is a positive integer.
For example, M may be equal to 2, 3, 4, 5, 7, 9, 12, or another value.
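Step 301 can be sketched as a selection of the M candidates with the lowest distortion. The sketch assumes that a lower sum of absolute differences (SAD) means a higher degree of matching and that the candidate templates have already been gathered from the reference pictures; neither the metric nor the function name comes from the source.

```python
import heapq


def best_m_templates(cur_template, candidates, m):
    """Indices of the m candidates that best match cur_template.

    Matching is measured by SAD (an assumption); the result is ordered
    from the best match to the worst among the selected m.
    """
    def sad(k):
        return sum(abs(x - y) for x, y in zip(cur_template, candidates[k]))

    return heapq.nsmallest(m, range(len(candidates)), key=sad)


if __name__ == "__main__":
    current = [10, 10]
    candidates = [[10, 12], [50, 50], [10, 10], [11, 10]]
    print(best_m_templates(current, candidates, 2))
```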
302. The video coding apparatus determines, from the M templates, N templates that meet a preset condition.
For example, that the N templates that meet the preset condition are determined from the M templates includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold. For example, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates. Certainly, the threshold may be another preset value. It can be understood that the screening threshold is introduced to screen out templates with a relatively low matching degree, so that such templates are not involved in the operation. In this way, prediction accuracy is improved and operation complexity is reduced.
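The screening in step 302 can be sketched as a threshold test against the average distortion of the M templates. Treating the "adjustment value" as a multiplicative factor applied to the average is an assumption made for this illustration, as are the function and parameter names.

```python
def screen_templates(distortions, adjustment=1.0):
    """Indices of templates whose distortion is at most the threshold.

    distortions -- distortion between the current template and each of the
                   M templates
    adjustment  -- hypothetical factor applied to the average distortion
    """
    if not distortions:
        return []
    threshold = adjustment * sum(distortions) / len(distortions)
    return [m for m, e in enumerate(distortions) if e <= threshold]


if __name__ == "__main__":
    distortions = [10, 20, 30, 100]          # average is 40
    print(screen_templates(distortions))      # drops the outlier
    print(screen_templates(distortions, 0.5)) # stricter threshold of 20
```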
303. The video coding apparatus determines a weight of each template in the N templates.
For example, the weight of each template in the N templates may be determined according to a degree of matching between each template in the N templates and the current template.
304. The video coding apparatus determines, based on the weight of each template in the N templates, a weight of a pixel area in a picture block corresponding to each template in the N templates.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be determined based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
Weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different.
The pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
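When a pixel area is a pixel block rather than a single pixel, the coordinates (i,j) cover ranges of values. The following sketch partitions a picture block into pixel areas and returns those ranges. The row-by-row tiling order and the clipping of areas at the block boundary are assumptions chosen for the illustration, and the function name is hypothetical.

```python
def pixel_areas(block_w, block_h, area_w=2, area_h=2):
    """Split a block into pixel areas, returned as (i_range, j_range) pairs.

    Areas are tiled row by row from the upper left corner; areas on the
    right or bottom edge are clipped if the sizes do not divide evenly.
    """
    areas = []
    for j0 in range(0, block_h, area_h):
        for i0 in range(0, block_w, area_w):
            i_range = range(i0, min(i0 + area_w, block_w))
            j_range = range(j0, min(j0 + area_h, block_h))
            areas.append((i_range, j_range))
    return areas


if __name__ == "__main__":
    for i_range, j_range in pixel_areas(4, 4, 2, 2):
        print(list(i_range), list(j_range))
```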
The templates in the N templates correspond to different picture blocks, and therefore, the N templates correspond to N picture blocks. A template may be L-shaped or in another shape. For example, as shown in FIG. 1-d, a template corresponding to a picture block may be an L-shaped pixel area adjacent to the picture block. For example, in FIG. 1-d, an L-shaped pixel area Tx adjacent to a picture block x is a template Tx corresponding to the picture block x, an L-shaped pixel area Tm1 adjacent to a picture block m1 is a template Tm1 corresponding to the picture block m1, and an L-shaped pixel area Tm2 adjacent to a picture block m2 is a template Tm2 corresponding to the picture block m2. There may be no overlap area between a picture block and a template corresponding to the picture block.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block. A parameter about a similarity between two objects may be used to represent the similarity between the two objects. For example, a parameter about a similarity between a template Tm in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template Tm and the pixel area w in the picture block m. The picture block m is a picture block corresponding to the template Tm. The template Tm may be any template in the N templates. The pixel area w is any pixel area in the picture block m.
In a specific example, the parameter about the similarity between the template Tm and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template Tm, or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template Tm to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template Tm and a pixel value of the pixel area w. That is, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm and the distance between the pixel area w and the determined pixel area in the template Tm. Alternatively, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm in the N templates and the ratio of the average or the weighted average of the pixel values of the determined pixel area in the template Tm to the pixel value of the pixel area w or the absolute difference between the average or the weighted average of the pixel values of the determined pixel area in the template Tm and the pixel value of the pixel area w.
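The value-based similarity parameters in this example can be sketched as follows, using the ratio of the template average to the pixel value, or the absolute difference between them. The function name, the mode strings, and the optional per-pixel weights for the weighted average are hypothetical and serve only as an illustration.

```python
def value_similarity(template_pixels, pixel_value, mode="abs_diff",
                     weights=None):
    """Similarity parameter between a template region and a pixel area.

    template_pixels -- pixel values of the determined pixel area in Tm
    pixel_value     -- pixel value of the pixel area w in picture block m
    mode            -- "abs_diff" or "ratio" (hypothetical names)
    weights         -- optional weights for a weighted average
    """
    if not template_pixels:
        raise ValueError("template region must not be empty")
    if weights is None:
        weights = [1.0] * len(template_pixels)
    if len(weights) != len(template_pixels):
        raise ValueError("need one weight per template pixel")
    avg = sum(w * p for w, p in zip(weights, template_pixels)) / sum(weights)
    if mode == "abs_diff":
        return abs(avg - pixel_value)
    if mode == "ratio":
        return avg / pixel_value      # not defined for a pixel value of 0
    raise ValueError("mode must be 'abs_diff' or 'ratio'")


if __name__ == "__main__":
    region = [100, 110, 120]
    print(value_similarity(region, 100))                     # plain average
    print(value_similarity(region, 100, "ratio"))
    print(value_similarity(region, 100, weights=[1, 1, 2]))  # weighted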
Optionally, in some possible implementation manners of the present application, the determined pixel area in the foregoing template Tm may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template Tm.
The upper left pixel area in the template Tm is the upper left pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper left pixel in the template Tm; the lower left pixel area in the template Tm is the lower left pixel in the template Tm or a pixel block that is in the template Tm and that includes the lower left pixel in the template Tm; the upper right pixel area in the template Tm is the upper right pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper right pixel in the template Tm; the center pixel area in the template Tm is the center pixel in the template Tm or a pixel block that is in the template Tm and that includes the center pixel in the template Tm.
The distance between the pixel area w and the determined pixel area in the template Tm may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template Tm, where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template Tm, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template Tm (where the determined pixel in the template Tm is, for example, the upper left pixel, the lower left pixel, the upper right pixel, the lower right pixel, or the center pixel in the template Tm). Certainly, the distance between the pixel area w and the determined pixel area in the template Tm may be calculated in another manner.
305. The video coding apparatus calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
Optionally, in some possible implementation manners of the present application, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
Optionally, in some possible implementation manners of the present application, that the weight of each template in the N templates is determined according to the degree of matching between each template in the N templates and the current template may include: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x,T_m)/S\right)\times\sigma},$$
where
E(Tx,Tm) represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates. When a value of a is, for example, an integer such as 2, 3, or 4, operation complexity is simplified.
Optionally, in some possible implementation manners of the present application, that the weight of the pixel area in the picture block corresponding to each template in the N templates is determined based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: determining, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = w_m^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
Optionally, in some possible implementation manners of the present application, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
For example, the linear relationship between ∂(R(i,j)) and R(i,j) is:
∂(R(i,j))=β*R(i,j),
where β is a scaling coefficient, and β is a real number greater than 0.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases},$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases},$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases},$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
The foregoing examples show that the non-linear relationship between ∂(R(i,j)) and R(i,j) may be a piecewise function relationship. A piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
Optionally, in some possible implementation manners of the present application, R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm, or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
For example, d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
Optionally, in some possible implementation manners of the present application, that the predicted pixel value of the pixel area in the current picture block is calculated based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N} \left(w_m(i,j)\,p_m(i,j)\right)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, pm(i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
It can be understood that, for a pixel area whose coordinates are (i,j), when the pixel area includes multiple pixels, i and j in (i,j) fall within specific value ranges instead of being fixed values.
306. The video coding apparatus obtains a prediction residual of the current picture block by using an original pixel value of the pixel area in the current picture block and the predicted pixel value of the pixel area in the current picture block.
307. The video coding apparatus writes the prediction residual of the current picture block into a video bitstream.
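Step 306 can be sketched as a per-pixel subtraction of the predicted value from the original value. Rounding the prediction to an integer before subtracting is an assumption made for this sketch, and the function name is hypothetical; how the residual is then represented in the bitstream in step 307 is outside the scope of the sketch.

```python
def prediction_residual(original, predicted):
    """Residual per pixel: original value minus rounded predicted value.

    original  -- original[j][i], original pixel values of the current block
    predicted -- predicted[j][i], predicted values (may be fractional)
    """
    if len(original) != len(predicted):
        raise ValueError("blocks must have the same number of rows")
    return [
        [o - round(p) for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, predicted)
    ]


if __name__ == "__main__":
    original = [[120, 130], [125, 128]]
    predicted = [[118.4, 131.6], [125.0, 126.2]]
    print(prediction_residual(original, predicted))
```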
The picture prediction method provided in this embodiment may be applied in a video coding process, or may be applied in a video decoding process.
It can be learned that, in the picture coding solution in this embodiment, after M templates with a highest degree of matching with a current template are determined, N templates whose degrees of matching with the current template meet a preset condition are further determined from the M templates, a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates. The pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of another picture prediction method according to another embodiment of the present application. As shown in FIG. 4, the video decoding method provided in this other embodiment of the present application may include the following steps.
401. A video decoding apparatus determines M templates with a highest degree of matching with a current template.
M may be a specified value, that is, the M templates with the highest degree of matching with the current template may be selected from several candidate templates. For example, five templates with a highest degree of matching with the current template may be determined from ten candidate templates. Certainly, degrees of matching between the five templates and the current template are not necessarily equal, but a degree of matching between any template in the five templates and the current template is greater than or equal to a degree of matching between any candidate template except the five templates and the current template.
The current template is a template corresponding to a current picture block. The M templates are obtained by searching reference pictures of the current picture block. M is a positive integer.
For example, M may be equal to 2, 3, 4, 5, 7, 9, 12, or another value.
402. The video decoding apparatus determines, from the M templates, N templates that meet a preset condition.
For example, that the N templates that meet the preset condition are determined from the M templates includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold. For example, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates. Certainly, the threshold may be another preset value. It can be understood that the screening threshold is introduced to screen out templates with a relatively low matching degree, so that such templates are not involved in the operation. In this way, prediction accuracy is improved and operation complexity is reduced.
403. The video decoding apparatus determines a weight of each template in the N templates.
For example, the weight of each template in the N templates may be determined according to a degree of matching between each template in the N templates and the current template.
404. The video decoding apparatus determines, based on the weight of each template in the N templates, a weight of a pixel area in a picture block corresponding to each template in the N templates.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be determined based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block.
Weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different.
The pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
The templates in the N templates correspond to different picture blocks, and therefore, the N templates correspond to N picture blocks. A template may be L-shaped or in another shape. For example, as shown in FIG. 1-d, a template corresponding to a picture block may be an L-shaped pixel area adjacent to the picture block. For example, in FIG. 1-d, an L-shaped pixel area Tx adjacent to a picture block x is a template Tx corresponding to the picture block x, an L-shaped pixel area Tm1 adjacent to a picture block m1 is a template Tm1 corresponding to the picture block m1, and an L-shaped pixel area Tm2 adjacent to a picture block m2 is a template Tm2 corresponding to the picture block m2. There may be no overlap area between a picture block and a template corresponding to the picture block.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block. A parameter about a similarity between two objects may be used to represent the similarity between the two objects. For example, a parameter about a similarity between a template Tm in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template Tm and the pixel area w in the picture block m. The picture block m is a picture block corresponding to the template Tm. The template Tm may be any template in the N templates. The pixel area w is any pixel area in the picture block m.
In a specific example, the parameter about the similarity between the template Tm and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template Tm, or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template Tm to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template Tm and a pixel value of the pixel area w. That is, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm and the distance between the pixel area w and the determined pixel area in the template Tm. Alternatively, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm in the N templates and the ratio of the average or the weighted average of the pixel values of the determined pixel area in the template Tm to the pixel value of the pixel area w or the absolute difference between the average or the weighted average of the pixel values of the determined pixel area in the template Tm and the pixel value of the pixel area w.
Optionally, in some possible implementation manners of the present application, the determined pixel area in the foregoing template Tm may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template Tm.
The upper left pixel area in the template Tm is the upper left pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper left pixel in the template Tm; the lower left pixel area in the template Tm is the lower left pixel in the template Tm or a pixel block that is in the template Tm and that includes the lower left pixel in the template Tm; the upper right pixel area in the template Tm is the upper right pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper right pixel in the template Tm; the center pixel area in the template Tm is the center pixel in the template Tm or a pixel block that is in the template Tm and that includes the center pixel in the template Tm.
The distance between the pixel area w and the determined pixel area in the template Tm may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template Tm, where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template Tm, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template Tm (where the determined pixel in the template Tm is, for example, the upper left pixel, the lower left pixel, the upper right pixel, the lower right pixel, or the center pixel in the template Tm). Certainly, the distance between the pixel area w and the determined pixel area in the template Tm may be calculated in another manner.
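Two of the distance options described above can be sketched as follows. This is an illustrative sketch only: the text does not fix a distance metric, so the Euclidean metric, the (row, column) coordinate convention, and the function names `closest_pixel_distance` and `anchor_distance` are assumptions made for this example.

```python
import math

def closest_pixel_distance(area_w, area_t):
    """Distance between the two closest pixels of two pixel areas.

    Each area is a list of (row, col) pixel coordinates.
    Euclidean distance is an assumption; the text leaves the metric open.
    """
    return min(
        math.hypot(r1 - r2, c1 - c2)
        for (r1, c1) in area_w
        for (r2, c2) in area_t
    )

def anchor_distance(pixel_w, pixel_t):
    """Distance between one determined pixel of the pixel area w
    (for example its upper left pixel) and one determined pixel of the template."""
    return math.hypot(pixel_w[0] - pixel_t[0], pixel_w[1] - pixel_t[1])
```

With either option, a pixel area that lies farther from the template yields a larger distance, which is the property the weighting scheme relies on.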
405. The video decoding apparatus calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates.
Optionally, in some possible implementation manners of the present application, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
Optionally, in some possible implementation manners of the present application, that the weight of each template in the N templates is determined according to the degree of matching between each template in the N templates and the current template may include: determining the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x,T_m)/S\right)\times\sigma},$$
where
$E(T_x,T_m)$ represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates. When a value of a is, for example, an integer such as 2, 3, or 4, operation complexity is simplified.
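The template weight can be sketched as below, reading the formula above as the base a raised to the power −(E(Tx,Tm)/S)×σ. That reading, the function name `template_weight`, and the default values of `a` and `sigma` are assumptions made for this example and are not values given in the text.

```python
def template_weight(distortion, num_pixel_areas, a=2.0, sigma=1.0):
    """Weight of a template Tm from its distortion against the current template Tx.

    distortion      -- E(Tx, Tm), a non-negative number
    num_pixel_areas -- S, the quantity of pixel areas in the current template
    a, sigma        -- real numbers greater than 0 (defaults are illustrative only)
    """
    if num_pixel_areas <= 0 or a <= 0 or sigma <= 0:
        raise ValueError("S, a and sigma must be greater than 0")
    # Assumed reading of the formula: w_m = a ** (-(E / S) * sigma)
    return a ** (-(distortion / num_pixel_areas) * sigma)
```

Under this reading and with a greater than 1, a perfectly matching template (zero distortion) receives weight 1, and the weight decays as the per-area distortion grows.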
Optionally, in some possible implementation manners of the present application, that the weight of the pixel area in the picture block corresponding to each template in the N templates is determined based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: determining, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = w_m^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
Optionally, in some possible implementation manners of the present application, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
For example, the linear relationship between ∂(R(i,j)) and R(i,j) is:
∂(R(i,j))=β*R(i,j),
where β is a scaling coefficient, and β is a real number greater than 0.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases},$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases},$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases},$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
The foregoing examples show that the non-linear relationship between ∂(R(i,j)) and R(i,j) may be a piecewise function relationship. The quantity of pieces of the piecewise function is not limited to 2, 3, or 4 as shown in the foregoing examples, and certainly, may be larger.
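The linear and piecewise scaling factors can be sketched as follows. The function names and the list-based parameter layout are assumptions made for this example, and the interval boundaries are treated as inclusive on their upper end, which is also an assumption.

```python
from bisect import bisect_left

def linear_scaling_factor(r, beta):
    """Linear case: the scaling factor equals beta * R(i,j), with beta > 0."""
    return beta * r

def piecewise_scaling_factor(d, thresholds, values):
    """Piecewise case with any quantity of pieces.

    thresholds -- increasing breakpoints, e.g. [b1, b2]
    values     -- increasing factors, e.g. [a1, a2, a3]; one more entry than thresholds
    Returns a1 when d <= b1, a2 when b1 < d <= b2, and so on,
    and the last value when d exceeds the last breakpoint.
    """
    if len(values) != len(thresholds) + 1:
        raise ValueError("values must have exactly one more entry than thresholds")
    # bisect_left gives the index of the first breakpoint that is >= d
    return values[bisect_left(thresholds, d)]
```

The same function covers the two-piece, three-piece, and four-piece examples above by passing one, two, or three breakpoints.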
Optionally, in some possible implementation manners of the present application, R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
For example, d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
Optionally, in some possible implementation manners of the present application, that the predicted pixel value of the pixel area in the current picture block is calculated based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates includes: calculating the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N}\left(w_m(i,j)\,p_m(i,j)\right)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, pm(i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
It can be understood that, for a pixel area whose coordinates are (i,j), when the pixel area includes multiple pixels, i and j in (i,j) fall within specific value ranges instead of being fixed values.
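The weighted prediction in step 405 is a per-position weighted average over the N picture blocks, which can be sketched as below. The nested-list layout (`weights[m][i][j]`, `blocks[m][i][j]`) and the function name `predict_block` are assumptions made for this example, and each pixel area is taken to be a single pixel for simplicity.

```python
def predict_block(weights, blocks):
    """Predicted pixel values of the current picture block.

    weights -- weights[m][i][j]: weight of position (i,j) in picture block m
    blocks  -- blocks[m][i][j]: pixel value of position (i,j) in picture block m
    Returns pred[i][j] = sum_m(w * p) / sum_m(w).
    """
    height = len(blocks[0])
    width = len(blocks[0][0])
    pred = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            numerator = sum(w[i][j] * p[i][j] for w, p in zip(weights, blocks))
            denominator = sum(w[i][j] for w in weights)
            pred[i][j] = numerator / denominator
    return pred
```

When all weights are equal, this reduces to a plain average of the N picture blocks, which corresponds to the conventional equal-weight case that the embodiment contrasts itself with.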
406. The video decoding apparatus decodes a video bitstream to obtain a prediction residual of the current picture block.
407. The video decoding apparatus reconstructs the current picture block by using the predicted pixel value of the current picture block and the prediction residual of the current picture block.
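Step 407 can be sketched as adding the decoded residual to the prediction. The rounding to integers and the clipping to an 8-bit sample range are assumptions added for this example; the text above does not specify them, and the function name `reconstruct_block` is hypothetical.

```python
def reconstruct_block(predicted, residual, max_value=255):
    """Reconstruct a picture block from its prediction and prediction residual.

    predicted -- 2-D list of predicted pixel values
    residual  -- 2-D list of decoded prediction residuals, same shape
    max_value -- upper bound of the sample range (8-bit range assumed here)
    """
    return [
        [min(max(int(round(p + r)), 0), max_value) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(predicted, residual)
    ]
```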
It can be learned that, in the picture decoding solution in this embodiment, after M templates with a highest degree of matching with a current template are determined, N templates whose degrees of matching with the current template meet a preset condition are further determined from the M templates, a weight of a pixel area in a picture block corresponding to each template in the N templates is determined, and a predicted pixel value of a pixel area in the current picture block is calculated based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates. The pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
Performance is improved by implementing the foregoing solutions under some test conditions.
Under a condition 1, that is, in an application scenario in which N templates that meet a condition are directly determined from candidate templates, and there is a linear relationship between ∂(R(i,j)) and R(i,j) (for example, ∂(R(i,j))=β*R(i,j)), coding efficiency for each color component is improved as follows: Y—8.3%, U—7.8%, and V—7.5%.
Under a condition 2, that is, in an application scenario in which N templates that meet a condition are directly determined from candidate templates, and there is a piecewise function relationship between ∂(R(i,j)) and R(i,j), coding efficiency for each color component is improved as follows: Y—8.1%, U—7.4%, and V—7.5%.
Under a condition 3, that is, in an application scenario in which M templates are first determined from candidate templates, N templates that meet a condition are then determined from the M templates, and there is a linear relationship between ∂(R(i,j)) and R(i,j) (for example, ∂(R(i,j))=β*R(i,j)), coding efficiency for each color component is improved as follows: Y—8.5%, U—8.0%, and V—7.7%.
Rate-distortion performance improvement effects obtained by means of testing based on a ClassF sequence are shown in the following table.
Sequence                  Condition 1   Condition 2   Condition 3
BasketballDrillText          −2.9%         −3.3%         −3.3%
ChinaSpeed                   −4.0%         −3.6%         −4.2%
SlideEditing                −19.2%        −18.5%        −19.4%
SlideShow                    −7.0%         −6.9%         −7.1%
ClassF sequence average      −8.3%         −8.1%         −8.5%
It can be learned that, under a test condition, coding efficiency and rate-distortion performance can be greatly improved in the solutions in the embodiments of the present application.
Related apparatuses used to implement the foregoing solutions are further provided in the following.
Referring to FIG. 5, an embodiment of the present application further provides a picture prediction apparatus 500, and the apparatus 500 may include:
a first determining unit 510, configured to determine N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, the N templates are obtained by searching reference pictures of the current picture block, and N is a positive integer;
a second determining unit 520, configured to determine a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and
a predicting unit 530, configured to calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
Optionally, in some possible implementation manners of the present application, the second determining unit 520 includes:
a first determining subunit, configured to determine a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and
a second determining subunit, configured to determine, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
Optionally, in some possible implementation manners of the present application, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block includes: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
Optionally, in some possible implementation manners of the present application, the first determining subunit is specifically configured to determine the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x,T_m)/S\right)\times\sigma},$$
where
$E(T_x,T_m)$ represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates.
Optionally, in some possible implementation manners of the present application, the second determining subunit is specifically configured to determine, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = w_m^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
Optionally, in some possible implementation manners of the present application, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
Optionally, in some possible implementation manners of the present application, the linear relationship between ∂(R(i,j)) and R(i,j) is: ∂(R(i,j))=β*R(i,j), where β is a scaling coefficient, and β is a real number greater than 0.
Optionally, in some possible implementation manners of the present application, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
Optionally, in some possible implementation manners of the present application, the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases},$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0; or
the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases},$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0; or
the non-linear relationship between ∂(R(i,j)) and R(i,j) is:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases},$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
Optionally, in some possible implementation manners of the present application, R(i,j) is equal to d(i,j) or e(i,j), d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
Optionally, in some possible implementation manners of the present application, d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
Optionally, in some possible implementation manners of the present application, the predicting unit 530 is specifically configured to calculate the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N}\left(w_m(i,j)\,p_m(i,j)\right)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, pm(i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
Optionally, in some possible implementation manners of the present application, the first determining unit 510 is specifically configured to: determine M templates with a highest degree of matching with the current template; and determine, from the M templates, N templates that meet the preset condition, where N is less than M.
Optionally, in some possible implementation manners of the present application, the first determining unit 510 is specifically configured to: determine the M templates with the highest degree of matching with the current template; and determine N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
Optionally, in some possible implementation manners of the present application, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
Optionally, in some possible implementation manners of the present application, the picture prediction apparatus is applied to a video coding apparatus, or the picture prediction apparatus is applied to a video decoding apparatus.
It can be understood that functions of function modules of the picture prediction apparatus 500 in this embodiment may be specifically implemented according to the methods in the foregoing method embodiments. For a specific implementation process, refer to related descriptions in the foregoing method embodiments. Details are not described herein again. The picture prediction apparatus 500 may be any apparatus that needs to output or play a video, for example, a device such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.
It can be learned that, after determining N templates whose degrees of matching with a current template meet a preset condition, the picture prediction apparatus 500 in this embodiment determines a weight of a pixel area in a picture block corresponding to each template in the N templates, and calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates. The pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
Referring to FIG. 6, FIG. 6 is a schematic diagram of a picture prediction apparatus 600 according to an embodiment of the present application. The picture prediction apparatus 600 may include at least one bus 601, at least one processor 602 connected to the bus 601, and at least one memory 603 connected to the bus 601.
The processor 602 invokes, by using the bus 601, code stored in the memory 603, so as to determine N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block; determine a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
For example, N may be equal to 1, 2, 3, 4, 5, 7, 9, 12, or another value.
The pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
The templates in the N templates correspond to different picture blocks, and therefore, the N templates correspond to N picture blocks.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block. A parameter about a similarity between two objects may be used to represent the similarity between the two objects. For example, a parameter about a similarity between a template Tm in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template Tm and the pixel area w in the picture block m. The picture block m is a picture block corresponding to the template Tm. The template Tm may be any template in the N templates. The pixel area w is any pixel area in the picture block m.
In a specific example, the parameter about the similarity between the template Tm and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template Tm or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template Tm to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template Tm and a pixel value of the pixel area w. That is, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm and the distance between the pixel area w and the determined pixel area in the template Tm. Alternatively, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm in the N templates and the ratio of the average or the weighted average of the pixel values of the determined pixel area in the template Tm to the pixel value of the pixel area w or the absolute difference between the average or the weighted average of the pixel values of the determined pixel area in the template Tm and the pixel value of the pixel area w.
Optionally, in some possible implementation manners of the present application, the determined pixel area in the foregoing template Tm may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template Tm.
The upper left pixel area in the template Tm is the upper left pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper left pixel in the template Tm; the lower left pixel area in the template Tm is the lower left pixel in the template Tm or a pixel block that is in the template Tm and that includes the lower left pixel in the template Tm; the upper right pixel area in the template Tm is the upper right pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper right pixel in the template Tm; the center pixel area in the template Tm is the center pixel in the template Tm or a pixel block that is in the template Tm and that includes the center pixel in the template Tm.
The distance between the pixel area w and the determined pixel area in the template Tm may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template Tm, where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template Tm, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template Tm (where the determined pixel in the template Tm is, for example, the upper left pixel, the lower left pixel, the upper right pixel, the lower right pixel, or the center pixel in the template Tm). Certainly, the distance between the pixel area w and the determined pixel area in the template Tm may be calculated in another manner.
Optionally, in some possible implementation manners of the present application, the determining of the N templates whose degrees of matching with the current template meet the preset condition includes: determining N templates with a highest degree of matching with the current template; or determining M templates with a highest degree of matching with the current template, and determining, from the M templates, N templates that meet the preset condition, where N is less than M.
For example, the processor 602 is configured to determine N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold. For example, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates. Certainly, the threshold may be another preset value. It can be understood that the screening threshold is introduced to screen out templates with a relatively low matching degree so that they are not involved in the operation, so that prediction accuracy is improved, and operation complexity is reduced.
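The two-stage selection (M best matches first, then N templates at or below a threshold) can be sketched as below. The sum of absolute differences as the distortion measure, the multiplicative `scale` used to model the "adjustment value" of the average distortion, and the function names are all assumptions made for this example.

```python
def sad(template_a, template_b):
    """Sum of absolute differences between two templates given as flat lists.
    SAD is an assumed distortion measure; the text only says 'distortion'."""
    return sum(abs(x - y) for x, y in zip(template_a, template_b))

def select_templates(current, candidates, m, scale=1.0):
    """Return the indices of the N templates kept out of the M best matches.

    current    -- pixel values of the current template
    candidates -- list of candidate templates found in the reference pictures
    m          -- quantity M of best-matching templates to keep in the first stage
    scale      -- hypothetical adjustment applied to the average distortion
    """
    ranked = sorted((sad(current, c), idx) for idx, c in enumerate(candidates))
    best_m = ranked[:m]
    threshold = scale * sum(d for d, _ in best_m) / len(best_m)
    # Keep only templates whose distortion does not exceed the threshold
    return [idx for d, idx in best_m if d <= threshold]
```

Because the threshold is derived from the average distortion of the M templates, at least one template always passes, and poorly matching templates among the M are dropped before any weights are computed.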
There may be diverse manners of determining, by the processor 602, the weight of the pixel area in the picture block corresponding to each template in the N templates.
For example, the determining, by the processor 602, of the weight of the pixel area in the picture block corresponding to each template in the N templates may include: determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
Optionally, in some possible implementation manners of the present application, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
Optionally, in some possible implementation manners of the present application, the processor 602 may be configured to determine the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x, T_m)/S\right)\times\sigma},$$
where
$E(T_x, T_m)$ represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates. When a value of a is, for example, an integer such as 2, 3, or 4, operation complexity is simplified.
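As a small worked sketch of the template weight, the snippet below reads the formula as the base a raised to the power −(E/S)×σ. That reading, the function name `template_weight`, and the sample numbers are assumptions made for illustration.

```python
def template_weight(distortion, num_pixel_areas, a=2.0, sigma=1.0):
    """Weight of one template, read as a ** (-(E / S) * sigma).

    distortion:      E(Tx, Tm), distortion between the current template and Tm.
    num_pixel_areas: S, the quantity of pixel areas in the current template.
    a, sigma:        base and template scaling factor, both greater than 0.
    """
    if a <= 0 or sigma <= 0:
        raise ValueError("a and sigma must be greater than 0")
    if num_pixel_areas <= 0:
        raise ValueError("num_pixel_areas must be positive")
    return a ** (-(distortion / num_pixel_areas) * sigma)

# E = 8, S = 4, a = 2, sigma = 0.5 gives 2 ** (-1) = 0.5.
w = template_weight(distortion=8, num_pixel_areas=4, a=2, sigma=0.5)
print(w)  # 0.5
```

Under this reading, a template with zero distortion receives a weight of 1, and for a greater than 1 the weight decays as the per-area distortion grows.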
Optionally, in some possible implementation manners of the present application, the processor 602 may be configured to determine, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = (w_m)^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
Optionally, in some possible implementation manners of the present application, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
For example, the linear relationship between ∂(R(i,j)) and R(i,j) is:
∂(R(i,j))=β*R(i,j),
where β is a scaling coefficient, and β is a real number greater than 0.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases},$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
For example, a1=0.3, a2=0.5, and a3=1. For example, b1=4, and b2=8.
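The piecewise mapping from distance to pixel area scaling factor can be sketched as a simple interval lookup. The function name `scaling_factor` and its parameter names are hypothetical; the default bounds and values are the example numbers b1=4, b2=8 and a1=0.3, a2=0.5, a3=1 given above.

```python
from bisect import bisect_left

def scaling_factor(d, bounds=(4, 8), values=(0.3, 0.5, 1.0)):
    """Piecewise-constant scaling factor for a distance d.

    bounds: ascending interval boundaries (b1, b2, ...).
    values: one scaling factor per interval (a1, a2, ...), so there is
            exactly one more value than there are boundaries.
    """
    if len(values) != len(bounds) + 1:
        raise ValueError("need exactly one more value than boundaries")
    # bisect_left returns 0 for d <= b1, 1 for b1 < d <= b2, and so on,
    # which matches the closed upper ends of the intervals.
    return values[bisect_left(bounds, d)]

for d in (4, 5, 8, 9):
    print(d, scaling_factor(d))
```

Because the boundaries and values are passed as sequences, the same lookup covers piecewise functions with two, four, or more pieces.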
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases},$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
For example, a4=0.3, a5=0.5, a6=0.8, and a7=1. For example, b3=4, b4=5, and b5=6.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases},$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
For example, a8=0.5, and a9=1. For example, b6=5.
The foregoing examples show that the non-linear relationship between ∂(R(i,j)) and R(i,j) may be a piecewise function relationship. A piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
Optionally, in some possible implementation manners of the present application, R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm, or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
For example, d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
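The nearest-pixel variant of d(i,j) can be illustrated with a short sketch. Several details are assumptions made only for this example: the template is taken to be an L-shaped region above and to the left of the block, (i,j) is treated as (row, column) with the block's upper left pixel at (0,0), and the distance is Euclidean. The text above does not fix any of these choices.

```python
import math

def l_shaped_template(block_size, thickness):
    """Coordinates (row, col) of a hypothetical L-shaped template that
    borders the top and left edges of a square block."""
    return [(r, c)
            for r in range(-thickness, block_size)
            for c in range(-thickness, block_size)
            if r < 0 or c < 0]

def nearest_template_distance(i, j, template_pixels):
    """Distance from block pixel (i, j) to the closest template pixel y."""
    return min(math.hypot(i - r, j - c) for r, c in template_pixels)

template = l_shaped_template(block_size=4, thickness=2)
print(nearest_template_distance(0, 0, template))  # 1.0
print(nearest_template_distance(2, 3, template))  # 3.0
```

Pixels near the top or left edge of the block lie close to the template, while pixels toward the lower right corner are farther away, which is what allows the weight to vary within one picture block.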
Optionally, in some possible implementation manners of the present application, the processor 602 is configured to calculate the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N} w_m(i,j)\, p_m(i,j)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, pm(i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
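The prediction formula is a per-position weighted average over the N picture blocks, which can be sketched as follows. The function name `predict_block` and the nested-list layout (one two-dimensional grid per template) are assumptions for illustration, and the weights and pixel values are made-up inputs.

```python
def predict_block(weight_maps, pixel_blocks):
    """Weighted average over N picture blocks, computed per position (i, j).

    weight_maps:  N grids, weight_maps[m][i][j] is the weight of pixel area
                  (i, j) in picture block m.
    pixel_blocks: N grids, pixel_blocks[m][i][j] is the pixel value of pixel
                  area (i, j) in picture block m.
    """
    rows, cols = len(pixel_blocks[0]), len(pixel_blocks[0][0])
    predicted = []
    for i in range(rows):
        row = []
        for j in range(cols):
            numerator = sum(w[i][j] * p[i][j]
                            for w, p in zip(weight_maps, pixel_blocks))
            denominator = sum(w[i][j] for w in weight_maps)
            row.append(numerator / denominator)
        predicted.append(row)
    return predicted

# Two templates (N = 2), each with a 1x2 picture block.
weights = [[[1.0, 3.0]], [[3.0, 1.0]]]
pixels = [[[100, 100]], [[200, 200]]]
print(predict_block(weights, pixels))  # [[175.0, 125.0]]
```

If every weight is set to the same value, the result reduces to the plain average of the N picture blocks, which corresponds to the equal-weight case that the embodiment contrasts itself with.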
It can be understood that functions of function modules of the picture prediction apparatus 600 in this embodiment may be specifically implemented according to the methods in the foregoing method embodiments. For a specific implementation process, refer to related descriptions in the foregoing method embodiments. Details are not described herein again. The picture prediction apparatus 600 may be any apparatus that needs to output or play a video, for example, a device such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.
It can be learned that, after determining N templates whose degrees of matching with a current template meet a preset condition, the picture prediction apparatus 600 in this embodiment determines a weight of a pixel area in a picture block corresponding to each template in the N templates, and calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates. The pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
Referring to FIG. 7, FIG. 7 is a structural block diagram of a picture prediction apparatus 700 according to another embodiment of the present application. The picture prediction apparatus 700 may include at least one processor 701, at least one memory 705, and at least one communications bus 702. The communications bus 702 is configured to implement a connection and communication between these components. Optionally, the picture prediction apparatus 700 may include at least one network interface 704 and/or at least one user interface 703, and the user interface 703 may include a display (for example, a touchscreen, an LCD, a holographic imaging device, a CRT, a projector), a click device (for example, a mouse, a trackball, a touchpad, a touchscreen), a camera, and/or a pickup apparatus.
The memory 705 may include a read-only memory and a random access memory, and provide an instruction and data for the processor 701. A part of the memory 705 may further include a non-volatile random access memory.
In some implementation manners, the memory 705 stores the following elements: an executable module or a data structure, or a subset of an executable module and a data structure, or an extended set of an executable module and a data structure:
an operating system 7051, including various system programs, and configured to: implement various basic services and process hardware-based tasks; and
an application program module 7052, including various application programs, and configured to implement various application services.
In this embodiment of the present application, by invoking a program or an instruction stored in the memory 705, the processor 701 is configured to: determine N templates whose degrees of matching with a current template meet a preset condition, where the current template is a template corresponding to a current picture block, N is a positive integer, and the N templates are obtained by searching reference pictures of the current picture block; determine a weight of a pixel area in a picture block corresponding to each template in the N templates, where weights of at least two pixel areas in a picture block corresponding to at least one template in the N templates are different; and calculate a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates, where the pixel area includes at least one pixel.
For example, N may be equal to 1, 2, 3, 4, 5, 7, 9, 12, or another value.
The pixel area mentioned in each embodiment of the present application includes at least one pixel. That is, the pixel area may be a pixel or a pixel block, and if the pixel area is a pixel block, the pixel area may have, for example, a 2×2, 1×2, 1×3, 4×2, 4×3, or 4×4 size, or another size.
The templates in the N templates correspond to different picture blocks, and therefore, the N templates correspond to N picture blocks.
For example, the weight of the pixel area in the picture block corresponding to each template in the N templates may be related to a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block. A parameter about a similarity between two objects may be used to represent the similarity between the two objects. For example, a parameter about a similarity between a template Tm in the N templates and a pixel area w in a picture block m may be used to represent the similarity between the template Tm and the pixel area w in the picture block m. The picture block m is a picture block corresponding to the template Tm. The template Tm may be any template in the N templates. The pixel area w is any pixel area in the picture block m.
In a specific example, the parameter about the similarity between the template Tm and the pixel area w in the picture block m may be, for example, a distance between the pixel area w and a determined pixel area in the template Tm, or may be a ratio of an average or a weighted average of pixel values of a determined pixel area in the template Tm to a pixel value of the pixel area w or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in the template Tm and a pixel value of the pixel area w. That is, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm and the distance between the pixel area w and the determined pixel area in the template Tm. Alternatively, there may be a correspondence between a weight of the pixel area w in the picture block m corresponding to the template Tm in the N templates and the ratio of the average or the weighted average of the pixel values of the determined pixel area in the template Tm to the pixel value of the pixel area w or the absolute difference between the average or the weighted average of the pixel values of the determined pixel area in the template Tm and the pixel value of the pixel area w.
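The ratio and absolute-difference forms of the similarity parameter can be sketched as below. The helper names and the sample values are hypothetical, and the ratio follows the direction stated in this paragraph (template average divided by the pixel value of the pixel area w).

```python
def template_average(values, weights=None):
    """Plain or weighted average of the pixel values of the determined
    pixel area in the template."""
    if weights is None:
        return sum(values) / len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def similarity_ratio(template_values, pixel_value, weights=None):
    """Ratio of the template (weighted) average to the pixel value of w."""
    if pixel_value == 0:
        raise ValueError("ratio is undefined for a zero pixel value")
    return template_average(template_values, weights) / pixel_value

def similarity_abs_diff(template_values, pixel_value, weights=None):
    """Absolute difference between the template (weighted) average and
    the pixel value of w."""
    return abs(template_average(template_values, weights) - pixel_value)

region = [100, 110, 120, 130]  # average is 115
print(similarity_ratio(region, 92))     # 1.25
print(similarity_abs_diff(region, 92))  # 23.0
```

A ratio near 1 or an absolute difference near 0 indicates that the pixel area w is close in value to the determined pixel area of the template.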
Optionally, in some possible implementation manners of the present application, the determined pixel area in the foregoing template Tm may be, for example, an upper left pixel, an upper left pixel area, a lower left pixel, a lower left pixel area, an upper right pixel, an upper right pixel area, a lower right pixel, a lower right pixel area, a center pixel area, a center pixel, or another pixel area in the template Tm.
The upper left pixel area in the template Tm is the upper left pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper left pixel in the template Tm; the lower left pixel area in the template Tm is the lower left pixel in the template Tm or a pixel block that is in the template Tm and that includes the lower left pixel in the template Tm; the upper right pixel area in the template Tm is the upper right pixel in the template Tm or a pixel block that is in the template Tm and that includes the upper right pixel in the template Tm; the center pixel area in the template Tm is the center pixel in the template Tm or a pixel block that is in the template Tm and that includes the center pixel in the template Tm.
The distance between the pixel area w and the determined pixel area in the template Tm may be, for example, a distance between a pixel in the pixel area w and a pixel in the determined pixel area in the template Tm, where the two pixels are closest to each other, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between any pixel in the pixel area w and any pixel in the determined pixel area in the template Tm, or the distance between the pixel area w and the determined pixel area in the template Tm may be a distance between a determined pixel in the pixel area w (where the determined pixel in the pixel area w may be an upper left pixel, a lower left pixel, an upper right pixel, a lower right pixel, or a center pixel in the pixel area w) and a determined pixel in the determined pixel area in the template Tm (where the determined pixel in the template Tm is, for example, the upper left pixel, the lower left pixel, the upper right pixel, the lower right pixel, or the center pixel in the template Tm). Certainly, the distance between the pixel area w and the determined pixel area in the template Tm may be calculated in another manner.
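Two of the distance conventions described above, the closest pair of pixels and a fixed determined pixel in each area, can be compared with a short sketch. The helper names, the (row, column) coordinate layout, the choice of the upper left pixel as the determined pixel, and the Euclidean metric are all assumptions for this example.

```python
import math
from itertools import product

def block_pixels(top, left, height, width):
    """Coordinates (row, col) of a rectangular pixel area, listed row by row
    so that the first entry is the upper left pixel."""
    return [(top + r, left + c) for r in range(height) for c in range(width)]

def closest_pair_distance(area_a, area_b):
    """Distance between the two pixels, one from each area, that are
    closest to each other."""
    return min(math.dist(p, q) for p, q in product(area_a, area_b))

def determined_pixel_distance(area_a, area_b):
    """Distance between the upper left pixels of the two areas."""
    return math.dist(area_a[0], area_b[0])

area_w = block_pixels(top=0, left=3, height=2, width=2)          # in the block
template_area = block_pixels(top=0, left=-2, height=2, width=2)  # in the template
print(closest_pair_distance(area_w, template_area))      # 4.0
print(determined_pixel_distance(area_w, template_area))  # 5.0
```

The two conventions give different numbers for the same pair of areas, so whichever one is chosen should be applied consistently when the scaling factor intervals are defined.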
Optionally, in some possible implementation manners of the present application, determining the N templates whose degrees of matching with the current template meet the preset condition includes: determining N templates with a highest degree of matching with the current template; or determining M templates with a highest degree of matching with the current template, and determining, from the M templates, N templates that meet the preset condition, where N is less than M.
For example, that the processor 701 determines, from the M templates, the N templates that meet the preset condition includes: determining N templates from the M templates, where distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold. For example, the threshold is equal to average distortion between the pixel value of the current template and pixel values of the M templates, or the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates. Certainly, the threshold may be another preset value. It can be understood that the screening threshold is introduced to screen out templates with a relatively low matching degree, so that these templates are not involved in the operation, prediction accuracy is improved, and operation complexity is reduced.
There may be diverse manners of determining, by the processor 701, the weight of the pixel area in the picture block corresponding to each template in the N templates.
For example, that the processor 701 determines the weight of the pixel area in the picture block corresponding to each template in the N templates may include: determining a weight of each template in the N templates according to a degree of matching between each template in the N templates and the current template; and determining, based on the weight of each template in the N templates and a parameter about a similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates.
Optionally, in some possible implementation manners of the present application, the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block may include: a distance between a determined pixel area in each template in the N templates and the pixel area in the corresponding picture block; and/or a ratio of an average or a weighted average of pixel values of a determined pixel area in each template in the N templates to the pixel value of the pixel area in the corresponding picture block or an absolute difference between an average or a weighted average of pixel values of a determined pixel area in each template in the N templates and the pixel value of the pixel area in the corresponding picture block.
Optionally, in some possible implementation manners of the present application, the processor 701 may determine the weight of each template in the N templates based on the following formula and according to the degree of matching between each template in the N templates and the current template:
$$w_m = a^{-\left(E(T_x, T_m)/S\right)\times\sigma},$$
where
$E(T_x, T_m)$ represents distortion between the current template Tx and a template Tm in the N templates, S represents a quantity of pixel areas in the current template Tx, σ represents a template scaling factor, a and σ are real numbers greater than 0, $w_m$ represents a weight of the template Tm, and the template Tm is any template in the N templates. When a value of a is, for example, an integer such as 2, 3, or 4, operation complexity is simplified.
Optionally, in some possible implementation manners of the present application, the processor 701 may determine, based on the following formula and based on the weight of each template in the N templates and the parameter about the similarity between each template in the N templates and the pixel area in the corresponding picture block, the weight of the pixel area in the picture block corresponding to each template in the N templates:
$$w_m(i,j) = (w_m)^{-1/\partial(R(i,j))},$$
where
$w_m(i,j)$ represents a weight of a pixel area whose coordinates are (i,j) and that is in a picture block m corresponding to the template Tm in the N templates, R(i,j) represents a parameter about a similarity between the pixel area whose coordinates are (i,j) and that is in the picture block m and the template Tm, and ∂(R(i,j)) represents a pixel area scaling factor corresponding to R(i,j).
Optionally, in some possible implementation manners of the present application, there is a linear relationship or a non-linear relationship between ∂(R(i,j)) and R(i,j).
For example, the linear relationship between ∂(R(i,j)) and R(i,j) is:
∂(R(i,j))=β*R(i,j),
where β is a scaling coefficient, and β is a real number greater than 0.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) is: a value of ∂(R(i,j)) is determined based on a distance interval within which R(i,j) falls, and different distance intervals correspond to pixel area scaling factors with different values.
For example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2 \end{cases},$$
where
a1 is less than a2, a2 is less than a3, and b1 is less than b2; and a1, a2, a3, b1, and b2 are real numbers greater than 0.
For example, a1=0.3, a2=0.5, and a3=1. For example, b1=4, and b2=8.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5 \end{cases},$$
where
a4 is less than a5, a5 is less than a6, a6 is less than a7, b3 is less than b4, and b4 is less than b5; and a4, a5, a6, a7, b3, b4, and b5 are real numbers greater than 0.
For example, a4=0.3, a5=0.5, a6=0.8, and a7=1. For example, b3=4, b4=5, and b5=6.
For another example, the non-linear relationship between ∂(R(i,j)) and R(i,j) may be:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6 \end{cases},$$
where
a8 is less than a9, and a8, a9, and b6 are real numbers greater than 0.
For example, a8=0.5, and a9=1. For example, b6=5.
The foregoing examples show that the non-linear relationship between ∂(R(i,j)) and R(i,j) may be a piecewise function relationship. A piecewise quantity of a piecewise function may not be limited to 2, 3, or 4 shown in the foregoing examples, and certainly, may be larger.
Optionally, in some possible implementation manners of the present application, R(i,j) may be, for example, equal to d(i,j) or e(i,j), where d(i,j) represents a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a determined pixel area in the template Tm, and e(i,j) represents a ratio of a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm, or an absolute difference between a pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
For example, d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and an upper left pixel in the template Tm, or d(i,j) may represent a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and a pixel y in the template Tm; and the distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are (i,j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
Optionally, in some possible implementation manners of the present application, the processor 701 calculates the predicted pixel value of the pixel area in the current picture block by using the following formula and based on the weight and the pixel value of the pixel area in the picture block corresponding to each template in the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N} w_m(i,j)\, p_m(i,j)}{\sum_{m=1}^{N} w_m(i,j)},$$
where
$w_m(i,j)$ represents the weight of the pixel area whose coordinates are (i,j) and that is in the picture block m corresponding to the template Tm in the N templates, pm(i,j) represents the pixel value of the pixel area whose coordinates are (i,j) and that is in the picture block m, and pre(i,j) represents a predicted pixel value of a pixel area whose coordinates are (i,j) and that is in the current picture block.
It can be understood that functions of function modules of the picture prediction apparatus 700 in this embodiment may be specifically implemented according to the methods in the foregoing method embodiments. For a specific implementation process, refer to related descriptions in the foregoing method embodiments. Details are not described herein again. The picture prediction apparatus 700 may be any apparatus that needs to output or play a video, for example, a device such as a laptop computer, a tablet computer, a personal computer, or a mobile phone.
It can be learned that, after determining N templates whose degrees of matching with a current template meet a preset condition, the picture prediction apparatus 700 in this embodiment determines a weight of a pixel area in a picture block corresponding to each template in the N templates, and calculates a predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block corresponding to each template in the N templates. The pixel area includes at least one pixel. Weights of at least two pixel areas in a picture block corresponding to at least one template in the determined N templates are different, that is, weights of at least two pixel areas in a picture block corresponding to a same template are different. Therefore, a conventional-art solution in which weights of all pixel areas in picture blocks corresponding to all templates are the same (where the weights are all equal to 1) is abandoned. Because there is a specific difference between weights of pixel areas, this is more likely to conform to an actual correlation difference. In this way, a pixel value of a pixel area in a current picture block is accurately predicted, and video coding and decoding efficiency are improved.
It should be noted that, to make the descriptions brief, the foregoing method embodiments are expressed as a series of actions. However, persons skilled in the art should appreciate that the present application is not limited to the described action sequence, because according to the present application, some steps may be performed in other sequences or performed simultaneously. In addition, persons skilled in the art should also appreciate that all the embodiments described in the specification are implementable embodiments, and the related actions and modules are not necessarily mandatory to the present application.
In the foregoing embodiments, all the embodiments have respective focuses of description. For a part that is not described in detail in an embodiment, refer to related descriptions in other embodiments.
In the embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the described apparatus embodiment is merely an example. For example, the unit division is merely logical function division and may be other division in actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions in the embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in a form of hardware, or may be implemented in a form of a software functional unit.
When the foregoing integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions in the present application essentially, or the part contributing to the prior art, or all or some of the technical solutions may be implemented in the form of a software product. The software product is stored in a storage medium and includes instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or some of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The foregoing embodiments are merely intended for describing the technical solutions in the present application, but not for limiting the present application. Although the present application is described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still make modifications to the technical solutions described in the foregoing embodiments or make equivalent replacements to some technical features thereof, without departing from the spirit and scope of the technical solutions in the embodiments of the present application.

Claims (15)

What is claimed is:
1. A picture prediction method implemented by a processor, comprising:
determining M templates with a highest degree of matching with a current template;
determining, from the M templates, N templates whose degrees of matching with the current template meet a preset condition, wherein the current template is a template that corresponds to a current picture block, wherein N is a positive integer, wherein the N templates are obtained from reference pictures of the current picture block, and wherein N is less than M;
determining a weight of a pixel area in a picture block that spatially corresponds to a highest matching candidate from within the N templates, wherein determining the weight of the pixel area in the picture block further comprises:
determining a weight of each of the N templates based on the formula
$$w_m = a^{-\left(E(T_x, T_m)/S\right)\times\sigma}$$
and according to a degree of matching between each of the N templates and the current template Tx, wherein $E(T_x, T_m)$ represents distortion between the current template Tx and a template Tm in the N templates, wherein S represents a quantity of pixel areas in the current template Tx, wherein σ represents a template scaling factor, wherein a and σ are real numbers greater than 0, wherein $w_m$ represents a weight of the template Tm, and wherein the template Tm is any template in the N templates; and
determining, based on the weight of each of the N templates and a parameter about a similarity between each of the N templates and the pixel area in a corresponding picture block, the weight of the pixel area in a picture that corresponds to each of the N templates according to the formula w_m(i,j) = w_m −1/∂(R(i,j)), wherein w_m(i,j) represents a weight of a pixel area whose coordinates are (i, j) that is in a picture block m that corresponds to the template Tm, wherein R(i, j) represents a parameter about a similarity between the pixel area whose coordinates are (i, j) that is in the picture block m and the template Tm, wherein ∂(R(i, j)) represents a pixel area scaling factor that corresponds to the R(i, j), wherein a value of ∂(R(i, j)) is based on a distance interval within which the R(i, j) falls, wherein the ∂(R(i, j)) comprises either a linear relationship to the R(i, j) or a non-linear relationship to the R(i, j), wherein the linear relationship is β*R(i,j), wherein the non-linear relationship comprises the value of ∂(R(i, j)), and wherein β is a scaling coefficient and a real number greater than 0; and
calculating a predicted pixel value of a pixel area in the current picture block based on a weighted pixel value of the pixel area in the picture block that corresponds to the template comprising a highest degree of matching from within the N templates, wherein the pixel area comprises at least one pixel.
2. The method of claim 1, wherein the parameter about the similarity between each of the N templates and the pixel area in the corresponding picture block comprises either:
a distance between a determined pixel area in each of the N templates and the pixel area in the corresponding picture block;
a ratio of an average of pixel values of a determined pixel area in each of the N templates to a pixel value of the pixel area in the corresponding picture block;
a weighted average of the pixel values of the determined pixel area in each of the N templates to the pixel value of the pixel area in the corresponding picture block;
an absolute difference between an average of the pixel values of the determined pixel area in each of the N templates and the pixel value of the pixel area in the corresponding picture block; or
a weighted average of the pixel values of the determined pixel area in each of the N templates and the pixel value of the pixel area in the corresponding picture block.
3. The method of claim 1, wherein the different distance intervals correspond to pixel area scaling factors with different values.
4. The method of claim 3, wherein (a) the non-linear relationship between ∂(R(i, j)) and R(i, j) is:
$$\partial(d(i,j)) = \begin{cases} a_1, & d(i,j) \le b_1 \\ a_2, & b_1 < d(i,j) \le b_2 \\ a_3, & d(i,j) > b_2, \end{cases}$$
wherein a1 is less than a2, wherein the a2 is less than a3, wherein b1 is less than b2; and wherein the a1, the a2, the a3, the b1, and the b2 are real numbers greater than 0,
(b) the non-linear relationship between the ∂(R(i, j)) and the R(i, j) is:
$$\partial(d(i,j)) = \begin{cases} a_4, & d(i,j) \le b_3 \\ a_5, & b_3 < d(i,j) \le b_4 \\ a_6, & b_4 < d(i,j) \le b_5 \\ a_7, & d(i,j) > b_5, \end{cases}$$
wherein a4 is less than a5, wherein the a5 is less than a6, wherein the a6 is less than a7, wherein the b3 is less than b4, wherein the b4 is less than b5; and wherein the a4, the a5, the a6, the a7, the b3, the b4, and the b5 are real numbers greater than 0, or
(c) the non-linear relationship between ∂(R(i, j)) and R(i, j) is:
$$\partial(d(i,j)) = \begin{cases} a_8, & d(i,j) \le b_6 \\ a_9, & d(i,j) > b_6, \end{cases}$$
wherein a8 is less than a9, and wherein a8, a9, and b6 are real numbers greater than 0.
5. The method of claim 1, wherein R(i,j) is equal to either d(i, j) or e(i, j), wherein the d(i, j) represents a distance between the pixel area whose coordinates are (i, j) and that is in the picture block m and a determined pixel area in the template Tm, and wherein the e(i, j) represents either:
a ratio of a pixel value of the pixel area whose coordinates are (i, j) and that is in the picture block m to an average pixel value or a weighted average pixel value of the template Tm; or
an absolute difference between a pixel value of the pixel area whose coordinates are the (i, j) and that is in the picture block m and an average pixel value or a weighted average pixel value of the template Tm.
6. The method of claim 5, wherein d(i, j) represents either (a) a distance between the pixel area whose coordinates are (i, j) and that is in the picture block m and an upper left pixel in the template Tm, or (b) d(i, j) represents a distance between the pixel area whose coordinates are the (i, j) and that is in the picture block m and a pixel y in the template Tm and wherein the distance between the pixel area whose coordinates are the (i,j) and that is in the picture block m and the pixel y in the template Tm is less than or equal to a distance between the pixel area whose coordinates are the (i, j) and that is in the picture block m and any pixel in the template Tm except the pixel y.
7. The method of claim 1, wherein calculating the predicted pixel value of a pixel area in the current picture block based on the weight and a pixel value of the pixel area in the picture block that corresponds to each of the N templates comprises calculating the predicted pixel value of the pixel area in the current picture block using the following formula and based on the weight and the pixel value of the pixel area in the picture block that corresponds to each of the N templates:
$$\mathrm{pre}(i,j) = \frac{\sum_{m=1}^{N} w_m(i,j)\, p_m(i,j)}{\sum_{m=1}^{N} w_m(i,j)},$$
wherein w_m(i,j) represents the weight of the pixel area whose coordinates are (i,j) and that is in a picture block m that corresponds to the template Tm in the N templates, wherein pm(i, j) represents the pixel value of the pixel area whose coordinates are the (i,j) and that is in the picture block m, and wherein the pre(i, j) represents a predicted pixel value of a pixel area whose coordinates are the (i, j) and that is in the current picture block.
8. The method of claim 1, wherein determining, from the M templates, the N templates that meet the preset condition comprises determining the N templates from the M templates, wherein distortion between pixel values of the N templates and a pixel value of the current template is less than or equal to a threshold.
9. The method of claim 8, wherein either (a) the threshold is equal to average distortion between a pixel value of the current template and pixel values of the M templates or (b) the threshold is equal to an adjustment value of average distortion between the pixel value of the current template and pixel values of the M templates.
10. The method of claim 1, wherein the picture prediction method is applied in a video coding process or in a video decoding process.
11. A picture prediction apparatus, comprising:
a memory comprising instructions; and
a processor coupled to the memory and configured to execute the instructions, wherein the instructions cause the processor to:
determine M templates with a highest degree of matching with a current template;
determine, from the M templates, N templates whose degrees of matching with the current template meet a preset condition, wherein the current template is a template that corresponds to a current picture block, wherein N is a positive integer, wherein the N templates are obtained from reference pictures of the current picture block, and wherein N is less than M;
determine a weight of a pixel area in a picture block that spatially corresponds to a highest matching candidate from within the N templates, wherein determining the weight of the pixel area in the picture block further comprises:
determining a weight of each of the N templates based on the formula
$$w_m = a^{-(E(T_x, T_m)/S) \times \sigma}$$
and according to a degree of matching between each of the N templates and the current template Tx, wherein E(Tx, Tm) represents distortion between the current template Tx and a template Tm in the N templates, wherein S represents a quantity of pixel areas in the current template Tx, wherein σ represents a template scaling factor, wherein a and σ are real numbers greater than 0, wherein w_m represents a weight of the template Tm, and wherein the template Tm is any template in the N templates; and
determining, based on the weight of each of the N templates and a parameter about a similarity between each of the N templates and the pixel area in a corresponding picture block, the weight of the pixel area in a picture that corresponds to each of the N templates according to the formula w_m(i,j) = w_m −1/∂(R(i,j)), wherein w_m(i,j) represents a weight of a pixel area whose coordinates are (i, j) that is in a picture block m that corresponds to the template Tm, wherein R(i, j) represents a parameter about a similarity between the pixel area whose coordinates are (i, j) that is in the picture block m and the template Tm, wherein ∂(R(i, j)) represents a pixel area scaling factor that corresponds to the R(i, j), wherein a value of ∂(R(i, j)) is based on a distance interval within which the R(i, j) falls, wherein the ∂(R(i, j)) comprises either a linear relationship to the R(i, j) or a non-linear relationship to the R(i, j), wherein the linear relationship is β*R(i,j), wherein the non-linear relationship comprises the value of ∂(R(i, j)), and wherein β is a scaling coefficient and a real number greater than 0; and
calculate a predicted pixel value of a pixel area in the current picture block based on a weighted pixel value of the pixel area in the picture block that corresponds to the template comprising a highest degree of matching from within the N templates, wherein the pixel area comprises at least one pixel.
12. The picture prediction apparatus of claim 11, wherein the parameter about the similarity between each of the N templates and the pixel area in the corresponding picture block comprises either:
a distance between a determined pixel area in each of the N templates and the pixel area in the corresponding picture block;
a ratio of an average of pixel values of a determined pixel area in each of the N templates to a pixel value of the pixel area in the corresponding picture block;
a weighted average of the pixel values of the determined pixel area in each of the N templates to the pixel value of the pixel area in the corresponding picture block;
an absolute difference between an average of pixel values of the determined pixel area in each of the N templates and the pixel value of the pixel area in the corresponding picture block; or
a weighted average of the pixel values of the determined pixel area in each of the N templates and the pixel value of the pixel area in the corresponding picture block.
13. The picture prediction apparatus of claim 11, wherein the picture prediction apparatus is applied to a video coding apparatus or to a video decoding apparatus.
14. The picture prediction apparatus of claim 11, wherein weighted values of at least two different pixel areas in a picture block that corresponds to at least one of the N templates are different.
15. The method of claim 1, wherein weighted values of at least two different pixel areas in a picture block that corresponds to at least one of the N templates are different.
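The selection, weighting, and prediction steps recited in claims 1, 4(c), 7, 8, and 9(a) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The template weight is read here as w_m = a^(−(E/S) × σ). The function names, the default parameter values (a = 2, σ = 1, a8 = 1, a9 = 2, b6 = 2), and the sample data are hypothetical. For brevity, every pixel area in picture block m is given the weight w_m, so the per-pixel adjustment by the scaling factor ∂(R(i, j)) is shown as a standalone function but is not combined with w_m.

```python
import numpy as np


def select_templates(distortions):
    """Claims 8 and 9(a): from the M candidates, keep those whose distortion
    against the current template is <= the average distortion."""
    distortions = np.asarray(distortions, dtype=float)
    return np.flatnonzero(distortions <= distortions.mean())


def template_weight(distortion, num_pixel_areas, a=2.0, sigma=1.0):
    """Claim 1 template weight, read here (an assumption) as
    w_m = a ** (-(E / S) * sigma)."""
    return a ** (-(distortion / num_pixel_areas) * sigma)


def scaling_factor(d, a8=1.0, a9=2.0, b6=2.0):
    """Claim 4(c): two-level step function of the distance d(i, j).
    The values of a8, a9 and b6 are hypothetical."""
    return np.where(np.asarray(d) <= b6, a8, a9)


def predict(weights, blocks):
    """Claim 7: pre(i, j) = sum_m w_m(i, j) * p_m(i, j) / sum_m w_m(i, j).
    Both arguments have shape (N, height, width)."""
    weights = np.asarray(weights, dtype=float)
    blocks = np.asarray(blocks, dtype=float)
    return (weights * blocks).sum(axis=0) / weights.sum(axis=0)


# Hypothetical data: M = 4 candidate templates, each with S = 8 pixel areas.
distortions = np.array([8.0, 16.0, 40.0, 64.0])  # E(Tx, Tm) per candidate
kept = select_templates(distortions)             # average is 32, so N = 2
w = template_weight(distortions[kept], 8)        # one weight per kept template

# Simplification (not in the claims): each pixel in block m gets weight w_m.
blocks = np.stack([np.full((2, 2), 100.0), np.full((2, 2), 40.0)])
weights = w[:, None, None] * np.ones_like(blocks)
pre = predict(weights, blocks)
print(kept, w, pre)
```

With these inputs the two retained candidates receive weights 0.5 and 0.25, and each predicted pixel value is (0.5 × 100 + 0.25 × 40) / 0.75 = 80.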
US15/454,356 2014-10-31 2017-03-09 Picture prediction method and related apparatus Active 2035-11-12 US10536692B2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201410606914.2A CN104363449B (en) 2014-10-31 2014-10-31 Image prediction method and relevant apparatus
CN201410606914 2014-10-31
CN201410606914.2 2014-10-31
PCT/CN2015/077272 WO2016065872A1 (en) 2014-10-31 2015-04-23 Image prediction method and relevant device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/077272 Continuation WO2016065872A1 (en) 2014-10-31 2015-04-23 Image prediction method and relevant device

Publications (2)

Publication Number Publication Date
US20170180727A1 US20170180727A1 (en) 2017-06-22
US10536692B2 US10536692B2 (en) 2020-01-14

Family

ID=52530671

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/454,356 Active 2035-11-12 US10536692B2 (en) 2014-10-31 2017-03-09 Picture prediction method and related apparatus

Country Status (6)

Country Link
US (1) US10536692B2 (en)
EP (1) EP3177013B1 (en)
JP (1) JP6387582B2 (en)
KR (1) KR102005007B1 (en)
CN (1) CN104363449B (en)
WO (1) WO2016065872A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10427978B2 (en) 2008-04-22 2019-10-01 United States Gypsum Company Coated building panels and articles containing calcium sulfate hemihydrate
CN104363449B (en) 2014-10-31 2017-10-10 华为技术有限公司 Image prediction method and relevant apparatus
US10397569B2 (en) 2016-06-03 2019-08-27 Mediatek Inc. Method and apparatus for template-based intra prediction in image and video coding
CN107770525B (en) * 2016-08-15 2020-07-24 华为技术有限公司 Image coding method and device
CN106530275B (en) * 2016-10-11 2019-06-11 广州视源电子科技股份有限公司 Method and system for detecting wrong component
NL2020788B1 (en) 2017-04-21 2019-06-26 China Petroleum & Chem Corp Apparatus and Method for Treating Waste Water Containing Ammonium Salts
CN110248188A (en) * 2018-03-07 2019-09-17 华为技术有限公司 Predicted motion vector generation method and relevant device
EP3777167A1 (en) 2018-03-30 2021-02-17 Vid Scale, Inc. Template-based inter prediction techniques based on encoding and decoding latency reduction
US11956460B2 (en) * 2018-08-31 2024-04-09 Hulu, LLC Selective template matching in video coding
CN114424533A (en) * 2019-09-27 2022-04-29 Oppo广东移动通信有限公司 Method for determining predicted value, decoder and computer storage medium
WO2024034861A1 (en) * 2022-08-09 2024-02-15 현대자동차주식회사 Method and device for video coding using template-based prediction
WO2024210624A1 (en) * 2023-04-06 2024-10-10 현대자동차주식회사 Image encoding/decoding method, device, and recording medium storing bitstreams

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6289052B1 (en) * 1999-06-07 2001-09-11 Lucent Technologies Inc. Methods and apparatus for motion estimation using causal templates
JP2007043651A (en) 2005-07-05 2007-02-15 Ntt Docomo Inc Dynamic image encoding device, dynamic image encoding method, dynamic image encoding program, dynamic image decoding device, dynamic image decoding method, and dynamic image decoding program
US7236634B2 (en) * 2003-02-04 2007-06-26 Semiconductor Technology Academic Research Center Image encoding of moving pictures
CN101557514A (en) 2008-04-11 2009-10-14 华为技术有限公司 Method, device and system for inter-frame predicting encoding and decoding
US20100246675A1 (en) * 2009-03-30 2010-09-30 Sony Corporation Method and apparatus for intra-prediction in a video encoder
US20110170793A1 (en) * 2008-09-24 2011-07-14 Kazushi Sato Image processing apparatus and method
US20110176741A1 (en) * 2008-09-24 2011-07-21 Kazushi Sato Image processing apparatus and image processing method
EP2571272A1 (en) * 2007-04-09 2013-03-20 NTT DoCoMo, Inc. Image coding using template matching
EP2627086A1 (en) * 2012-02-10 2013-08-14 Thomson Licensing Method and device for encoding a block of an image and corresponding reconstructing method and device
US20140056348A1 (en) * 2011-03-14 2014-02-27 Thomson Licensing Methods and device for reconstructing and coding an image block
EP2704442A1 (en) * 2009-07-02 2014-03-05 Qualcomm Incorporated Template matching for video coding
CN103700115A (en) 2012-09-27 2014-04-02 中国航天科工集团第二研究院二O七所 Correlation matching tracking method of moving target in complex background
US20140169541A1 (en) 1997-04-03 2014-06-19 At&T Intellectual Property I, L.P. Profile management system including user interface for accessing and maintaining profile data of user subscribed telephony services
US20140169451A1 (en) * 2012-12-13 2014-06-19 Mitsubishi Electric Research Laboratories, Inc. Perceptually Coding Images and Videos
CN104363449A (en) 2014-10-31 2015-02-18 华为技术有限公司 Method and relevant device for predicting pictures
US20150334417A1 (en) * 2012-12-18 2015-11-19 Friedrich-Alexander-Universität Erlangen-Nürnberg Coding a Sequence of Digital Images

Patent Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140169541A1 (en) 1997-04-03 2014-06-19 At&T Intellectual Property I, L.P. Profile management system including user interface for accessing and maintaining profile data of user subscribed telephony services
US6289052B1 (en) * 1999-06-07 2001-09-11 Lucent Technologies Inc. Methods and apparatus for motion estimation using causal templates
US7236634B2 (en) * 2003-02-04 2007-06-26 Semiconductor Technology Academic Research Center Image encoding of moving pictures
US20090116759A1 (en) * 2005-07-05 2009-05-07 Ntt Docomo, Inc. Video encoding device, video encoding method, video encoding program, video decoding device, video decoding method, and video decoding program
JP2007043651A (en) 2005-07-05 2007-02-15 Ntt Docomo Inc Dynamic image encoding device, dynamic image encoding method, dynamic image encoding program, dynamic image decoding device, dynamic image decoding method, and dynamic image decoding program
US20120320976A1 (en) * 2005-07-05 2012-12-20 Ntt Docomo, Inc. Video encoding device, video encoding method, video encoding program, video decoding device, video decoding method, and video decoding program
EP2571272A1 (en) * 2007-04-09 2013-03-20 NTT DoCoMo, Inc. Image coding using template matching
CN101557514A (en) 2008-04-11 2009-10-14 华为技术有限公司 Method, device and system for inter-frame predicting encoding and decoding
US20140105298A1 (en) * 2008-04-11 2014-04-17 Huawei Technologies Co., Ltd. Inter-frame prediction coding method, device and system
US20100272183A1 (en) * 2008-04-11 2010-10-28 Huawei Technologies Co., Ltd. Inter-frame prediction coding method, device and system
JP2011511541A (en) 2008-04-11 2011-04-07 華為技術有限公司 Method, apparatus, and system for interframe predictive coding
US20110170793A1 (en) * 2008-09-24 2011-07-14 Kazushi Sato Image processing apparatus and method
US20110176741A1 (en) * 2008-09-24 2011-07-21 Kazushi Sato Image processing apparatus and image processing method
US20100246675A1 (en) * 2009-03-30 2010-09-30 Sony Corporation Method and apparatus for intra-prediction in a video encoder
EP2237217A2 (en) 2009-03-30 2010-10-06 Sony Corporation Method and apparatus for intra-prediction
CN101854545A (en) 2009-03-30 2010-10-06 索尼公司 The method of intra-prediction and the equipment that are used for video encoder
EP2704442A1 (en) * 2009-07-02 2014-03-05 Qualcomm Incorporated Template matching for video coding
US20140056348A1 (en) * 2011-03-14 2014-02-27 Thomson Licensing Methods and device for reconstructing and coding an image block
EP2627086A1 (en) * 2012-02-10 2013-08-14 Thomson Licensing Method and device for encoding a block of an image and corresponding reconstructing method and device
CN103700115A (en) 2012-09-27 2014-04-02 中国航天科工集团第二研究院二O七所 Correlation matching tracking method of moving target in complex background
US20140169451A1 (en) * 2012-12-13 2014-06-19 Mitsubishi Electric Research Laboratories, Inc. Perceptually Coding Images and Videos
US20150334417A1 (en) * 2012-12-18 2015-11-19 Friedrich-Alexander-Universität Erlangen-Nürnberg Coding a Sequence of Digital Images
CN104363449A (en) 2014-10-31 2015-02-18 华为技术有限公司 Method and relevant device for predicting pictures

Non-Patent Citations (15)

* Cited by examiner, † Cited by third party
Title
Chinese Office Action dated Jan. 25, 2017 in corresponding Chinese Patent Application No. 201410606914.2.
Eugen Wige et al.: "Pixel based Averaging Predictor for HEVC Lossless Coding" Imaging and Computer Vision, Siemens Corporate Technology, Munich Germany, IEEE, 2013.
Extended European Search Report dated Jul. 10, 2017 in corresponding European Patent Application No. 15855669.6.
Guionnet T et al.: "Intra prediction based on weighted template matching predictors (WTM)" 7. JCT-VC Meeting; 98. MPEG Meeting; Geneva (Joint Collaborative Team on Video Coding of ISO/IEC JTC1/SC29/WG11 and ITU-T SG. 16); Nov. 9, 2011, XP030110582.
International Search Report dated Aug. 10, 2015 in corresponding International Application No. PCT/CN2015/077272.
International Search Report dated Aug. 10, 2015 in corresponding International Patent Application No. PCT/CN2015/077272.
ITU: "Series H: Audiovisual and Multimedia Systems Infrastructure of audiovisual services—Coding of moving video" H.264, ITU-T, Telecommunication Standardization Sector of ITU, Feb. 2014.
Japanese Office Action dated Apr. 24, 2018, in corresponding Japanese Patent Application No. 2017-517280, 5 pgs.
T GUIONNET, L GUILLO: "Non-CE6: Intra prediction based on weighted template matching predictors (WTM)", 7. JCT-VC MEETING; 98. MPEG MEETING; 21-11-2011 - 30-11-2011; GENEVA; (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16); URL: HTTP://WFTP3.ITU.INT/AV-ARCH/JCTVC-SITE/, 9 November 2011 (2011-11-09), XP030110582
Thomas Guionnet et al., "Intra Prediction Based on Weighted Template Matching Predictors (WTM)"; JCTVC-G598; Geneva CH, Nov. 21-30, 2011 (Year: 2011). *
Thomas Guionnet et al., Intra Prediction Based on Weighted Template Matching Predictors (WTM); JCTVC-G598; Geneva, CH 2011 (Year: 2011). *
Yoshinori Suzuki et al., "Inter Frame Coding With Template Matching Averaging"; © 2007 IEEE: 1-4244-1437-7 (Year: 2007). *
Yoshinori Suzuki et al.: "An Improved Low Delay Inter Frame Coding Using Template Matching Averaging", Picture Coding Symposium 2010; Dec. 8, 2010, XP030082006.
YOSHINORI SUZUKI, CHOONG SENG BOON (NTT DOCOMO, INC., JAPAN): "An Improved Low Delay Inter Frame Coding Using Template Matching Averaging", PICTURE CODING SYMPOSIUM 2010; 8-12-2010 - 10-12-2010; NAGOYA, 8 December 2010 (2010-12-08), XP030082006
Y-W Chen et al.: "MB Mode with Joint Application of Template and Block Motion Compensations" 2. JCT-VC Meeting; Jul. 21, 2010-Jul. 28, 2010; Geneva; (Joint Collaborative Team on Video Coding of ISO/IEC JTC1/SC29/WG11 and ITU-T SG. 16) No. JCTVC-B072, Jul. 23, 2010.

Also Published As

Publication number Publication date
WO2016065872A1 (en) 2016-05-06
EP3177013A1 (en) 2017-06-07
KR20170045270A (en) 2017-04-26
BR112017006017A2 (en) 2018-06-26
EP3177013B1 (en) 2022-05-11
JP6387582B2 (en) 2018-09-12
EP3177013A4 (en) 2017-08-09
CN104363449A (en) 2015-02-18
KR102005007B1 (en) 2019-07-29
CN104363449B (en) 2017-10-10
JP2017536725A (en) 2017-12-07
US20170180727A1 (en) 2017-06-22

Similar Documents

Publication Publication Date Title
US10536692B2 (en) Picture prediction method and related apparatus
US11968386B2 (en) Picture prediction method and related apparatus
US11240529B2 (en) Picture prediction method and picture prediction apparatus
US11178419B2 (en) Picture prediction method and related apparatus
US20210006818A1 (en) Picture prediction method and related apparatus
KR102059066B1 (en) Motion vector field coding method and decoding method, and coding and decoding apparatuses

Legal Events

Date Code Title Description
AS Assignment

Owner name: HUAWEI TECHNOLOGIES CO., LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HUANG, XIN;ZHANG, HONG;YANG, HAITAO;SIGNING DATES FROM 20170223 TO 20170303;REEL/FRAME:041558/0059

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: AWAITING TC RESP, ISSUE FEE PAYMENT RECEIVED

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4