
WO2024235610A1 - Inferring motion vector predictor based on motion vector difference - Google Patents


Info

Publication number
WO2024235610A1
WO2024235610A1 PCT/EP2024/061521 EP2024061521W WO2024235610A1 WO 2024235610 A1 WO2024235610 A1 WO 2024235610A1 EP 2024061521 W EP2024061521 W EP 2024061521W WO 2024235610 A1 WO2024235610 A1 WO 2024235610A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion information
list
current block
predictor
difference
Application number
PCT/EP2024/061521
Other languages
French (fr)
Inventor
Fabrice Le Leannec
Tangi POIRIER
Pascal Le Guyadec
Antoine Robert
Franck Galpin
Original Assignee
Interdigital Ce Patent Holdings, Sas
Application filed by Interdigital Ce Patent Holdings, Sas
Publication of WO2024235610A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/513: Processing of motion vectors
    • H04N19/517: Processing of motion vectors by encoding
    • H04N19/52: Processing of motion vectors by encoding by predictive encoding
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H04N19/137: Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139: Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock

Definitions

  • pictures of the video content are divided into blocks of samples (i.e., pixels), these blocks being then partitioned into one or more sub-blocks, called original sub-blocks in the following.
  • An intra or inter prediction is then applied to each sub-block to exploit intra or inter image correlations.
  • a predictor sub-block is determined for each original sub-block.
  • a sub-block representing a difference between the original sub-block and the predictor sub-block often denoted as a prediction error sub-block, a prediction residual sub-block or simply a residual sub-block, is transformed, quantized and entropy coded to generate an encoded video stream.
  • the compressed data is decoded by inverse processes corresponding to the transform, quantization and entropic coding.
  • Inter prediction consists in predicting a current block of a current picture from at least one predictor block from a reference picture preceding or following the current picture.
  • a predictor block is identified in the reference picture by motion information.
  • motion information is now predicted using methods such as Adaptive Motion Vector Prediction (AMVP) or merge.
  • AMVP Adaptive Motion Vector Prediction
  • these methods comprise a construction of a list of motion vector predictors (MVPs), a selection of a best MVP in the list and an encoding of the motion vector of the current block in the form of an index representing the selected MVP along with, in some cases, a motion difference between the selected MVP and the actual motion vector of the current block.
  • MVPs motion vector predictors
  • one or more of the present embodiments provide a method comprising: decoding a motion information difference for a current block from video data; applying a validity checking process comprising determining from a criterion calculated from the motion information difference and motion information predictors of a list of motion information predictors of the current block that a motion information predictor of the list and the motion information difference form an invalid combination; inferring a value representing the motion information predictor of the list to apply to predict a motion information of the current block responsive to the motion information difference and one motion information predictor of the list form an invalid combination; decoding the value representing the motion information predictor of the list to apply to predict the motion information of the current block from the video data responsive to the combination of each motion information predictor of the list and the motion information is valid; and, reconstructing the motion information of the current block as a sum of the motion information difference and the motion information predictor of the list identified by the value.
  • one or more of the present embodiments provide a method comprising: obtaining a first motion information predictor of a list of motion information predictors of a current block of a picture for predicting motion information of a current block; calculating a motion information difference between the motion information of the current block and the first motion information predictor; encoding the motion information difference in video data; and, applying a validity checking process comprising determining from a criterion calculated from the motion information difference and the motion information predictors of the list that a second motion information predictor of the list and the motion information difference form a valid combination; and, encoding a value representing the first motion information predictor only responsive to the second motion information predictor and the motion information difference form a valid combination.
  • the list is a list of AMVP motion information predictors or a list of motion information predictors according to the MMVD mode.
  • the method is applied successively to each reference picture used for a temporal prediction of the current block.
  • the method is applied to the two first motion information predictors of the list.
  • the motion information predictors of the list are derived from a coding tree unit or a virtual processing decoding unit different respectively from the coding tree unit or a virtual processing decoding unit comprising the current block.
  • the value representing the motion information predictor of the list to apply to predict the motion information of the current block is a parity bit of a syntax element representing the motion information difference.
  • one or more of the present embodiments provide a device comprising electronic circuitry configured for applying a process comprising: decoding a motion information difference for a current block from video data; applying a validity checking process comprising determining from a criterion calculated from the motion information difference and motion information predictors of a list of motion information predictors of the current block that a motion information predictor of the list and the motion information difference form an invalid combination; inferring a value representing the motion information predictor of the list to apply to predict a motion information of the current block responsive to the motion information difference and one motion information predictor of the list form an invalid combination; decoding the value representing the motion information predictor of the list to apply to predict the motion information of the current block from the video data responsive to the combination of each motion information predictor of the list and the motion information is valid; and, reconstructing the motion information of the current block as a sum of the motion information difference and the motion information predictor of the list identified by the value.
  • one or more of the present embodiments provide a device comprising electronic circuitry configured for applying a process comprising: obtaining a first motion information predictor of a list of motion information predictors of a current block of a picture for predicting motion information of a current block; calculating a motion information difference between the motion information of the current block and the first motion information predictor; encoding the motion information difference in video data; applying a validity checking process comprising determining from a criterion calculated from the motion information difference and the motion information predictors of the list that a second motion information predictor of the list and the motion information difference form a valid combination; and, encoding a value representing the first motion information predictor only responsive to the second motion information predictor and the motion information difference form a valid combination.
  • the list is a list of AMVP motion information predictors or a list of motion information predictors according to the MMVD mode.
  • the electronic circuitry responsive to the list is a list of AMVP motion information predictors, is configured to apply the process successively to each reference picture used for a temporal prediction of the current block.
  • the electronic circuitry responsive to the list is a list of motion information predictors according to the MMVD mode, is configured to apply the process to the two first motion information predictors of the list.
  • the motion information predictors of the list are derived from a coding tree unit or a virtual processing decoding unit different respectively from a coding tree unit or a virtual processing decoding unit comprising the current block.
  • the value representing the motion information predictor of the list to apply to predict the motion information of the current block is a parity bit of a syntax element representing the motion information difference.
  • one or more of the present embodiments provide a computer program comprising program code instructions for implementing the method according to the first or the second aspect.
  • one or more of the present embodiments provide a non-transitory information storage medium storing program code instructions for implementing the method according to the first or the second aspect.
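The decoding aspect summarized above can be read as a short control flow. The following is a minimal sketch under stated assumptions, not the method of the embodiments: the validity criterion is not specified in this section, so it is passed in as a hypothetical callable `is_valid(mvp, mvd, mvp_list)`; `read_index` is a hypothetical stand-in for entropy-decoding the predictor index; and the rule used to infer the index when a combination is invalid (take the first predictor that remains valid) is an illustrative choice.

```python
from typing import Callable, List, Tuple

MV = Tuple[int, int]  # (x, y) components of a motion vector


def reconstruct_mv(
    mvd: MV,
    mvp_list: List[MV],
    is_valid: Callable[[MV, MV, List[MV]], bool],  # hypothetical criterion
    read_index: Callable[[], int],                 # hypothetical bitstream read
) -> Tuple[MV, int, bool]:
    """Return (motion vector, predictor index, index_was_inferred)."""
    # Validity check of every (predictor, difference) combination.
    validity = [is_valid(mvp, mvd, mvp_list) for mvp in mvp_list]
    if all(validity):
        # Every combination is valid: the index is present in the video data.
        idx, inferred = read_index(), False
    else:
        # At least one combination is invalid: the index is inferred.
        # Assumption: pick the first valid predictor, or index 0 if none is.
        idx = validity.index(True) if any(validity) else 0
        inferred = True
    mvp = mvp_list[idx]
    # Reconstruction as the sum of the predictor and the difference.
    return (mvp[0] + mvd[0], mvp[1] + mvd[1]), idx, inferred


# Placeholder criteria, used only to exercise both branches.
first_is_invalid = lambda mvp, mvd, lst: mvp != lst[0]
always_valid = lambda mvp, mvd, lst: True

print(reconstruct_mv((6, 0), [(0, 0), (8, 0)], first_is_invalid, lambda: 0))
print(reconstruct_mv((1, 0), [(0, 0), (8, 0)], always_valid, lambda: 0))
```

Passing the criterion as a parameter keeps the sketch independent of whichever criterion an embodiment actually uses; only the branching between inferring and decoding the index is taken from the text.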
  • Fig. 1 illustrates an example of context in which various embodiments may be implemented;
  • Fig. 2 illustrates schematically an example of partitioning undergone by a picture of pixels of an original video;
  • Fig.3 depicts schematically a method for encoding a video stream;
  • Fig.4 depicts schematically a method for decoding an encoded video stream;
  • FIG. 5A illustrates schematically an example of hardware architecture of a processing module able to implement an encoding module or a decoding module in which various aspects and embodiments are implemented
  • Fig. 5B illustrates a block diagram of an example of a first system in which various aspects and embodiments are implemented
  • Fig.5C illustrates a block diagram of an example of a second system in which various aspects and embodiments are implemented
  • Figs. 6A and 6B illustrate schematically spatial and temporal positions considered for constructing a list of merge candidates
  • Figs. 7A and 7B illustrate motion coding configurations in AMVP mode
  • Fig.8 illustrates an example of motion vector reconstruction process of an embodiment
  • Fig.9 illustrates an embodiment of an encoding algorithm
  • Fig. 10 illustrates an embodiment of a decoding algorithm.
  • VVC Versatile Video Coding
  • ITU-T H.266 Versatile Video Coding
  • these embodiments are not limited to the video coding/decoding method corresponding to VVC.
  • These embodiments are in particular adapted to various video formats comprising for example HEVC (ISO/IEC 23008-2 – MPEG-H Part 2, High Efficiency Video Coding / ITU-T H.265), AVC (ISO/IEC 14496-10), EVC (Essential Video Coding/MPEG-5), AV1, AV2 and VP9.
  • Fig.1 describes an example of a context in which following embodiments can be implemented.
  • a system 11 that could be a camera, a storage device, a computer, a server or any device capable of delivering a video stream, transmits a video stream to a system 13 using a communication channel 12.
  • the video stream is either encoded and transmitted by the system 11 or received and/or stored by the system 11 and then transmitted.
  • the communication channel 12 is a wired (for example Internet or Ethernet) or a wireless (for example WiFi, 3G, 4G or 5G) network link.
  • the system 13, that could be for example a set top box, receives and decodes the video stream to generate a sequence of decoded pictures.
  • the obtained sequence of decoded pictures is then transmitted to a display system 15 using a communication channel 14, that could be a wired or wireless network.
  • the display system 15 then displays said pictures.
  • the system 13 is comprised in the display system 15.
  • the system 13 and display 15 are comprised in a TV, a computer, a tablet, a smartphone, a head-mounted display, etc.
  • Figs.2, 3 and 4 introduce an example of video format.
  • Fig.2 illustrates an example of partitioning undergone by a picture of pixels 21 of an original video sequence 20. It is considered here that a pixel is composed of three components: a luminance component and two chrominance components.
  • a picture is divided into a plurality of coding entities.
  • a picture is divided in a grid of blocks called coding tree units (CTU).
  • CTU coding tree units
  • a CTU consists of an $N \times N$ block of luminance samples together with two corresponding blocks of chrominance samples.
  • N is generally a power of two having a maximum value of “128” for example.
  • a picture is divided into one or more groups of CTU. For example, it can be divided into one or more tile rows and tile columns, a tile being a sequence of CTU covering a rectangular region of a picture.
  • a tile could be divided into one or more bricks, each of which consists of at least one row of CTU within the tile.
  • another encoding entity, called a slice, exists; it can contain at least one tile of a picture or at least one brick of a tile.
  • the picture 21 is divided into three slices S1, S2 and S3 of the raster-scan slice mode, each comprising a plurality of tiles (not represented), each tile comprising only one brick.
  • a CTU may be partitioned into the form of a hierarchical tree of one or more sub-blocks called coding units (CU).
  • CU coding units
  • the CTU is the root (i.e., the parent node) of the hierarchical tree and can be partitioned in a plurality of CU (i.e. child nodes). Each CU becomes a leaf of the hierarchical tree if it is not further partitioned in smaller CU or becomes a parent node of smaller CU (i.e., child nodes) if it is further partitioned.
  • the CTU 24 is first partitioned in “4” square CU using a quadtree type partitioning.
  • the upper left CU is a leaf of the hierarchical tree since it is not further partitioned, i.e., it is not a parent node of any other CU.
  • the upper right CU is further partitioned in “4” smaller square CU using again a quadtree type partitioning.
  • the bottom right CU is vertically partitioned in “2” rectangular CU using a binary tree type partitioning.
  • the bottom left CU is vertically partitioned in “3” rectangular CU using a ternary tree type partitioning.
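The partitioning of CTU 24 described above can be sketched as a small tree structure. This is an illustrative model only: the `Block` class and its method names are hypothetical, the CTU size of 128 is the example maximum mentioned earlier, "vertically partitioned" is taken to mean a split into side-by-side rectangles, and the 1:2:1 width ratio of the ternary split is an assumption borrowed from VVC rather than something stated in this text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    x: int
    y: int
    w: int
    h: int
    children: List["Block"] = field(default_factory=list)

    def split_quad(self):
        """Quadtree split; returns upper-left, upper-right, bottom-left, bottom-right."""
        hw, hh = self.w // 2, self.h // 2
        self.children = [Block(self.x + dx, self.y + dy, hw, hh)
                         for dy in (0, hh) for dx in (0, hw)]
        return self.children

    def split_vertical(self, widths):
        """Split into side-by-side rectangles of the given widths."""
        assert sum(widths) == self.w
        x, self.children = self.x, []
        for w in widths:
            self.children.append(Block(x, self.y, w, self.h))
            x += w
        return self.children

    def leaves(self):
        """Leaf CUs of the hierarchical tree rooted at this block."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


ctu = Block(0, 0, 128, 128)            # root of the tree
ul, ur, bl, br = ctu.split_quad()      # first quadtree split; `ul` stays a leaf
ur.split_quad()                        # upper right: "4" smaller square CU
br.split_vertical([32, 32])            # bottom right: binary split
bl.split_vertical([16, 32, 16])        # bottom left: ternary split (assumed 1:2:1)

leaves = ctu.leaves()
print(len(leaves))                     # 10 leaf CU: 1 + 4 + 3 + 2
```

The leaves tile the CTU exactly, which is a convenient sanity check for any partitioning code of this kind.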
  • the partitioning is adaptive, each CTU being partitioned so as to optimize a compression efficiency criterion of the CTU.
  • PU prediction unit
  • TU transform unit
  • the coding entity that is used for prediction (i.e., a PU) and transform (i.e., a TU) can be a subdivision of a CU.
  • a CU of size $2N \times 2N$ can be divided in PU 2411 of size $N \times 2N$ or of size $2N \times N$.
  • said CU can be divided in “4” TU 2412 of size $N \times N$ or in “16” TU of size $(N/2) \times (N/2)$.
  • frontiers of the TU and PU are aligned on the frontiers of the CU.
  • a CU comprises generally one TU and one PU.
  • the term “block” or “picture block” can be used to refer to any one of a CTU, a CU, a PU and a TU.
  • the term “block” or “picture block” can be used to refer to a macroblock, a partition and a sub-block as specified in H.264/AVC or in other video coding standards, and more generally to refer to an array of samples of numerous sizes.
  • Fig.3 depicts schematically a method for encoding a video stream executed by an encoding module. Variations of this method for encoding are contemplated, but the method for encoding of Fig. 3 is described below for purposes of clarity without describing all expected variations.
  • a current original picture of an original video sequence may go through a pre-processing.
  • a color transform is applied to the current original picture (e.g., conversion from RGB 4:4:4 to YCbCr 4:2:0), or a remapping is applied to the current original picture components in order to get a signal distribution more resilient to compression (for instance using a histogram equalization of one of the color components).
  • Pictures obtained by pre-processing are called pre-processed pictures in the following.
  • the encoding of a pre-processed picture begins with a partitioning of the pre-processed picture during a step 302, as described in relation to Fig.2.
  • the pre-processed picture is thus partitioned into CTU, CU, PU, TU, etc.
  • the encoding module determines a coding mode between an intra prediction and an inter prediction.
  • the intra prediction consists of predicting, in accordance with an intra prediction method, during a step 303, the pixels of a current block from a prediction block derived from pixels of reconstructed blocks situated in a causal vicinity of the current block to be coded.
  • the result of the intra prediction is a prediction mode indicating which pixels of the blocks in the vicinity to use, and a residual block resulting from a calculation of a difference between the current block and the prediction block.
  • the inter prediction consists of predicting the pixels of a current block from a block of pixels, referred to as the reference block, of a picture preceding or following the current picture, this picture being referred to as the reference picture.
  • a block of the reference picture closest, in accordance with a similarity criterion, to the current block is determined by a motion estimation step 304.
  • a motion vector indicating the position of the reference block in the reference picture is determined. Said motion vector is used during a motion compensation step 305, involving interpolation operations between samples of the reference block, to generate a prediction block.
  • a residual block is then calculated in the form of a difference between the current block and the prediction block.
  • the uni-prediction inter mode described above was the only inter mode available.
  • the family of inter modes has grown significantly and now comprises many different inter modes, such as bi-prediction modes in which a current block is predicted from two reference blocks designated by two different motion information.
  • the prediction mode optimising the compression performances in accordance with a rate/distortion optimization criterion (i.e., RDO criterion), among the prediction modes tested (Intra prediction modes, Inter prediction modes), is selected by the encoding module.
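The text only says that the mode is selected in accordance with a rate/distortion optimization criterion. A common way to express such a criterion is a Lagrangian cost J = D + λ·R; the sketch below assumes that form. The function name, the candidate tuples and all numeric values are hypothetical and serve only to show how the trade-off shifts with λ.

```python
def select_mode(candidates, lam):
    """Return the name of the mode minimising the assumed cost D + lam * R.

    candidates: iterable of (name, distortion, rate_in_bits) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]


# Hypothetical measurements for one block.
modes = [
    ("intra", 120.0, 40),
    ("inter_uni", 90.0, 55),
    ("inter_bi", 70.0, 80),
]

print(select_mode(modes, lam=0.1))  # rate is cheap: lowest distortion wins
print(select_mode(modes, lam=1.0))  # balanced trade-off
print(select_mode(modes, lam=5.0))  # rate is expensive: fewest bits wins
```

With these numbers, the selected mode moves from bi-prediction to uni-prediction to intra as λ grows, because each step trades a little distortion for fewer bits.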
  • Once the prediction mode is selected, the residual block is transformed during a step 307.
  • a plurality of types of transforms can be applied to a transformed residual block.
  • a Multiple Transform Selection (MTS) scheme is used for both inter and intra predicted blocks. It uses multiple selected transforms from the DCT-VIII/DST-VII.
  • the transformed block is then quantized during a step 309.
  • the encoding module can skip the transform and apply quantization directly to the non-transformed residual signal.
  • the quantized residual block determined for the current block during an inter or intra prediction is encoded by an entropic encoder during a step 310.
  • the encoding module can bypass both transform and quantization, i.e., the entropic encoding is applied on the residual without the application of the transform or quantization processes.
  • the result of the entropic coding is inserted in the video data 311.
  • the intra prediction mode is encoded by the entropic encoder during the step 310 in the video data 311.
  • When the current block is encoded according to an inter prediction, a process is applied to encode the motion information. The output of this process is then encoded by the entropic encoder during the step 310 in the video data 311.
  • Two processes are employed to encode the motion information: AMVP (Adaptive Motion Vector Prediction) or Merge.
  • AMVP Adaptive Motion Vector Prediction
  • the motion information is predicted.
  • a motion vector predictor (MVP) is selected, and a motion vector difference noted MVd relative to the selected MVP is computed.
  • the MVP is selected in a list of AMVP candidates made of “2” candidates.
  • the index of the chosen MVP and the MVd are then encoded by the entropic encoder during step 310 along with the transformed and quantized residual block resulting from the inter prediction of the current block.
  • the AMVP candidate list is constructed first by deriving a first spatial candidate from a left block neighbouring the current block, if this block is available and inter coded. Then a second spatial candidate is derived from a top block neighbouring the current block, if this block is available and inter coded. Then, a temporal candidate is derived from a so-called collocated picture at a position collocated with the current block, if an inter block exists at this collocated position.
  • Each derived MVP candidate is scaled according to a temporal distance between the reference picture associated to this MVP candidate and the reference picture considered for the current block. A redundancy check is then conducted between derived spatial candidates and, if a duplicate candidate exists, this candidate is discarded.
  • the final AMVP candidate list contains the two first derived MVP candidates. If fewer than “2” MVP candidates are obtained through the above process, then the AMVP candidate list is completed with zero motion vectors.
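The AMVP list construction described above can be sketched as follows. This is a simplified illustration: the function name and arguments are hypothetical, `None` models a neighbour that is unavailable or not inter coded, and the temporal scaling of each candidate is assumed to have been applied before the call.

```python
from typing import List, Optional, Tuple

MV = Tuple[int, int]


def build_amvp_list(left: Optional[MV], top: Optional[MV],
                    temporal: Optional[MV], size: int = 2) -> List[MV]:
    """Build an AMVP candidate list of `size` entries (scaling assumed done)."""
    spatial: List[MV] = []
    for cand in (left, top):
        # Redundancy check between the derived spatial candidates.
        if cand is not None and cand not in spatial:
            spatial.append(cand)
    derived = spatial + ([temporal] if temporal is not None else [])
    amvp = derived[:size]                      # keep the two first candidates
    amvp += [(0, 0)] * (size - len(amvp))      # complete with zero motion vectors
    return amvp


print(build_amvp_list((4, -2), (4, -2), (1, 1)))  # duplicate spatial candidate dropped
print(build_amvp_list(None, None, None))          # list filled with zero vectors
```

In the first call the top candidate duplicates the left one, so the temporal candidate takes the second slot; in the second call nothing is available and the list is padded.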
  • the merge mode consists in deriving motion information of a current block from a selected motion information predictor candidate.
  • the motion information considered here includes all the inter prediction parameters of a block, that is to say: the uni-prediction or bi-prediction type, the reference picture index within each reference picture list and the motion vector(s).
  • the selected motion information predictor candidate (i.e., the merge candidate) is selected in a list of motion information predictor candidates (i.e., in a list of merge candidates).
  • the index of the selected merge candidate is encoded. If no residual block is encoded for the current block, the current block is considered as encoded according to a particular merge mode called skip mode.
  • the list of merge candidates is systematically made of “5” merge candidates. Up to “5” spatial positions are considered to retrieve some potential candidates for the list of merge candidates.
  • Fig. 6A illustrates schematically the five spatial positions considered for constructing a list of merge candidates. These positions are investigated according to the following order:
    1. Left (A1)
    2. Above (B1)
    3. Above right (B0)
    4. Left bottom (A0)
    5. Above left (B2)
  • TMVP Temporal predictor
  • Fig. 6B illustrates schematically collocated positions considered for determining the TMVP. The determination of the TMVP consists first in investigating position H and, if no motion information is available at position H, the position C is investigated. A last pruning process is then applied to ensure that the set of spatial and temporal candidates does not contain redundant candidates.
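The scan order of the spatial positions, the H-then-C fallback for the TMVP and the final pruning can be sketched together. This is a simplified illustration: the function and the dictionary-based inputs are hypothetical, motion information is reduced to a single tuple, and the pruning here compares every candidate against all earlier ones, which may be stricter than what an actual codec does.

```python
SPATIAL_ORDER = ["A1", "B1", "B0", "A0", "B2"]  # left, above, above right, left bottom, above left


def merge_candidates(spatial, temporal):
    """spatial: position -> motion info or None; temporal: optional 'H' and 'C' entries."""
    # Spatial positions are investigated in the fixed order.
    cands = [spatial[p] for p in SPATIAL_ORDER if spatial.get(p) is not None]
    # TMVP: position H first, then position C if H holds no motion information.
    tmvp = temporal.get("H")
    if tmvp is None:
        tmvp = temporal.get("C")
    if tmvp is not None:
        cands.append(tmvp)
    # Pruning: drop redundant candidates while keeping the scan order.
    pruned = []
    for c in cands:
        if c not in pruned:
            pruned.append(c)
    return pruned


spatial = {"A1": (1, 0), "B1": (1, 0), "B0": None, "A0": (0, 2), "B2": (3, 3)}
temporal = {"H": None, "C": (5, 5)}
print(merge_candidates(spatial, temporal))
```

In this example B1 duplicates A1 and is pruned, B0 is unavailable, and the temporal candidate comes from position C because position H is empty.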
  • B-slice slice allowing bi-predicted blocks
  • combined candidates are introduced in the list of merge candidates if this list is not full.
  • the whole-block-based motion representation consists in assigning one set of motion information, made of one or two motion vectors and associated reference picture(s) to an inter block.
  • the motion information of that block is represented under the form of one or two motion vectors for the whole block and a reference picture associated to each motion vector.
  • Sub-block-based motion coding mode typically consists in dividing a block into 4x4 or 8x8 luma samples subblocks and assigning an individual set of motion information (one or two couples of a motion vector and a reference picture) to each subblock. While in previous implementations of AMVP only spatial candidates, temporal candidates and the zero-motion vector candidate were considered for constructing the list of AMVP candidates, a new category of candidates, called HMVP (History-Based Motion Vector Prediction) candidates, was added in AMVP implementations adapted to the whole block-based motion representation. A principle of HMVP candidates is to use previously coded motion vectors as MVPs. These motion vectors are associated with adjacent or non-adjacent blocks relative to a current block.
  • HMVP History-Based Motion Vector Prediction
  • a table of HMVP candidates (i.e., HMVP table) is maintained and updated on the fly, as a first-in-first-out (FIFO) buffer of MVPs.
  • FIFO first-in-first-out
  • the HMVP table is updated by appending the motion information of an inter predicted block to the end of the HMVP table as a new HMVP candidate.
  • a mechanism to remove redundant HMVP candidates is applied.
  • the HMVP table is reset at each CTU row to enable parallel processing.
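The HMVP table behaviour described above (FIFO update, redundancy removal, reset at each CTU row) can be sketched as a small class. The class name, the table size and the exact redundancy mechanism (removing the older identical entry before appending the new one) are assumptions made for illustration; the text only states that such a mechanism exists.

```python
class HmvpTable:
    """Minimal sketch of an HMVP table kept as a FIFO buffer of candidates."""

    def __init__(self, max_size=5):  # max_size is an assumed parameter
        self.max_size = max_size
        self.entries = []            # oldest candidate first, newest last

    def update(self, motion):
        """Append the motion information of an inter predicted block."""
        if motion in self.entries:
            self.entries.remove(motion)   # assumed redundancy removal
        elif len(self.entries) == self.max_size:
            self.entries.pop(0)           # FIFO: the oldest candidate leaves
        self.entries.append(motion)

    def reset(self):
        """Called at the start of each CTU row to enable parallel processing."""
        self.entries.clear()


table = HmvpTable(max_size=3)
for mv in [(1, 0), (2, 0), (3, 0), (2, 0), (4, 0)]:
    table.update(mv)
print(table.entries)  # [(3, 0), (2, 0), (4, 0)]
```

The repeated `(2, 0)` moves to the newest slot instead of occupying two entries, and `(1, 0)` is evicted as the oldest candidate once the table is full.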
  • the list of merge candidates was modified and three new merge modes were introduced.
  • the list of merge candidates is constructed with the following types of candidates:
    • Spatial candidates.
    • Temporal candidates.
    • HMVP candidates. Several HMVP candidates are inserted into the list of merge candidates so that the list reaches a maximum allowed number of merge candidates minus 1.
    • Pairwise average candidates. Up to one pairwise average candidate is added to the list of merge candidates. Pairwise candidates are computed as follows: the two first merge candidates present in the list of merge candidates are considered and their motion vectors are averaged. This averaging is computed separately for each reference picture list. If each of the two first merge candidates is a bi-prediction one, motion vectors related to both lists L0 and L1 are averaged. If only one motion vector is present, it is taken as is to form the pairwise candidate.
    • Zero motion vector candidate.
  • The three new merge modes comprise MMVD (Merge Mode with motion vector Difference), GPM (Geometric Partitioning Mode) and CIIP (Combined Intra/Inter Prediction).
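The pairwise average candidate described above can be sketched per reference picture list. The dictionary representation and function name are hypothetical, and the rounding of the average (floor division here) is an assumption, since the text does not specify it.

```python
def pairwise_average(c0, c1):
    """Average the two first merge candidates, separately for lists L0 and L1.

    c0, c1: dicts mapping 'L0' / 'L1' to an (x, y) motion vector or None.
    """
    out = {}
    for lst in ("L0", "L1"):
        a, b = c0.get(lst), c1.get(lst)
        if a is not None and b is not None:
            # Both candidates have a vector for this list: average them
            # (floor division is an assumed rounding rule).
            out[lst] = ((a[0] + b[0]) // 2, (a[1] + b[1]) // 2)
        else:
            # Only one vector (or none) is present: it is taken as is.
            out[lst] = a if a is not None else b
    return out


first = {"L0": (4, 2), "L1": (0, 8)}    # bi-prediction candidate
second = {"L0": (2, 6), "L1": None}     # uni-prediction candidate
print(pairwise_average(first, second))  # {'L0': (3, 4), 'L1': (0, 8)}
```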
  • MMVD can be viewed as a kind of merge mode in which a merge candidate is refined by a MVd.
  • Once a merge candidate is selected, it is further refined by a signalled MVd information.
  • the signaling of a MMVD mode comprises a merge candidate flag, an index to specify a motion magnitude and an index indicating a motion direction.
  • the merge candidate flag is signalled to specify which one is used between the first and second merge candidates.
  • the index specifying a motion magnitude and the index indicating a motion direction allow signaling a limited number of motion vector differences (MVd) on top of a signaled merge candidate, i.e., “4” vector directions and “8” magnitude values.
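The MMVD signaling above (a candidate flag, a magnitude index and a direction index) can be sketched as a table lookup. The text gives only the counts, “4” directions and “8” magnitudes; the concrete direction vectors and the magnitude table below (powers of two in quarter-sample units, in the style of VVC) are assumptions, as are the function and variable names.

```python
# Assumed tables: 4 axis-aligned directions and 8 magnitudes (quarter-sample units).
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
MAGNITUDES = [1, 2, 4, 8, 16, 32, 64, 128]


def mmvd_mv(merge_list, cand_flag, mag_idx, dir_idx):
    """Refine the first or second merge candidate by the signalled MVd."""
    base = merge_list[cand_flag]          # flag selects candidate 0 or 1
    dx, dy = DIRECTIONS[dir_idx]
    m = MAGNITUDES[mag_idx]
    return (base[0] + dx * m, base[1] + dy * m)


# The limited set of motion vector differences that can be signalled.
all_mvds = [(dx * m, dy * m) for m in MAGNITUDES for dx, dy in DIRECTIONS]
print(len(all_mvds))                               # 32 = 4 directions x 8 magnitudes
print(mmvd_mv([(10, -4), (0, 0)], 0, 2, 1))        # (6, -4)
```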
  • MVd motion vector differences
  • In Fig. 7A, a block to predict is shown together with its two AMVP motion vector prediction candidates and a motion vector difference MVd to signal. It is established that, in existing implementations of AMVP, the coding cost of a motion difference is highly correlated to the magnitude of the motion vector difference and increases with that magnitude.
  • Fig.7B shows that the coding configuration of Fig.7A can also be reached by using the other motion vector predictor of the list instead of the one used in Fig.7A, leading to a smaller motion vector difference magnitude.
  • the situation of Fig. 7A can be detected as non-optimal by a decoding module.
  • the decoder is able to detect that the motion vector predictor used in Fig. 7A is not the optimal one for pointing to the spatial position corresponding to the sum of that predictor and the motion vector difference, as shown in Fig. 7B.
  • the decoder is able to detect that the use of the other motion vector predictor is the most likely one, compared to the predictor used in Fig. 7A. Therefore, it can be considered that in the configuration of Fig. 7A, the signaling of the motion vector predictor using an index of the chosen motion vector predictor and the motion vector difference carry some redundant (or residual) information that can be exploited. Metadata such as SEI (supplemental enhancement information) messages can be attached to the encoded video stream 311.
  • An SEI message, as defined for example in standards such as AVC, HEVC or VVC, is a data container associated with a video stream and comprising metadata providing information relative to the video stream.
  • the current block is reconstructed so that the pixels corresponding to that block can be used for future predictions.
  • This reconstruction phase is also referred to as a prediction loop.
  • An inverse quantization is therefore applied to the transformed and quantized residual block during a step 312 and an inverse transformation is applied during a step 313.
  • the prediction block of the block is reconstructed.
  • The encoding module applies, when appropriate, during a step 316, a motion compensation using the motion information of the current block in order to identify each reference block of the current block.
  • the intra prediction mode is used for reconstructing the prediction block of the current block.
  • the prediction block and the reconstructed residual block are added in order to obtain the reconstructed current block.
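  • The addition of the prediction block and the reconstructed residual block can be sketched as follows. This is a minimal illustration, not the encoder's actual implementation: the function name is hypothetical, blocks are plain lists of rows, and the clipping to the sample range given by an assumed `bit_depth` is an addition that the text does not state. The inverse quantization and inverse transformation of steps 312 and 313 are assumed to have already produced `residual`.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add the reconstructed residual to the prediction, sample by sample."""
    max_value = (1 << bit_depth) - 1          # e.g. 255 for 8-bit samples
    return [
        # Clipping keeps each reconstructed sample in [0, max_value] (assumption).
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]


prediction = [[100, 250], [3, 128]]
residual = [[5, 10], [-7, 0]]
print(reconstruct_block(prediction, residual))
# prints [[105, 255], [0, 128]]
```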
  • an in-loop filtering intended to reduce the encoding artefacts is applied, during a step 317, to the reconstructed block.
  • In-loop filtering tools comprise deblocking filtering, SAO (Sample Adaptive Offset) and ALF (Adaptive Loop Filtering).
  • FIG. 4 depicts schematically a method for decoding the encoded video stream 311, encoded according to the method described in relation to Fig. 3, executed by a decoding module. Variations of this method for decoding are contemplated, but the method for decoding of Fig. 4 is described below for purposes of clarity without describing all expected variations.
  • The decoding is done block by block. For a current block, it starts with an entropy decoding of the current block during a step 410. Entropy decoding allows obtaining, at least, the prediction mode of the block. If the block has been encoded according to an inter prediction mode, the entropy decoding allows obtaining, when appropriate, information representative of a motion of the current block and a residual block.
  • The motion information is reconstructed for the current block using the decoded information representative of the motion information.
  • In the AMVP mode, the information representative of the motion comprises an index of the AMVP MVP in the AMVP list and the MVd.
  • The MVd is then added to the AMVP MVP to reconstruct the motion vector of the block.
  • In a merge mode, an index of the merge MVP in the merge list is obtained.
  • The merge MVP corresponding to the index provides the motion information of the current block.
  • In the MMVD mode, the information representative of the motion of the current block comprises MMVD indices representative of an MVd.
  • The MVd corresponding to the MMVD indices is then added to the merge MVP to determine the motion information of the current block.
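  • The three reconstruction cases above (AMVP, merge, MMVD) can be summarised in a short sketch. The function name, the mode strings and the tuple representation of motion vectors are hypothetical and only serve the illustration; the MVd is assumed to have already been decoded (AMVP) or derived from the MMVD indices.

```python
def reconstruct_motion_vector(mode, mvp_list, index, mvd=None):
    """Rebuild a motion vector from a predictor list, an index and an optional MVd."""
    mvp = mvp_list[index]
    if mode == "merge":
        # The merge MVP directly provides the motion information.
        return mvp
    if mode in ("amvp", "mmvd"):
        # AMVP: MVd added to the AMVP MVP; MMVD: MVd added to the merge MVP.
        return (mvp[0] + mvd[0], mvp[1] + mvd[1])
    raise ValueError(f"unknown mode: {mode}")


amvp_list = [(4, -2), (16, 8)]
print(reconstruct_motion_vector("amvp", amvp_list, index=1, mvd=(-3, 1)))
# prints (13, 9)
print(reconstruct_motion_vector("merge", amvp_list, index=0))
# prints (4, -2)
```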
  • Steps 412, 413, 414, 415, 416 and 417 implemented by the decoding module are in all respects identical respectively to steps 312, 313, 314, 315, 316 and 317 implemented by the encoding module.
  • Decoded blocks are saved in decoded pictures and the decoded pictures are stored in a DPB 419 in a step 418.
  • When the decoding module decodes a given picture, the pictures stored in the DPB 419 are identical to the pictures stored in the DPB 319 by the encoding module during the encoding of said given picture.
  • The decoded picture can also be outputted by the decoding module, for instance to be displayed.
  • the post-processing step 421 can comprise an inverse color transform (e.g. conversion from YCbCr 4:2:0 to RGB 4:4:4), an inverse mapping performing the inverse of the remapping process performed in the pre-processing of step 301 and a post-filtering for improving the reconstructed pictures based for example on filter parameters provided in a SEI message.
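  • As an illustration of the inverse color transform mentioned above, the sketch below converts YCbCr 4:2:0 planes to RGB 4:4:4. Several choices are assumptions not stated in the text: BT.709 luma coefficients, normalized samples (luma in [0, 1], chroma centred on 0 in [-0.5, 0.5]), and nearest-neighbour chroma upsampling. The function names are hypothetical.

```python
KR, KB = 0.2126, 0.0722          # assumed BT.709 luma coefficients
KG = 1.0 - KR - KB


def ycbcr_to_rgb(y, cb, cr):
    """Convert one normalized YCbCr sample to a clipped (r, g, b) triple."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG   # from y = KR*r + KG*g + KB*b
    return tuple(min(max(c, 0.0), 1.0) for c in (r, g, b))


def yuv420_to_rgb444(y_plane, cb_plane, cr_plane):
    """Upsample chroma by sample repetition, then convert every sample."""
    return [
        [ycbcr_to_rgb(y, cb_plane[i // 2][j // 2], cr_plane[i // 2][j // 2])
         for j, y in enumerate(row)]
        for i, row in enumerate(y_plane)
    ]


# A 2x2 luma block shares a single chroma sample pair in 4:2:0.
rgb = yuv420_to_rgb444([[0.5, 0.5], [0.5, 0.5]], [[0.0]], [[0.0]])
print(len(rgb), len(rgb[0]))
```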
  • Fig. 5A illustrates schematically an example of hardware architecture of a processing module 500 able to implement an encoding module or a decoding module capable of implementing respectively a method for encoding of Fig.3 and a method for decoding of Fig. 4 modified according to different aspects and embodiments.
  • the encoding module is for example comprised in the system 11 when this apparatus is in charge of encoding the video stream.
  • the decoding module is for example comprised in the system 13.
  • The processing module 500 comprises, connected by a communication bus 5005: a processor or CPU (central processing unit) 5000 encompassing one or more microprocessors, general purpose computers, special purpose computers, and processors based on a multi-core architecture, as non-limiting examples; a random access memory (RAM) 5001; a read only memory (ROM) 5002; a storage unit 5003, which can include non-volatile memory and/or volatile memory, including, but not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), flash, magnetic disk drive, and/or optical disk drive, or a storage medium reader, such as an SD (secure digital)
  • the communication interface 5004 can include, but is not limited to, a transceiver configured to transmit and to receive data over a communication channel.
  • the communication interface 5004 can include, but is not limited to, a modem or network card. If the processing module 500 implements a decoding module, the communication interface 5004 enables for instance the processing module 500 to receive encoded video streams and to provide a sequence of decoded pictures. If the processing module 500 implements an encoding module, the communication interface 5004 enables for instance the processing module 500 to receive a sequence of original picture data to encode and to provide an encoded video stream.
  • the processor 5000 is capable of executing instructions loaded into the RAM 5001 from the ROM 5002, from an external memory (not shown), from a storage medium, or from a communication network.
  • When the processing module 500 is powered up, the processor 5000 is capable of reading instructions from the RAM 5001 and executing them. These instructions form a computer program causing, for example, the implementation by the processor 5000 of a decoding method as described in relation with Fig. 4, an encoding method described in relation to Fig. 3, and methods described in relation to Figs. 6 or 7, these methods comprising various aspects and embodiments described below in this document.
  • All or some of the algorithms and steps of the methods of Figs. 3, 4, 6 and 7 may be implemented in software form by the execution of a set of instructions by a programmable machine such as a DSP (digital signal processor) or a microcontroller, or be implemented in hardware form by a machine or a dedicated component such as an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit).
  • Microprocessors, general purpose computers, special purpose computers, processors based or not on a multi-core architecture, DSP, microcontroller, FPGA and ASIC are electronic circuitry adapted to implement at least partially the methods of Figs. 3, 4, 6 and 7.
  • Fig. 5C illustrates a block diagram of an example of the system 13 in which various aspects and embodiments are implemented.
  • the system 13 can be embodied as a device including the various components described below and is configured to perform one or more of the aspects and embodiments described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptop computers, smartphones, tablet computers, digital multimedia set top boxes, digital television receivers, personal video recording systems, connected home appliances and head mounted display. Elements of system 13, singly or in combination, can be embodied in a single integrated circuit (IC), multiple ICs, and/or discrete components.
  • the system 13 comprises one processing module 500 that implements a decoding module.
  • system 13 is communicatively coupled to one or more other systems, or other electronic devices, via, for example, a communications bus or through dedicated input and/or output ports. In various embodiments, the system 13 is configured to implement one or more of the aspects described in this document.
  • the input to the processing module 500 can be provided through various input modules as indicated in block 531.
  • Such input modules include, but are not limited to, (i) a radio frequency (RF) module that receives an RF signal transmitted, for example, over the air by a broadcaster, (ii) a component (COMP) input module (or a set of COMP input modules), (iii) a Universal Serial Bus (USB) input module, and/or (iv) a High Definition Multimedia Interface (HDMI) input module.
  • Other examples, not shown in Fig. 5C, include composite video.
  • the input modules of block 531 have associated respective input processing elements as known in the art.
  • The RF module can be associated with elements suitable for (i) selecting a desired frequency (also referred to as selecting a signal, or band-limiting a signal to a band of frequencies), (ii) down-converting the selected signal, (iii) band-limiting again to a narrower band of frequencies to select (for example) a signal frequency band which can be referred to as a channel in certain embodiments, (iv) demodulating the down-converted and band-limited signal, (v) performing error correction, and (vi) demultiplexing to select the desired stream of data packets.
  • The RF module of various embodiments includes one or more elements to perform these functions, for example, frequency selectors, signal selectors, band-limiters, channel selectors, filters, downconverters, demodulators, error correctors, and demultiplexers.
  • the RF portion can include a tuner that performs various of these functions, including, for example, down-converting the received signal to a lower frequency (for example, an intermediate frequency or a near-baseband frequency) or to baseband.
  • The RF module and its associated input processing element receive an RF signal transmitted over a wired (for example, cable) medium, and perform frequency selection by filtering, down-converting, and filtering again to a desired frequency band.
  • Adding elements can include inserting elements in between existing elements, such as, for example, inserting amplifiers and an analog-to-digital converter.
  • the RF module includes an antenna.
  • the USB and/or HDMI modules can include respective interface processors for connecting system 13 to other electronic devices across USB and/or HDMI connections. It is to be understood that various aspects of input processing, for example, Reed-Solomon error correction, can be implemented, for example, within a separate input processing IC or within the processing module 500 as necessary. Similarly, aspects of USB or HDMI interface processing can be implemented within separate interface ICs or within the processing module 500 as necessary.
  • the demodulated, error corrected, and demultiplexed stream is provided to the processing module 500.
  • Various elements of system 13 can be provided within an integrated housing. Within the integrated housing, the various elements can be interconnected and transmit data therebetween using suitable connection arrangements, for example, an internal bus as known in the art, including the Inter-IC (I2C) bus, wiring, and printed circuit boards.
  • the processing module 500 is interconnected to other elements of said system 13 by the bus 5005.
  • the communication interface 5004 of the processing module 500 allows the system 13 to communicate on the communication channel 12.
  • the communication channel 12 can be implemented, for example, within a wired and/or a wireless medium.
  • Data is streamed, or otherwise provided, to the system 13, in various embodiments, using a wireless network such as a Wi-Fi network, for example IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers).
  • The Wi-Fi signal of these embodiments is received over the communications channel 12 and the communications interface 5004, which are adapted for Wi-Fi communications.
  • the communications channel 12 of these embodiments is typically connected to an access point or router that provides access to external networks including the Internet for allowing streaming applications and other over-the-top communications.
  • Other embodiments provide streamed data to the system 13 using the RF connection of the input block 531. As indicated above, various embodiments provide data in a non-streaming manner.
  • various embodiments use wireless networks other than Wi-Fi, for example a cellular network or a Bluetooth network.
  • the system 13 can provide an output signal to various output devices, including the display system 15, speakers 56, and other peripheral devices 57.
  • the display system 15 of various embodiments includes one or more of, for example, a touchscreen display, an organic light-emitting diode (OLED) display, a curved display, and/or a foldable display.
  • the display 15 can be for a television, a tablet, a laptop, a cell phone (mobile phone), a head mounted display or other devices.
  • the display system 15 can also be integrated with other components (for example, as in a smart phone), or separate (for example, an external monitor for a laptop).
  • The other peripheral devices 57 include, in various examples of embodiments, one or more of a stand-alone digital video disc (or digital versatile disc) (DVD, for both terms), a disk player, a stereo system, and/or a lighting system.
  • Various embodiments use one or more peripheral devices 57 that provide a function based on the output of the system 13. For example, a disk player performs the function of playing an output of the system 13.
  • control signals are communicated between the system 13 and the display system 15, speakers 56, or other peripheral devices 57 using signaling such as AV.Link, Consumer Electronics Control (CEC), or other communications protocols that enable device-to-device control with or without user intervention.
  • the output devices can be communicatively coupled to system 13 via dedicated connections through respective interfaces 532, 533, and 534. Alternatively, the output devices can be connected to system 13 using the communications channel 12 via the communications interface 5004 or a dedicated communication channel via the communication interface 5004.
  • The display system 15 and speakers 56 can be integrated in a single unit with the other components of system 13 in an electronic device such as, for example, a television.
  • the display interface 532 includes a display driver, such as, for example, a timing controller (T Con) chip.
  • The display system 15 and speakers 56 can alternatively be separate from one or more of the other components.
  • Fig. 5B illustrates a block diagram of an example of the system 11 in which various aspects and embodiments are implemented.
  • System 11 is very similar to system 13.
  • the system 11 can be embodied as a device including the various components described below and is configured to perform one or more of the aspects and embodiments described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptop computers, smartphones, tablet computers, a camera and a server.
  • Elements of system 11, singly or in combination, can be embodied in a single integrated circuit (IC), multiple ICs, and/or discrete components.
  • the system 11 comprises one processing module 500 that implements an encoding module.
  • the system 11 is communicatively coupled to one or more other systems, or other electronic devices, via, for example, a communications bus or through dedicated input and/or output ports.
  • the system 11 is configured to implement one or more of the aspects described in this document.
  • the input to the processing module 500 can be provided through various input modules as indicated in block 531 already described in relation to Fig.5C.
  • Various elements of system 11 can be provided within an integrated housing. Within the integrated housing, the various elements can be interconnected and transmit data therebetween using suitable connection arrangements, for example, an internal bus as known in the art, including the Inter-IC (I2C) bus, wiring, and printed circuit boards.
  • the processing module 500 is interconnected to other elements of said system 11 by the bus 5005.
  • The communication interface 5004 of the processing module 500 allows the system 11 to communicate on the communication channel 12.
  • Data is streamed, or otherwise provided, to the system 11, in various embodiments, using a wireless network such as a Wi-Fi network, for example IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers).
  • The Wi-Fi signal of these embodiments is received over the communications channel 12 and the communications interface 5004, which are adapted for Wi-Fi communications.
  • the communications channel 12 of these embodiments is typically connected to an access point or router that provides access to external networks including the Internet for allowing streaming applications and other over-the-top communications.
  • Other embodiments provide streamed data to the system 11 using the RF connection of the input block 531.
  • various embodiments provide data in a non-streaming manner.
  • various embodiments use wireless networks other than Wi-Fi, for example a cellular network or a Bluetooth network.
  • The data provided to the system 11 can be provided in different formats.
  • In various embodiments, these data are encoded and compliant with a known video compression format such as AV1, VP9, VVC, HEVC, AVC, etc.
  • In other embodiments, these data are raw data provided for example by a picture and/or audio acquisition module connected to the system 11 or comprised in the system 11. In that case, the processing module 500 takes charge of the encoding of these data.
  • the system 11 can provide an output signal to various output devices capable of storing and/or decoding the output signal such as the system 13.
  • “Decoding”, as used in this application, can encompass all or part of the processes performed, for example, on a received encoded video stream in order to produce a final output suitable for display.
  • In various embodiments, such processes include one or more of the processes typically performed by a decoder, for example, entropy decoding, inverse quantization, inverse transformation, and prediction.
  • In various embodiments, such processes also, or alternatively, include processes performed by a decoder of various implementations described in this application, for example, for reconstruction of motion information.
  • Whether the phrase “decoding process” is intended to refer specifically to a subset of operations or generally to the broader decoding process will be clear based on the context of the specific descriptions and is believed to be well understood by those skilled in the art.
  • Various implementations involve encoding.
  • “Encoding”, as used in this application, can encompass all or part of the processes performed, for example, on an input video sequence in order to produce an encoded video stream.
  • In various embodiments, such processes include one or more of the processes typically performed by an encoder, for example, partitioning, prediction, transformation, quantization, and entropy encoding.
  • In various embodiments, such processes also, or alternatively, include processes performed by an encoder of various implementations described in this application, for example, for predicting motion information.
  • Whether the phrase “encoding process” is intended to refer specifically to a subset of operations or generally to the broader encoding process will be clear based on the context of the specific descriptions and is believed to be well understood by those skilled in the art.
  • Syntax element names as used herein are descriptive terms. As such, they do not preclude the use of other syntax element names.
  • Various embodiments refer to rate distortion optimization.
  • the rate distortion optimization is usually formulated as minimizing a rate distortion function, which is a weighted sum of the rate and of the distortion.
  • the approaches may be based on an extensive testing of all encoding options, including all considered modes or coding parameters values, with a complete evaluation of their coding cost and related distortion of a reconstructed signal after coding and decoding.
  • Faster approaches may also be used, to save encoding complexity, in particular with computation of an approximated distortion based on a prediction or a prediction residual signal, not the reconstructed one.
  • A mix of these two approaches can also be used, such as by using an approximated distortion for only some of the possible encoding options, and a complete distortion for other encoding options.
  • Other approaches only evaluate a subset of the possible encoding options.
  • Many approaches employ any of a variety of techniques to perform the optimization, but the optimization is not necessarily a complete evaluation of both the coding cost and related distortion.
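  • The rate distortion optimization described above can be sketched as an exhaustive search over candidate encoding options. The weighted sum is written here in the common Lagrangian form J = D + λ·R, which is an assumption about the weighting; the function names, option names and numeric values are hypothetical and only illustrate the selection.

```python
def rd_cost(distortion, rate, lagrange_multiplier):
    """Weighted sum of distortion and rate (assumed form J = D + lambda * R)."""
    return distortion + lagrange_multiplier * rate


def select_best_option(options, lagrange_multiplier):
    """Return the (name, distortion, rate) tuple with the lowest cost."""
    return min(options, key=lambda o: rd_cost(o[1], o[2], lagrange_multiplier))


# Illustrative (name, distortion, rate in bits) triples.
options = [("intra", 120.0, 40), ("amvp", 90.0, 65), ("merge", 110.0, 12)]
print(select_best_option(options, lagrange_multiplier=1.0)[0])   # prints merge
print(select_best_option(options, lagrange_multiplier=0.1)[0])   # prints amvp
```

  • A larger multiplier favours cheap-to-signal options such as merge, while a smaller one favours options with lower distortion; faster approaches would replace the distortion argument by an approximation or restrict `options` to a subset.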
  • the implementations and aspects described herein can be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal.
  • An apparatus can be implemented in, for example, appropriate hardware, software, and firmware.
  • the methods can be implemented, for example, in a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device.
  • processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants ("PDAs"), and other devices that facilitate communication of information between end-users.
  • References to “one embodiment” or “an embodiment” or “one implementation” or “an implementation”, as well as other variations thereof, mean that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment.
  • Thus, the appearances of the phrase “in one embodiment” or “in an embodiment” or “in one implementation” or “in an implementation”, as well as any other variations, appearing in various places throughout this application are not necessarily all referring to the same embodiment.
  • Additionally, this application may refer to “determining” various pieces of information. Determining the information can include one or more of, for example, estimating the information, calculating the information, predicting the information, retrieving the information from memory or obtaining the information for example from another device, module or from a user.
  • Further, this application may refer to “accessing” various pieces of information. Accessing the information can include one or more of, for example, receiving the information, retrieving the information (for example, from memory), storing the information, moving the information, copying the information, calculating the information, determining the information, predicting the information, or estimating the information.
  • Additionally, this application may refer to “receiving” various pieces of information. Receiving is, as with “accessing”, intended to be a broad term. Receiving the information can include one or more of, for example, accessing the information, or retrieving the information (for example, from memory).
  • Further, “receiving” is typically involved, in one way or another, during operations such as, for example, storing the information, processing the information, transmitting the information, moving the information, copying the information, erasing the information, calculating the information, determining the information, predicting the information, or estimating the information.
  • It is to be appreciated that the use of any of the following “/”, “and/or”, “at least one of” and “one or more of”, for example, in the cases of “A and/or B”, “at least one of A and B” and “one or more of A and B”, is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B).
  • the word “signal” refers to, among other things, indicating something to a corresponding decoder.
  • the encoder signals a use of some coding tools.
  • the same parameters can be used at both the encoder side and the decoder side.
  • an encoder can transmit (explicit signaling) a particular parameter to the decoder so that the decoder can use the same particular parameter.
  • Conversely, signaling can be used without transmitting (implicit signaling) to simply allow the decoder to know and select the particular parameter. By avoiding transmission of any actual functions, a bit saving is realized in various embodiments.
  • Signaling can be accomplished in a variety of ways. For example, one or more syntax elements, flags, and so forth are used to signal information to a corresponding decoder in various embodiments. While the preceding relates to the verb form of the word “signal”, the word “signal” can also be used herein as a noun.
  • implementations can produce a variety of signals formatted to carry information that can be, for example, stored or transmitted. The information can include, for example, instructions for performing a method, or data produced by one of the described implementations.
  • a signal can be formatted to carry the encoded video stream and SEI messages of a described embodiment.
  • Such a signal can be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal.
  • the formatting can include, for example, encoding an encoded video stream and modulating a carrier with the encoded video stream.
  • the information that the signal carries can be, for example, analog or digital information.
  • the signal can be transmitted over a variety of different wired or wireless links, as is known.
  • The signal can be stored on a processor-readable medium.
  • The following embodiments try to take advantage of the redundant (or residual) information existing in the encoding of the motion information in the form of an index of an MVP and an MVd.
  • The proposed embodiments are based on a computation of a validity checking criterion to detect a non-valid combination of motion vector predictor (MVP) and motion vector difference (MVD), and, according to the computed criterion, to infer the motion vector predictor used to reconstruct a motion vector of a current block.
  • At the encoder side, the same criterion is computed to detect a possible non-valid combination of MVP and MVD to code.
  • the MVP used is simply not signaled in the encoded video bit-stream (i.e., in the encoded video data).
  • Fig.8 illustrates an example of motion vector reconstruction process at decoder side of an embodiment.
  • Suppose a motion vector difference $MVd$ as in Fig. 8A is signaled in video data.
  • A decoding-side validity checking process is applied to determine a validity of a combination of a motion vector predictor $MVP_i$ and $MVd$.
  • The validity checking process is based on the following validity check criterion: [...]
  • cost() is defined, for example, as the L1 norm of a vector, as a rate cost associated to coding that motion vector, as a codeword length resulting from a binarization of the motion vector, etc.
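  • One plausible reading of the validity check, consistent with Figs. 7A and 7B, is that a combination of a predictor and an MVd is non-valid when the other predictor of the AMVP list reaches the same motion vector with a strictly cheaper difference. The sketch below encodes that reading as an assumption; it is not a transcription of equations Eq.1 to Eq.3. Two of the cost() variants named above are illustrated: the L1 norm, and a codeword length for which a signed 0th-order Exp-Golomb code is used as a stand-in binarization. All function names are hypothetical.

```python
def l1_cost(vector):
    """L1 norm of a 2-D vector given as (x, y)."""
    return abs(vector[0]) + abs(vector[1])


def exp_golomb_length(value):
    """Bit length of a signed 0th-order Exp-Golomb codeword (stand-in binarization)."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return 2 * (code_num + 1).bit_length() - 1


def codeword_cost(vector):
    """Codeword-length cost: sum of the bit lengths of both components."""
    return exp_golomb_length(vector[0]) + exp_golomb_length(vector[1])


def is_valid_combination(mvp, other_mvp, mvd, cost=l1_cost):
    """Assumed criterion: valid unless other_mvp gives a strictly cheaper MVd."""
    target = (mvp[0] + mvd[0], mvp[1] + mvd[1])            # vector pointed to
    alternative_mvd = (target[0] - other_mvp[0], target[1] - other_mvp[1])
    return cost(alternative_mvd) >= cost(mvd)


mvp0, mvp1, mvd = (0, 0), (10, 0), (8, 0)
print(is_valid_combination(mvp0, mvp1, mvd))   # prints False
print(is_valid_combination(mvp1, mvp0, mvd))   # prints True
```

  • In this example the pair (mvp0, mvd) points to (8, 0), which mvp1 reaches with the difference (-2, 0); an encoder minimizing the difference would not have produced that pair, so only mvp1 remains plausible for the signalled MVd.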
  • Fig.9 illustrates an encoding algorithm according to a first embodiment.
  • the encoding algorithm is for example executed by the processing module 500 of the system 11 when the system 11 implements the encoding module of Fig. 3.
  • The encoding algorithm is for example executed during step 308.
  • The processing module 500 obtains a motion vector predictor for a motion vector $MV$ of a current block (obtained in step 304).
  • The processing module 500 selects the motion vector predictor minimizing the motion vector difference among the motion vector predictor candidates $MVP_0$ and $MVP_1$ of the AMVP list.
  • The selected motion vector predictor candidate, noted $MVP_{idx}$, is represented by its index idx in the AMVP list.
  • The processing module 500 computes the motion vector difference $MVd$: $MVd = MV - MVP_{idx}$
  • The validity checking process comprises determining from the validity checking criterion of equations Eq.1, Eq.2 or Eq.3 that the non-selected motion vector predictor of the AMVP list, $MVP_{1-idx}$, and the motion information difference form a valid combination.
  • In a step 904, the processing module 500 computes the validity checking criterion.
  • In a step 905, the processing module 500 determines if the combination of $MVP_{1-idx}$ and the motion information difference $MVd$ is valid, i.e., if the above inequality is not fulfilled.
  • If the combination of $MVP_{1-idx}$ and the motion information difference $MVd$ is valid, step 905 is followed by a step 906.
  • In step 906, the processing module 500 encodes the motion vector predictor index idx in the video data 311.
  • The encoding algorithm ends in a step 907.
  • If the combination is not valid, step 905 is followed directly by step 907.
  • In that case, the encoding of the index idx is omitted.
  • The processing module 500 thus applies the validity checking process in steps 904 and 905, comprising determining from the validity checking criterion that the non-selected motion vector predictor $MVP_{1-idx}$ and the motion vector difference $MVd$ form a valid combination.
  • The index idx of the motion vector predictor $MVP_{idx}$ is encoded in the video data 311 only responsive to the non-selected motion vector predictor $MVP_{1-idx}$ and the motion vector difference $MVd$ forming a valid combination.
  • the process of Fig. 9 is applied with regards to both reference pictures (i.e., is applied successively to each reference picture) used for the temporal prediction of the current block.
  • Fig.10 illustrates a decoding algorithm according to the first embodiment.
  • the decoding algorithm is for example executed by the processing module 500 of the system 13 when the system 13 implements the decoding module of Fig. 4.
  • the decoding algorithm is for example executed during step 408.
  • the processing module 500 decodes the motion vector difference $MVd$ from the video data 311.
  • the processing module 500 applies the validity checking process on each MVP of the AMVP list (i.e., $MVP_0$ and $MVP_1$) of a current block.
  • the processing module 500 calculates the validity checking criterion on the MVPs $MVP_0$ and $MVP_1$, each combined with the decoded motion vector difference $MVd$.
  • when step 1002 finds an invalid combination, in step 1004 the processing module 500 infers the value of the index idx of the motion vector predictor ($MVP_0$ or $MVP_1$) as the index of the predictor that forms a valid combination with $MVd$.
  • Step 1004 is followed by a step 1005.
  • the processing module 500 reconstructs the motion vector of the current block $MV$ as the sum of the MVP of the AMVP list identified by the index idx and the motion vector difference $MVd$: $MV = MVP_{idx} + MVd$
  • Step 1003 is followed by step 1005 already explained.
  • the process of Fig.10 is applied with regards to both reference pictures (i.e., is applied successively to each reference picture) used for the temporal prediction of the current block.
  • the parsing of the motion information of the current block requires the complete reconstruction of the motion information of the blocks providing the MVP of the AMVP list of the current block. Such dependencies may reduce the possibilities of parallelizing the processing of successive blocks in a picture.
  • two variants of the first embodiment of Figs. 9 and 10 allow making the parsing of the motion information of a current block less dependent on the motion vector reconstruction of successive blocks.
  • MVP candidates issued from the decoding of blocks sufficiently far from the current block are considered to code and decode the motion information of the current block in AMVP mode.
  • pictures are divided in decoding entities.
  • the decoding entity could be for instance a CTU or a Virtual Processing Decoding Unit (VPDU) defined as a 64x64 luma samples area for instance.
  • VPDU is used as the decoding process unit, that is, the elementary picture area that can be decoded on a chip memory at a time. Only blocks outside the decoding entity comprising the current block are considered to compute the list of MVP candidates for the current block.
  • the MVP flag information representing the MVP index may be hidden in some other coded data of the video data 311, in a way that the parsing process is the same whether the MVP index information is transmitted to the decoder or not. This information, when present, is thus transmitted, for example, in the parity bit of the coded syntax element representing the motion vector difference $MVd$.
  • the advantage of this approach is that the decoder side inference of the MVP index idx is made possible without making the bit-stream parsing dependent on the motion vector decoding process.
  • the method of the first embodiment is applied in case of use of the MMVD (merge with motion vector difference) mode.
  • the MMVD mode can be viewed as a kind of merge mode in which a merge candidate is refined by a motion vector difference.
  • the motion vector difference is signaled and is additively applied to the selected MVP.
  • the same principle as in first embodiment is applied to the two first merge candidates of the list of merge candidates when MMVD is used.
  • only the MMVD motion vector difference is signaled in the video data 311 in case one of the two first merge candidates is non-valid according to the validity checking criterion of equations Eq. 1, Eq. 2 or Eq. 3. Otherwise, if both merge candidates that can use MMVD motion vector difference are valid, then some signaling is done to indicate the actual MVP used to derive the motion information of the current block.
  • the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated sequence parameter set (SPS) signaling flag.
  • the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated picture parameter set (PPS) signaling flag.
  • the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated picture header syntax element.
  • the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated slice header syntax element.
  • the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated sub-picture level syntax element.
  • the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated CTU level syntax element.
  • a TV, set-top box, cell phone, tablet, or other electronic device that receives (e.g. using an antenna) a signal over the air that includes an encoded video stream, and performs at least one of the embodiments described.
  • a server, camera, cell phone, tablet or other electronic device that transmits (e.g. using an antenna) a signal over the air that includes an encoded video stream, and performs at least one of the embodiments described.
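The encoder-side steps 904 to 907 (Fig. 9) and the decoder-side steps 1000 to 1005 (Fig. 10) listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, motion vectors are plain integer pairs, and `toy_is_valid` is only a stand-in for the criterion of Eq.1 to Eq.3, which is not reproduced here. The stand-in declares a predictor invalid when another predictor of the list would have produced a strictly smaller difference for the same reconstructed vector.

```python
from typing import Callable, List, Optional, Tuple

Vec = Tuple[int, int]


def add(a: Vec, b: Vec) -> Vec:
    return (a[0] + b[0], a[1] + b[1])


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1])


def cost(v: Vec) -> int:
    # ASSUMPTION: the size of a motion vector difference is measured with the L1 norm.
    return abs(v[0]) + abs(v[1])


def toy_is_valid(k: int, mvd: Vec, amvp: List[Vec]) -> bool:
    # ASSUMPTION: placeholder for Eq.1 to Eq.3. Predictor k is valid when no other
    # predictor lies strictly closer to the vector that amvp[k] + mvd would rebuild.
    mv = add(amvp[k], mvd)
    return all(cost(sub(mv, p)) >= cost(mvd)
               for j, p in enumerate(amvp) if j != k)


def encode_motion(mv: Vec, amvp: List[Vec],
                  is_valid: Callable = toy_is_valid) -> Tuple[Vec, Optional[int]]:
    """Return (mvd, idx_to_write); idx_to_write is None when idx is omitted."""
    # Select the predictor minimizing the motion vector difference.
    idx = min(range(len(amvp)), key=lambda k: cost(sub(mv, amvp[k])))
    mvd = sub(mv, amvp[idx])
    # Steps 904 and 905: test the non-selected predictor (two-entry list) against mvd.
    if is_valid(1 - idx, mvd, amvp):
        return mvd, idx   # step 906: idx is written to the video data
    return mvd, None      # step 907 reached directly: idx is omitted


def decode_motion(mvd: Vec, amvp: List[Vec], read_idx: Callable[[], int],
                  is_valid: Callable = toy_is_valid) -> Tuple[int, Vec]:
    # Steps 1001 and 1002: validity check of every predictor against the decoded mvd.
    valid = [is_valid(k, mvd, amvp) for k in range(len(amvp))]
    if all(valid):
        idx = read_idx()          # step 1003: idx is parsed from the video data
    else:
        idx = valid.index(True)   # step 1004: idx is inferred
    return idx, add(amvp[idx], mvd)   # step 1005: MV = MVP_idx + MVd
```

With the list `[(0, 0), (10, 0)]` and a current motion vector `(-6, 0)`, the encoder selects the first predictor; the second predictor combined with the same difference would rebuild `(4, 0)`, which the stand-in criterion rejects, so no index is written and the decoder infers it.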

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method comprising: decoding (1000) a motion information difference for a current block from video data; applying (1001, 1002) a validity checking process comprising determining from a criterion calculated from the motion information difference and motion information predictors of a list of motion information predictors of the current block that a motion information predictor of the list and the motion information difference form an invalid combination; inferring (1004) a value representing the motion information predictor of the list to apply to predict a motion information of the current block responsive to the motion information difference and one motion information predictor of the list form an invalid combination; decoding (1003) the value representing the motion information predictor of the list to apply to predict the motion information of the current block from the video data responsive to the combination of each motion information predictor of the list and the motion information is valid; and, reconstructing (1005) the motion information of the current block as a sum of the motion information difference and the motion information predictor of the list identified by the value.

Description

2023PF00408

INFERRING MOTION VECTOR PREDICTOR BASED ON MOTION VECTOR DIFFERENCE

1. CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority to European Application No. 23305766.0, filed May 15, 2023, which is incorporated herein by reference in its entirety.

2. TECHNICAL FIELD

At least one of the present embodiments generally relates to a method and a device for improving encoding of motion information in video compression methods.

3. BACKGROUND

To achieve high compression efficiency, video coding schemes usually employ predictions and transforms to leverage spatial and temporal redundancies in a video content. During an encoding, pictures of the video content are divided into blocks of samples (i.e., pixels), these blocks being then partitioned into one or more sub-blocks, called original sub-blocks in the following. An intra or inter prediction is then applied to each sub-block to exploit intra or inter image correlations. Whatever the prediction method used (intra or inter), a predictor sub-block is determined for each original sub-block. Then, a sub-block representing a difference between the original sub-block and the predictor sub-block, often denoted as a prediction error sub-block, a prediction residual sub-block or simply a residual sub-block, is transformed, quantized and entropy coded to generate an encoded video stream. To reconstruct the video, the compressed data is decoded by inverse processes corresponding to the transform, quantization and entropic coding.
Inter prediction consists in predicting a current block of a current picture from at least one predictor block from a reference picture preceding or following the current picture. A predictor block is identified in the reference picture by motion information. As the video compression methods evolve, the coding of the motion information has been greatly improved. In particular, motion information is now predicted using methods such as Adaptive Motion Vector Prediction (AMVP) or merge.
Basically, these methods comprise a construction of a list of motion vector predictors (MVPs), a selection of a best MVP in the list and an encoding of the motion vector of the current block in the form of an index representing the selected MVP along with, in some cases, a motion vector difference between the selected MVP and the actual motion vector of the current block.
It appears that the representation of a motion vector in the form of an index of an MVP along with a motion vector difference carries some residual information on some characteristics of the motion. The existence of such residual motion information could be considered as indicating that the representation of the motion vector is sub-optimal and could still be improved.
It is desirable to propose solutions allowing to overcome the above issues. In particular, it is desirable to propose solutions improving the coding of the motion information.

4. BRIEF SUMMARY

In a first aspect, one or more of the present embodiments provide a method comprising:
decoding a motion information difference for a current block from video data;
applying a validity checking process comprising determining from a criterion calculated from the motion information difference and motion information predictors of a list of motion information predictors of the current block that a motion information predictor of the list and the motion information difference form an invalid combination;
inferring a value representing the motion information predictor of the list to apply to predict a motion information of the current block responsive to the motion information difference and one motion information predictor of the list form an invalid combination;
decoding the value representing the motion information predictor of the list to apply to predict the motion information of the current block from the video data responsive to the combination of each motion information predictor of the list and the motion information difference is valid; and,
reconstructing the motion information of the current block as a sum of the motion information difference and the motion information predictor of the list identified by the value.
In a second aspect, one or more of the present embodiments provide a method comprising:
obtaining a first motion information predictor of a list of motion information predictors of a current block of a picture for predicting motion information of the current block;
calculating a motion information difference between the motion information of the current block and the first motion information predictor;
encoding the motion information difference in video data;
applying a validity checking process comprising determining from a criterion calculated from the motion information difference and the motion information predictors of the list that a second motion information predictor of the list and the motion information difference form a valid combination; and,
encoding a value representing the first motion information predictor only responsive to the second motion information predictor and the motion information difference form a valid combination.
In an embodiment of the first or the second aspect, the list is a list of AMVP motion information predictors or a list of motion information predictors according to the MMVD mode.
In an embodiment of the first or the second aspect, responsive to the list is a list of AMVP motion information predictors, the method is applied successively to each reference picture used for a temporal prediction of the current block.
In an embodiment of the first or the second aspect, responsive to the list is a list of motion information predictors according to the MMVD mode, the method is applied to the two first motion information predictors of the list.
In an embodiment of the first or the second aspect, the motion information predictors of the list are derived from a coding tree unit or a virtual processing decoding unit different respectively from the coding tree unit or the virtual processing decoding unit comprising the current block.
In an embodiment of the first or the second aspect, responsive to the combination of each motion information predictor of the list and the motion information difference is valid, the value representing the motion information predictor of the list to apply to predict the motion information of the current block is a parity bit of a syntax element representing the motion information difference.
In a third aspect, one or more of the present embodiments provide a device comprising electronic circuitry configured for applying a process comprising:
decoding a motion information difference for a current block from video data;
applying a validity checking process comprising determining from a criterion calculated from the motion information difference and motion information predictors of a list of motion information predictors of the current block that a motion information predictor of the list and the motion information difference form an invalid combination;
inferring a value representing the motion information predictor of the list to apply to predict a motion information of the current block responsive to the motion information difference and one motion information predictor of the list form an invalid combination;
decoding the value representing the motion information predictor of the list to apply to predict the motion information of the current block from the video data responsive to the combination of each motion information predictor of the list and the motion information difference is valid; and,
reconstructing the motion information of the current block as a sum of the motion information difference and the motion information predictor of the list identified by the value.
In a fourth aspect, one or more of the present embodiments provide a device comprising electronic circuitry configured for applying a process comprising:
obtaining a first motion information predictor of a list of motion information predictors of a current block of a picture for predicting motion information of the current block;
calculating a motion information difference between the motion information of the current block and the first motion information predictor;
encoding the motion information difference in video data;
applying a validity checking process comprising determining from a criterion calculated from the motion information difference and the motion information predictors of the list that a second motion information predictor of the list and the motion information difference form a valid combination; and,
encoding a value representing the first motion information predictor only responsive to the second motion information predictor and the motion information difference form a valid combination.
In an embodiment of the third or the fourth aspect, the list is a list of AMVP motion information predictors or a list of motion information predictors according to the MMVD mode.
In an embodiment of the third or the fourth aspect, responsive to the list is a list of AMVP motion information predictors, the electronic circuitry is configured to apply the process successively to each reference picture used for a temporal prediction of the current block.
In an embodiment of the third or the fourth aspect, responsive to the list is a list of motion information predictors according to the MMVD mode, the electronic circuitry is configured to apply the process to the two first motion information predictors of the list.
In an embodiment of the third or the fourth aspect, the motion information predictors of the list are derived from a coding tree unit or a virtual processing decoding unit different respectively from a coding tree unit or a virtual processing decoding unit comprising the current block.
In an embodiment of the third or the fourth aspect, responsive to the combination of each motion information predictor of the list and the motion information difference is valid, the value representing the motion information predictor of the list to apply to predict the motion information of the current block is a parity bit of a syntax element representing the motion information difference.
In a fifth aspect, one or more of the present embodiments provide a computer program comprising program code instructions for implementing the method according to the first or the second aspect.
In a sixth aspect, one or more of the present embodiments provide a non-transitory information storage medium storing program code instructions for implementing the method according to the first or the second aspect.

5. BRIEF SUMMARY OF THE DRAWINGS

Fig. 1 illustrates an example of context in which various embodiments may be implemented;
Fig. 2 illustrates schematically an example of partitioning undergone by a picture of pixels of an original video;
Fig. 3 depicts schematically a method for encoding a video stream;
Fig. 4 depicts schematically a method for decoding an encoded video stream;
Fig. 5A illustrates schematically an example of hardware architecture of a processing module able to implement an encoding module or a decoding module in which various aspects and embodiments are implemented;
Fig. 5B illustrates a block diagram of an example of a first system in which various aspects and embodiments are implemented;
Fig. 5C illustrates a block diagram of an example of a second system in which various aspects and embodiments are implemented;
Figs. 6A and 6B illustrate schematically spatial and temporal positions considered for constructing a list of merge candidates;
Figs. 7A and 7B illustrate motion coding configurations in AMVP mode;
Fig. 8 illustrates an example of motion vector reconstruction process of an embodiment;
Fig. 9 illustrates an embodiment of an encoding algorithm; and,
Fig. 10 illustrates an embodiment of a decoding algorithm.

6. DETAILED DESCRIPTION

The following examples of embodiments are described in the context of a video format similar to VVC (ISO/IEC 23090-3 – MPEG-I: Versatile Video Coding (VVC) / ITU-T H.266). However, these embodiments are not limited to the video coding/decoding method corresponding to VVC. These embodiments are in particular adapted to various video formats comprising for example HEVC (ISO/IEC 23008-2 – MPEG-H Part 2, High Efficiency Video Coding / ITU-T H.265), AVC (ISO/IEC 14496-10), EVC (Essential Video Coding/MPEG-5), AV1, AV2 and VP9.
Fig. 1 describes an example of a context in which the following embodiments can be implemented.
In Fig. 1, a system 11, that could be a camera, a storage device, a computer, a server or any device capable of delivering a video stream, transmits a video stream to a system 13 using a communication channel 12. The video stream is either encoded and transmitted by the system 11 or received and/or stored by the system 11 and then transmitted. The communication channel 12 is a wired (for example Internet or Ethernet) or a wireless (for example WiFi, 3G, 4G or 5G) network link.
The system 13, that could be for example a set top box, receives and decodes the video stream to generate a sequence of decoded pictures.
The obtained sequence of decoded pictures is then transmitted to a display system 15 using a communication channel 14, that could be a wired or wireless network. The display system 15 then displays said pictures.
In an embodiment, the system 13 is comprised in the display system 15. In that case, the system 13 and display 15 are comprised in a TV, a computer, a tablet, a smartphone, a head-mounted display, etc.
Figs. 2, 3 and 4 introduce an example of video format.
Fig. 2 illustrates an example of partitioning undergone by a picture of pixels 21 of an original video sequence 20. It is considered here that a pixel is composed of three components: a luminance component and two chrominance components. Other types of pixels are however possible comprising fewer or more components such as only a luminance component or an additional depth component or transparency component.
A picture is divided into a plurality of coding entities. First, as represented by reference 23 in Fig. 2, a picture is divided into a grid of blocks called coding tree units (CTU). A CTU consists of an $N \times N$ block of luminance samples together with two corresponding blocks of chrominance samples. N is generally a power of two having a maximum value of “128” for example. Second, a picture is divided into one or more groups of CTU. For example, it can be divided into one or more tile rows and tile columns, a tile being a sequence of CTU covering a rectangular region of a picture. In some cases, a tile could be divided into one or more bricks, each of which consisting of at least one row of CTU within the tile. Above the concept of tiles and bricks, another encoding entity, called slice, exists, that can contain at least one tile of a picture or at least one brick of a tile.
In the example in Fig. 2, as represented by reference 22, the picture 21 is divided into three slices S1, S2 and S3 of the raster-scan slice mode, each comprising a plurality of tiles (not represented), each tile comprising only one brick.
As represented by reference 24 in Fig. 2, a CTU may be partitioned into the form of a hierarchical tree of one or more sub-blocks called coding units (CU). The CTU is the root (i.e., the parent node) of the hierarchical tree and can be partitioned in a plurality of CU (i.e., child nodes). Each CU becomes a leaf of the hierarchical tree if it is not further partitioned in smaller CU or becomes a parent node of smaller CU (i.e., child nodes) if it is further partitioned.
In the example of Fig. 2, the CTU 24 is first partitioned in “4” square CU using a quadtree type partitioning. The upper left CU is a leaf of the hierarchical tree since it is not further partitioned, i.e., it is not a parent node of any other CU. The upper right CU is further partitioned in “4” smaller square CU using again a quadtree type partitioning. The bottom right CU is vertically partitioned in “2” rectangular CU using a binary tree type partitioning. The bottom left CU is vertically partitioned in “3” rectangular CU using a ternary tree type partitioning.
During the coding of a picture, the partitioning is adaptive, each CTU being partitioned so as to optimize a compression efficiency criterion of the CTU.
In HEVC appeared the concept of prediction unit (PU) and transform unit (TU). Indeed, in HEVC, the coding entity that is used for prediction (i.e., a PU) and transform (i.e., a TU) can be a subdivision of a CU. For example, as represented in Fig. 1, a CU of size $2N \times 2N$ can be divided in PU 2411 of size $N \times 2N$ or of size $2N \times N$. In addition, said CU can be divided in “4” TU 2412 of size $N \times N$ or in “16” TU of size $\frac{N}{2} \times \frac{N}{2}$.
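The tree partitioning of the CTU 24 described above for Fig. 2 can be sketched as follows. This is a simplified illustration, not a codec implementation: the function name, the `(x, y, width, height)` block representation and the mode labels are hypothetical, and the 1:2:1 width ratio used for the ternary split is an assumption borrowed from VVC rather than something stated in this description.

```python
from typing import List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height) in luma samples


def split(block: Block, mode: str) -> List[Block]:
    """Split one block into its child blocks for a given partitioning type."""
    x, y, w, h = block
    if mode == "quad":
        hw, hh = w // 2, h // 2
        # Children in order: upper left, upper right, bottom left, bottom right.
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "binary_vertical":
        # A vertical split line gives two side-by-side rectangles.
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "ternary_vertical":
        # ASSUMPTION: widths follow a 1:2:1 ratio.
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    raise ValueError(f"unknown split mode: {mode}")


# Partitioning of Fig. 2 applied to a 128x128 CTU.
upper_left, upper_right, bottom_left, bottom_right = split((0, 0, 128, 128), "quad")
leaves = ([upper_left]
          + split(upper_right, "quad")
          + split(bottom_left, "ternary_vertical")
          + split(bottom_right, "binary_vertical"))
```

The resulting `leaves` list holds the ten CU of the example: one unsplit square, four smaller squares, three rectangles from the ternary split and two from the binary split.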
One can note that in VVC, except in some particular cases, frontiers of the TU and PU are aligned on the frontiers of the CU. Consequently, a CU comprises generally one TU and one PU.
In the present application, the term “block” or “picture block” can be used to refer to any one of a CTU, a CU, a PU and a TU. In addition, the term “block” or “picture block” can be used to refer to a macroblock, a partition and a sub-block as specified in H.264/AVC or in other video coding standards, and more generally to refer to an array of samples of numerous sizes.
In the present application, the terms “reconstructed” and “decoded” may be used interchangeably, the terms “pixel” and “sample” may be used interchangeably, the terms “image”, “picture”, “sub-picture”, “slice” and “frame” may be used interchangeably. Usually, but not necessarily, the term “reconstructed” is used at the encoder side while “decoded” is used at the decoder side.
Fig. 3 depicts schematically a method for encoding a video stream executed by an encoding module. Variations of this method for encoding are contemplated, but the method for encoding of Fig. 3 is described below for purposes of clarity without describing all expected variations.
Before being encoded, a current original picture of an original video sequence may go through a pre-processing. For example, in a step 301, a color transform is applied to the current original picture (e.g., conversion from RGB 4:4:4 to YCbCr 4:2:0), or a remapping is applied to the current original picture components in order to get a signal distribution more resilient to compression (for instance using a histogram equalization of one of the color components). Pictures obtained by pre-processing are called pre-processed pictures in the following.
The encoding of a pre-processed picture begins with a partitioning of the pre-processed picture during a step 302, as described in relation to Fig. 2.
The pre-processed picture is thus partitioned into CTU, CU, PU, TU, etc. For each block, the encoding module determines a coding mode between an intra prediction and an inter prediction.
The intra prediction consists of predicting, in accordance with an intra prediction method, during a step 303, the pixels of a current block from a prediction block derived from pixels of reconstructed blocks situated in a causal vicinity of the current block to be coded. The result of the intra prediction is a prediction mode indicating which pixels of the blocks in the vicinity to use, and a residual block resulting from a calculation of a difference between the current block and the prediction block.
The inter prediction consists of predicting the pixels of a current block from a block of pixels, referred to as the reference block, of a picture preceding or following the current picture, this picture being referred to as the reference picture. During the coding of a current block in accordance with the inter prediction method, a block of the reference picture closest, in accordance with a similarity criterion, to the current block is determined by a motion estimation step 304. During step 304, a motion vector indicating the position of the reference block in the reference picture is determined. Said motion vector is used during a motion compensation step 305, involving interpolation operations between samples of the reference block, to generate a prediction block. A residual block is then calculated in the form of a difference between the current block and the prediction block. In first video compression standards, the uni-prediction inter mode described above was the only inter mode available. As video compression standards evolve, the family of inter modes has grown significantly and comprises now many different inter modes, such as bi-prediction modes in which a current block is predicted from two reference blocks designated by two different motion information.
During a selection step 306, the prediction mode optimising the compression performances, in accordance with a rate/distortion optimization criterion (i.e., RDO criterion), among the prediction modes tested (Intra prediction modes, Inter prediction modes), is selected by the encoding module.
When the prediction mode is selected, the residual block is transformed during a step 307. In some implementations, a plurality of types of transforms can be applied to a transformed residual block. Indeed, in addition to DCT-II, a Multiple Transform Selection (MTS) scheme is used for both inter and intra predicted blocks. It uses multiple selected transforms from the DCT-VIII/DST-VII. The transformed block is then quantized during a step 309.
Note that the encoding module can skip the transform and apply quantization directly to the non-transformed residual signal. The quantized residual block determined for the current block during an inter or intra prediction is encoded by an entropic encoder during a step 310.
Note that the encoding module can bypass both transform and quantization, i.e., the entropic encoding is applied on the residual without the application of the transform or quantization processes. The result of the entropic coding is inserted in the video data 311.
When the current block is coded according to an intra prediction mode, the intra prediction mode is encoded by the entropic encoder during the step 310 in the video data 311.
When the current block is encoded according to an inter prediction, a process is applied to encode the motion information. The output of this process is then encoded by the entropic encoder during the step 310 in the video data 311. Two processes are employed to encode the motion information: AMVP (Adaptive Motion Vector Prediction) or Merge. In each process, the motion information is predicted.
In an implementation of the AMVP mode, a motion vector predictor (MVP) is selected, and a motion vector difference noted MVd relative to the selected MVP is computed. The MVP is selected in a list of AMVP candidates made of “2” candidates. The index of the chosen MVP and the MVd are then encoded by the entropic encoder during step 310 along with the transformed and quantized residual block resulting from the inter prediction of the current block.
The AMVP candidate list is constructed first by deriving a first spatial candidate from a left block neighbouring the current block, if this block is available and inter coded. Then a second spatial candidate is derived from a top block neighbouring the current block, if this block is available and inter coded. Then, a temporal candidate is derived from a so-called collocated picture at a position collocated with the current block, if an inter block exists at this collocated position. Each derived MVP candidate is scaled according to a temporal distance between the reference picture associated to this MVP candidate and the reference picture considered for the current block. A redundancy check is then conducted between derived spatial candidates and, if a duplicate candidate exists, this candidate is discarded. The final AMVP candidate list contains the two first derived MVP candidates. If fewer than “2” MVP candidates are obtained through the above process, then the AMVP candidate list is completed with zero motion vectors.
The merge mode consists in deriving motion information of a current block from a selected motion information predictor candidate. The motion information considered here includes all the inter prediction parameters of a block, that is to say: the uni-prediction or bi-prediction type, the reference picture index within each reference picture list and the motion vector(s).
The selected motion information predictor candidate (i.e., the merge candidate) is selected in a list of motion information predictor candidates (i.e., in a list of merge candidates). When a block is encoded in merge mode, the index of the selected merge candidate is encoded. If no residual block is encoded for the current block, the current block is considered as encoded according to a particular merge mode called skip mode. In some implementations, the list of merge candidates is systematically made of “5” merge candidates. Up to “5” spatial positions are considered to retrieve some potential candidates for the list of merge candidates. Fig. 6A illustrates schematically the five spatial positions considered for constructing a list of merge candidates. These positions are investigated according to the following order:
1. Left (A1)
2. Above (B1)
3. Above right (B0)
4. Left bottom (A0)
5. Above left (B2)
Each spatial candidate is introduced in the list of merge candidates provided that the motion information corresponding to this candidate is not already present in the list of merge candidates. Then a temporal predictor noted TMVP is determined. Fig. 6B illustrates schematically collocated positions considered for determining the TMVP. The determination of the TMVP consists first in investigating position H and, if no motion information is available at position H, the position C is investigated. A last pruning process is then applied to ensure that the set of spatial and temporal candidates does not contain redundant candidates. In the case of a B-slice (slice allowing bi-predicted blocks), candidates of another type, called combined candidates, are introduced in the list of merge candidates if this list is not full. Finally, if the merge list is still not full, then zero motion vectors are introduced at the end of the merge list until it is full.
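The merge list construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: motion information is modelled as a hashable `(motion vector, reference index)` tuple, the combined candidates of B-slices are left out, and the zero candidates all use reference index 0. The names `build_merge_list` and `ZERO_CANDIDATE` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Motion = Tuple[Tuple[int, int], int]   # ((mv_x, mv_y), reference picture index)
ZERO_CANDIDATE: Motion = ((0, 0), 0)   # assumption: zero vector on reference index 0


def build_merge_list(spatial: Sequence[Optional[Motion]],
                     temporal_h: Optional[Motion],
                     temporal_c: Optional[Motion],
                     max_cands: int = 5) -> List[Motion]:
    """Build a merge list of exactly `max_cands` candidates.

    `spatial` is given in the investigation order A1, B1, B0, A0, B2;
    an entry is None when no motion information is available there.
    """
    merge: List[Motion] = []
    # Spatial candidates, skipping motion information already in the list.
    for cand in spatial:
        if cand is not None and cand not in merge and len(merge) < max_cands:
            merge.append(cand)
    # Temporal predictor: position H first, position C as a fallback.
    tmvp = temporal_h if temporal_h is not None else temporal_c
    # Pruning: the temporal candidate is added only if it is not redundant.
    if tmvp is not None and tmvp not in merge and len(merge) < max_cands:
        merge.append(tmvp)
    # (Combined candidates for B-slices are omitted in this sketch.)
    # Fill the remaining positions with zero motion vectors.
    while len(merge) < max_cands:
        merge.append(ZERO_CANDIDATE)
    return merge
```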
Recently, the representation of the motion information has slightly evolved with the emergence of two main categories of motion representation: the whole-block-based motion representation and the sub-block-based motion representation.

The whole-block-based motion representation consists in assigning one set of motion information, made of one or two motion vectors and associated reference picture(s), to an inter block. Thus, the motion information of that block is represented in the form of one or two motion vectors for the whole block and a reference picture associated with each motion vector. The sub-block-based motion coding mode typically consists in dividing a block into subblocks of 4x4 or 8x8 luma samples and assigning an individual set of motion information (one or two couples of a motion vector and a reference picture) to each subblock.

While in previous implementations of AMVP only spatial candidates, temporal candidates and the zero-motion vector candidate were considered for constructing the list of AMVP candidates, a new category of candidates, called HMVP (History-Based Motion Vector Prediction) candidates, was added in AMVP implementations adapted to the whole-block-based motion representation. A principle of HMVP candidates is to use previously coded motion vectors as MVPs. These motion vectors are associated with adjacent or non-adjacent blocks relative to a current block. To do so, a table of HMVP candidates (i.e., HMVP table) is maintained and updated on the fly, as a first-in-first-out (FIFO) buffer of MVPs. There are up to five candidates in the HMVP table. After coding one inter predicted block, provided that this block is not in sub-block mode (including affine mode) or GPM (geometric partition mode), the HMVP table is updated by appending the motion information of this inter predicted block to the end of the HMVP table as a new HMVP candidate. In addition to the usual FIFO rule, a mechanism to remove redundant HMVP candidates is applied.
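The HMVP table update described above can be sketched as a small FIFO buffer with redundancy removal. The class name `HmvpTable`, its methods and the representation of motion information as a hashable tuple are assumptions made for this illustration.

```python
from typing import List, Tuple

Motion = Tuple[Tuple[int, int], int]   # ((mv_x, mv_y), reference picture index)


class HmvpTable:
    """FIFO table of previously coded motion information (up to `max_size` entries)."""

    def __init__(self, max_size: int = 5) -> None:
        self.max_size = max_size
        self.entries: List[Motion] = []   # oldest entry first, newest last

    def update(self, motion: Motion) -> None:
        """Append the motion information of a block that has just been coded."""
        if motion in self.entries:
            # Redundancy removal: drop the identical older entry.
            self.entries.remove(motion)
        elif len(self.entries) == self.max_size:
            # Usual FIFO rule: drop the oldest entry when the table is full.
            self.entries.pop(0)
        self.entries.append(motion)

    def reset(self) -> None:
        """Empty the table, e.g. when a new row of coding tree units starts."""
        self.entries.clear()
```

In this sketch the caller is expected to skip `update` for blocks coded in a sub-block mode or in GPM, as stated in the text.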
One can note that the HMVP table is reset at each CTU row to enable parallel processing.

In recent implementations of the merge mode adapted to the whole-block-based representation, the list of merge candidates was modified and three new merge modes were introduced. The list of merge candidates is constructed with the following types of candidates:
1. Spatial candidates.
2. Temporal candidates.
3. HMVP candidates. Several HMVP candidates are inserted into the list of merge candidates so that the list reaches a maximum allowed number of merge candidates minus 1.
4. Pairwise average candidates. Up to one pairwise average candidate is added to the list of merge candidates. Pairwise candidates are computed as follows: the first two merge candidates present in the list of merge candidates are considered and their motion vectors are averaged. This averaging is computed separately for each reference picture list. If both of the first two merge candidates are bi-predicted, motion vectors related to both lists L0 and L1 are averaged. If only one motion vector is present for a given list, it is taken as is to form the pairwise candidate.
5. Zero motion vector candidate.

The three new merge modes comprise MMVD (Merge Mode with Motion Vector Difference), GPM (Geometric Partitioning Mode) and CIIP (Combined Intra/Inter Prediction). These new modes are detailed in document JVET-T2002-v2: Algorithm description for Versatile Video Coding and Test Model 11 (VTM 11), Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 20th Meeting, by teleconference, 7 – 16 October 2020, Jianle Chen, Yan Ye, Seung Hwan Kim.

MMVD can be viewed as a kind of merge mode in which a merge candidate is refined by an MVd. In MMVD, after a merge candidate is selected, it is further refined by signalled MVd information. The signaling of an MMVD mode comprises a merge candidate flag, an index to specify a motion magnitude and an index indicating a motion direction.
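The pairwise average candidate described in the list of merge candidate types above can be sketched as follows. The dictionary layout (one optional motion vector per reference picture list), the function name `pairwise_average` and the use of floor division for the average are assumptions of this sketch; reference picture handling and the exact rounding rule are deliberately left out.

```python
from typing import Dict, Optional, Tuple

MV = Tuple[int, int]
Candidate = Dict[str, Optional[MV]]   # {"L0": mv or None, "L1": mv or None}


def pairwise_average(c0: Candidate, c1: Candidate) -> Candidate:
    """Average the first two merge candidates, separately for L0 and L1."""
    out: Candidate = {}
    for lst in ("L0", "L1"):
        mv0, mv1 = c0.get(lst), c1.get(lst)
        if mv0 is not None and mv1 is not None:
            # Both candidates have a vector in this list: average them
            # (floor division is a simplification of the rounding rule).
            out[lst] = ((mv0[0] + mv1[0]) // 2, (mv0[1] + mv1[1]) // 2)
        elif mv0 is not None:
            out[lst] = mv0     # only one vector present: taken as is
        elif mv1 is not None:
            out[lst] = mv1
        else:
            out[lst] = None    # no vector for this list
    return out
```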
The merge candidate flag is signalled to specify which one is used between the first and second merge candidates. The index specifying a motion magnitude and the index indicating a motion direction allow signaling a limited number of motion vector differences (MVd) on top of a signaled merge candidate, i.e., “4” vector directions and “8” magnitude values.

Fig. 7A illustrates a motion coding configuration in AMVP mode. A block to predict is shown together with its two AMVP motion vector prediction candidates $\overrightarrow{MVP_0}$ and $\overrightarrow{MVP_1}$, and a motion vector difference $\overrightarrow{MV_d}$ to signal. It is established that, in existing implementations of AMVP, the coding cost of a motion vector difference is highly correlated to the magnitude of the motion vector difference and increases as a function of the magnitude of $\overrightarrow{MV_d}$. Fig. 7B shows that the coding configuration of Fig. 7A can also be reached by using the motion vector predictor $\overrightarrow{MVP_1}$ instead of $\overrightarrow{MVP_0}$, leading to a smaller motion vector difference magnitude. In the following embodiments, it is shown that the situation of Fig. 7A can be detected as non-optimal by a decoding module. Indeed, in these embodiments, the decoder is able to detect that the motion vector predictor $\overrightarrow{MVP_0}$ is not the best one to employ for pointing to the spatial position corresponding to $\overrightarrow{MVP_0} + \overrightarrow{MV_d}$, as shown in Fig. 7B. The decoder is able to detect that the use of motion vector predictor $\overrightarrow{MVP_1}$ is the most likely one, compared to $\overrightarrow{MVP_0}$. Therefore, it can be considered that in the configuration of Fig.
7A, the signaling of the motion vector predictor using an index of the chosen motion vector predictor and the motion vector difference carries some redundant (or residual) information that can be exploited.

Metadata such as SEI (supplemental enhancement information) messages can be attached to the encoded video stream 311. An SEI message as defined for example in standards such as AVC, HEVC or VVC is a data container associated with a video stream and comprising metadata providing information relative to the video stream.

After the quantization step 309, the current block is reconstructed so that the pixels corresponding to that block can be used for future predictions. This reconstruction phase is also referred to as a prediction loop. An inverse quantization is therefore applied to the transformed and quantized residual block during a step 312 and an inverse transformation is applied during a step 313. According to the prediction mode used for the block obtained during a step 314, the prediction block of the block is reconstructed. If the current block is encoded according to an inter prediction mode, the encoding module applies, when appropriate, during a step 316, a motion compensation using the motion information of the current block in order to identify each reference block of the current block. If the current block is encoded according to an intra prediction mode, during a step 315, the intra prediction mode is used for reconstructing the prediction block of the current block. The prediction block and the reconstructed residual block are added in order to obtain the reconstructed current block. Following the reconstruction, an in-loop filtering intended to reduce the encoding artefacts is applied, during a step 317, to the reconstructed block.
This filtering is called in-loop filtering since it occurs in the prediction loop, so that the decoder obtains the same reference pictures as the encoder, which avoids a drift between the encoding and the decoding processes. In-loop filtering tools comprise deblocking filtering, SAO (Sample Adaptive Offset) and ALF (Adaptive Loop Filtering). When a block is reconstructed, it is inserted during a step 318 into a reconstructed picture stored in a memory 319 of reconstructed pictures generally called Decoded Picture Buffer (DPB). The reconstructed pictures thus stored can then serve as reference pictures for other pictures to be coded.

Fig. 4 depicts schematically a method, executed by a decoding module, for decoding the encoded video stream 311 encoded according to the method described in relation to Fig. 3. Variations of this method for decoding are contemplated, but the method for decoding of Fig. 4 is described below for purposes of clarity without describing all expected variations. The decoding is done block by block. For a current block, it starts with an entropic decoding of the current block during a step 410. Entropic decoding makes it possible to obtain, at least, the prediction mode of the block. If the block has been encoded according to an inter prediction mode, the entropic decoding makes it possible to obtain, when appropriate, information representative of a motion of the current block and a residual block. During a step 408, the motion information is reconstructed for the current block using the decoded information representative of the motion information. In AMVP, the information representative of the motion comprises an index of the AMVP MVP in the AMVP list and the MVd. The MVd is then added to the AMVP MVP to reconstruct the motion vector of the block. In merge, an index of the merge MVP in the merge list is obtained. The merge MVP corresponding to the index provides the motion information of the current block.
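The decoder-side reconstruction of motion information in AMVP and in merge can be sketched as below. The helper `cheapest_predictor` is not part of the decoding method described here; it only illustrates, with a crude cost proxy (sum of absolute components), the observation made in relation to Figs. 7A and 7B that the same motion vector can be reached from another predictor with a smaller MVd. All function names and the tuple representation are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple

MV = Tuple[int, int]


def add_mv(a: MV, b: MV) -> MV:
    return (a[0] + b[0], a[1] + b[1])


def sub_mv(a: MV, b: MV) -> MV:
    return (a[0] - b[0], a[1] - b[1])


def reconstruct_amvp(amvp_list: Sequence[MV], mvp_idx: int, mvd: MV) -> MV:
    """AMVP: the decoded MVd is added to the predictor designated by the index."""
    return add_mv(amvp_list[mvp_idx], mvd)


def reconstruct_merge(merge_list: List, merge_idx: int):
    """Merge: the designated candidate provides the motion information as is."""
    return merge_list[merge_idx]


def mvd_cost(mvd: MV) -> int:
    """Crude proxy for the coding cost of an MVd (assumption, not a bit count)."""
    return abs(mvd[0]) + abs(mvd[1])


def cheapest_predictor(amvp_list: Sequence[MV], mv: MV) -> int:
    """Index of the predictor that reaches `mv` with the smallest MVd cost."""
    return min(range(len(amvp_list)),
               key=lambda i: mvd_cost(sub_mv(mv, amvp_list[i])))
```

With predictors `(0, 0)` and `(10, 8)`, signalling index 0 and an MVd of `(9, 8)` reconstructs the vector `(9, 8)`, whereas the second predictor would reach the same vector with an MVd of `(-1, 0)`.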
In MMVD, the information representative of the motion of the current block comprises some MMVD indices representative of an MVd. The MVd corresponding to the MMVD indices is then added to the merge MVP to determine the motion information of the current block. If the block has been encoded according to an intra prediction mode, entropic decoding makes it possible to obtain the intra prediction mode and a residual block. Steps 412, 413, 414, 415, 416 and 417 implemented by the decoding module are in all respects identical respectively to steps 312, 313, 314, 315, 316 and 317 implemented by the encoding module. Decoded blocks are saved in decoded pictures and the decoded pictures are stored in a DPB 419 in a step 418. When the decoding module decodes a given picture, the pictures stored in the DPB 419 are identical to the pictures stored in the DPB 319 by the encoding module during the encoding of said given picture. The decoded picture can also be outputted by the decoding module, for instance to be displayed. The post-processing step 421 can comprise an inverse color transform (e.g. conversion from YCbCr 4:2:0 to RGB 4:4:4), an inverse mapping performing the inverse of the remapping process performed in the pre-processing of step 301 and a post-filtering for improving the reconstructed pictures based for example on filter parameters provided in an SEI message.

Fig. 5A illustrates schematically an example of hardware architecture of a processing module 500 able to implement an encoding module or a decoding module capable of implementing respectively a method for encoding of Fig. 3 and a method for decoding of Fig. 4 modified according to different aspects and embodiments. The encoding module is for example comprised in the system 11 when this apparatus is in charge of encoding the video stream. The decoding module is for example comprised in the system 13.
The processing module 500 comprises, connected by a communication bus 5005: a processor or CPU (central processing unit) 5000 encompassing one or more microprocessors, general purpose computers, special purpose computers, and processors based on a multi-core architecture, as non-limiting examples; a random access memory (RAM) 5001; a read only memory (ROM) 5002; a storage unit 5003, which can include non-volatile memory and/or volatile memory, including, but not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), flash, magnetic disk drive, and/or optical disk drive, or a storage medium reader, such as an SD (secure digital) card reader and/or a hard disc drive (HDD) and/or a network accessible storage device; at least one communication interface 5004 for exchanging data with other modules, devices or equipment. The communication interface 5004 can include, but is not limited to, a transceiver configured to transmit and to receive data over a communication channel. The communication interface 5004 can include, but is not limited to, a modem or network card. If the processing module 500 implements a decoding module, the communication interface 5004 enables for instance the processing module 500 to receive encoded video streams and to provide a sequence of decoded pictures. If the processing module 500 implements an encoding module, the communication interface 5004 enables for instance the processing module 500 to receive a sequence of original picture data to encode and to provide an encoded video stream. The processor 5000 is capable of executing instructions loaded into the RAM 5001 from the ROM 5002, from an external memory (not shown), from a storage medium, or from a communication network.
When the processing module 500 is powered up, the processor 5000 is capable of reading instructions from the RAM 5001 and executing them. These instructions form a computer program causing, for example, the implementation by the processor 5000 of a decoding method as described in relation to Fig. 4, an encoding method described in relation to Fig. 3, and methods described in relation to Figs. 6 or 7, these methods comprising various aspects and embodiments described below in this document. All or some of the algorithms and steps of the methods of Figs. 3, 4, 6 and 7 may be implemented in software form by the execution of a set of instructions by a programmable machine such as a DSP (digital signal processor) or a microcontroller, or be implemented in hardware form by a machine or a dedicated component such as an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit). As can be seen, microprocessors, general purpose computers, special purpose computers, processors based or not on a multi-core architecture, DSPs, microcontrollers, FPGAs and ASICs are electronic circuitry adapted to implement at least partially the methods of Figs. 3, 4, 6 and 7.

Fig. 5C illustrates a block diagram of an example of the system 13 in which various aspects and embodiments are implemented. The system 13 can be embodied as a device including the various components described below and is configured to perform one or more of the aspects and embodiments described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptop computers, smartphones, tablet computers, digital multimedia set top boxes, digital television receivers, personal video recording systems, connected home appliances and head mounted displays. Elements of system 13, singly or in combination, can be embodied in a single integrated circuit (IC), multiple ICs, and/or discrete components.
For example, in at least one embodiment, the system 13 comprises one processing module 500 that implements a decoding module. In various embodiments, the system 13 is communicatively coupled to one or more other systems, or other electronic devices, via, for example, a communications bus or through dedicated input and/or output ports. In various embodiments, the system 13 is configured to implement one or more of the aspects described in this document. The input to the processing module 500 can be provided through various input modules as indicated in block 531. Such input modules include, but are not limited to, (i) a radio frequency (RF) module that receives an RF signal transmitted, for example, over the air by a broadcaster, (ii) a component (COMP) input module (or a set of COMP input modules), (iii) a Universal Serial Bus (USB) input module, and/or (iv) a High Definition Multimedia Interface (HDMI) input module. Other examples, not shown in FIG.5D, include composite video. In various embodiments, the input modules of block 531 have associated respective input processing elements as known in the art. For example, the RF module can be associated with elements suitable for (i) selecting a desired frequency (also referred to as selecting a signal, or band-limiting a signal to a band of frequencies), (ii) down-converting the selected signal, (iii) band-limiting again to a narrower band of frequencies to select (for example) a signal frequency band which can be referred to as a channel in certain embodiments, (iv) demodulating the down-converted and band-limited signal, (v) performing error correction, and (vi) demultiplexing to select the desired stream of data packets. The RF module of various embodiments includes one or more elements to perform these functions, for example, frequency selectors, signal selectors, band-limiters, channel selectors, filters, downconverters, demodulators, error correctors, and demultiplexers.
The RF portion can include a tuner that performs various of these functions, including, for example, down-converting the received signal to a lower frequency (for example, an intermediate frequency or a near-baseband frequency) or to baseband. In one set-top box embodiment, the RF module and its associated input processing element receive an RF signal transmitted over a wired (for example, cable) medium, and perform frequency selection by filtering, down-converting, and filtering again to a desired frequency band. Various embodiments rearrange the order of the above-described (and other) elements, remove some of these elements, and/or add other elements performing similar or different functions. Adding elements can include inserting elements in between existing elements, such as, for example, inserting amplifiers and an analog-to-digital converter. In various embodiments, the RF module includes an antenna. Additionally, the USB and/or HDMI modules can include respective interface processors for connecting system 13 to other electronic devices across USB and/or HDMI connections. It is to be understood that various aspects of input processing, for example, Reed-Solomon error correction, can be implemented, for example, within a separate input processing IC or within the processing module 500 as necessary. Similarly, aspects of USB or HDMI interface processing can be implemented within separate interface ICs or within the processing module 500 as necessary. The demodulated, error corrected, and demultiplexed stream is provided to the processing module 500. Various elements of system 13 can be provided within an integrated housing. Within the integrated housing, the various elements can be interconnected and transmit data therebetween using suitable connection arrangements, for example, an internal bus as known in the art, including the Inter-IC (I2C) bus, wiring, and printed circuit boards.
For example, in the system 13, the processing module 500 is interconnected to other elements of said system 13 by the bus 5005. The communication interface 5004 of the processing module 500 allows the system 13 to communicate on the communication channel 12. As already mentioned above, the communication channel 12 can be implemented, for example, within a wired and/or a wireless medium. Data is streamed, or otherwise provided, to the system 13, in various embodiments, using a wireless network such as a Wi-Fi network, for example IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers). The Wi-Fi signal of these embodiments is received over the communications channel 12 and the communications interface 5004 which are adapted for Wi-Fi communications. The communications channel 12 of these embodiments is typically connected to an access point or router that provides access to external networks including the Internet for allowing streaming applications and other over-the-top communications. Other embodiments provide streamed data to the system 13 using the RF connection of the input block 531. As indicated above, various embodiments provide data in a non-streaming manner. Additionally, various embodiments use wireless networks other than Wi-Fi, for example a cellular network or a Bluetooth network. The system 13 can provide an output signal to various output devices, including the display system 15, speakers 56, and other peripheral devices 57. The display system 15 of various embodiments includes one or more of, for example, a touchscreen display, an organic light-emitting diode (OLED) display, a curved display, and/or a foldable display. The display system 15 can be for a television, a tablet, a laptop, a cell phone (mobile phone), a head mounted display or other devices. The display system 15 can also be integrated with other components (for example, as in a smart phone), or separate (for example, an external monitor for a laptop).
The other peripheral devices 57 include, in various examples of embodiments, one or more of a stand-alone digital video disc (or digital versatile disc) (DVD, for both terms), a disk player, a stereo system, and/or a lighting system. Various embodiments use one or more peripheral devices 57 that provide a function based on the output of the system 13. For example, a disk player performs the function of playing an output of the system 13. In various embodiments, control signals are communicated between the system 13 and the display system 15, speakers 56, or other peripheral devices 57 using signaling such as AV.Link, Consumer Electronics Control (CEC), or other communications protocols that enable device-to-device control with or without user intervention. The output devices can be communicatively coupled to system 13 via dedicated connections through respective interfaces 532, 533, and 534. Alternatively, the output devices can be connected to system 13 using the communications channel 12 via the communications interface 5004 or a dedicated communication channel via the communication interface 5004. The display system 15 and speakers 56 can be integrated in a single unit with the other components of system 13 in an electronic device such as, for example, a television. In various embodiments, the display interface 532 includes a display driver, such as, for example, a timing controller (T Con) chip. The display system 15 and speakers 56 can alternatively be separate from one or more of the other components. In various embodiments in which the display system 15 and speakers 56 are external components, the output signal can be provided via dedicated output connections, including, for example, HDMI ports, USB ports, or COMP outputs.

Fig. 5B illustrates a block diagram of an example of the system 11 in which various aspects and embodiments are implemented. System 11 is very similar to system 13.
The system 11 can be embodied as a device including the various components described below and is configured to perform one or more of the aspects and embodiments described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptop computers, smartphones, tablet computers, cameras and servers. Elements of system 11, singly or in combination, can be embodied in a single integrated circuit (IC), multiple ICs, and/or discrete components. For example, in at least one embodiment, the system 11 comprises one processing module 500 that implements an encoding module. In various embodiments, the system 11 is communicatively coupled to one or more other systems, or other electronic devices, via, for example, a communications bus or through dedicated input and/or output ports. In various embodiments, the system 11 is configured to implement one or more of the aspects described in this document. The input to the processing module 500 can be provided through various input modules as indicated in block 531 already described in relation to Fig. 5C. Various elements of system 11 can be provided within an integrated housing. Within the integrated housing, the various elements can be interconnected and transmit data therebetween using suitable connection arrangements, for example, an internal bus as known in the art, including the Inter-IC (I2C) bus, wiring, and printed circuit boards. For example, in the system 11, the processing module 500 is interconnected to other elements of said system 11 by the bus 5005. The communication interface 5004 of the processing module 500 allows the system 11 to communicate on the communication channel 12. Data is streamed, or otherwise provided, to the system 11, in various embodiments, using a wireless network such as a Wi-Fi network, for example IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers).
The Wi-Fi signal of these embodiments is received over the communications channel 12 and the communications interface 5004 which are adapted for Wi-Fi communications. The communications channel 12 of these embodiments is typically connected to an access point or router that provides access to external networks including the Internet for allowing streaming applications and other over-the-top communications. Other embodiments provide streamed data to the system 11 using the RF connection of the input block 531. As indicated above, various embodiments provide data in a non-streaming manner. Additionally, various embodiments use wireless networks other than Wi-Fi, for example a cellular network or a Bluetooth network. The data provided to the system 11 can be provided in different formats. In various embodiments, these data are encoded and compliant with a known video compression format such as AV1, VP9, VVC, HEVC, AVC, etc. In various embodiments, these data are raw data provided for example by a picture and/or audio acquisition module connected to the system 11 or comprised in the system 11. In that case, the processing module 500 takes charge of the encoding of these data. The system 11 can provide an output signal to various output devices capable of storing and/or decoding the output signal such as the system 13.

Various implementations involve decoding. “Decoding”, as used in this application, can encompass all or part of the processes performed, for example, on a received encoded video stream in order to produce a final output suitable for display. In various embodiments, such processes include one or more of the processes typically performed by a decoder, for example, entropy decoding, inverse quantization, inverse transformation, and prediction. In various embodiments, such processes also, or alternatively, include processes performed by a decoder of various implementations described in this application, for example, for reconstruction of motion information.
Whether the phrase “decoding process” is intended to refer specifically to a subset of operations or generally to the broader decoding process will be clear based on the context of the specific descriptions and is believed to be well understood by those skilled in the art.

Various implementations involve encoding. In an analogous way to the above discussion about “decoding”, “encoding” as used in this application can encompass all or part of the processes performed, for example, on an input video sequence in order to produce an encoded video stream. In various embodiments, such processes include one or more of the processes typically performed by an encoder, for example, partitioning, prediction, transformation, quantization, and entropy encoding. In various embodiments, such processes also, or alternatively, include processes performed by an encoder of various implementations described in this application, for example, for predicting motion information. Whether the phrase “encoding process” is intended to refer specifically to a subset of operations or generally to the broader encoding process will be clear based on the context of the specific descriptions and is believed to be well understood by those skilled in the art.

Note that the syntax element names as used herein are descriptive terms. As such, they do not preclude the use of other syntax element names. When a figure is presented as a flow diagram, it should be understood that it also provides a block diagram of a corresponding apparatus. Similarly, when a figure is presented as a block diagram, it should be understood that it also provides a flow diagram of a corresponding method/process.

Various embodiments refer to rate distortion optimization. In particular, during the encoding process, the balance or trade-off between a rate and a distortion is usually considered.
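A common way to express this trade-off, assumed here purely for illustration, is a Lagrangian cost J = D + λ·R, where D is the distortion, R the rate and λ a weighting factor. The sketch below selects, among tested coding options, the one with the smallest cost; the names `rd_cost` and `select_mode`, the tuple layout and the numerical values in the usage note are hypothetical.

```python
from typing import Sequence, Tuple

Option = Tuple[str, float, float]   # (option name, distortion D, rate R)


def rd_cost(distortion: float, rate: float, lam: float) -> float:
    """Weighted sum of the distortion and of the rate."""
    return distortion + lam * rate


def select_mode(options: Sequence[Option], lam: float) -> str:
    """Return the name of the tested option that minimises the cost."""
    best = min(options, key=lambda o: rd_cost(o[1], o[2], lam))
    return best[0]
```

With the made-up options `("intra", 100.0, 10.0)` and `("inter", 80.0, 30.0)`, a small weight on the rate favours the lower-distortion option, whereas a large weight favours the cheaper one.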
The rate distortion optimization is usually formulated as minimizing a rate distortion function, which is a weighted sum of the rate and of the distortion. There are different approaches to solve the rate distortion optimization problem. For example, the approaches may be based on an extensive testing of all encoding options, including all considered modes or coding parameter values, with a complete evaluation of their coding cost and related distortion of a reconstructed signal after coding and decoding. Faster approaches may also be used, to save encoding complexity, in particular with computation of an approximated distortion based on a prediction or a prediction residual signal, not the reconstructed one. A mix of these two approaches can also be used, such as by using an approximated distortion for only some of the possible encoding options, and a complete distortion for other encoding options. Other approaches only evaluate a subset of the possible encoding options. More generally, many approaches employ any of a variety of techniques to perform the optimization, but the optimization is not necessarily a complete evaluation of both the coding cost and related distortion.

The implementations and aspects described herein can be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed can also be implemented in other forms (for example, an apparatus or program). An apparatus can be implemented in, for example, appropriate hardware, software, and firmware. The methods can be implemented, for example, in a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device.
Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants ("PDAs"), and other devices that facilitate communication of information between end-users.
Reference to “one embodiment” or “an embodiment” or “one implementation” or “an implementation”, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrase “in one embodiment” or “in an embodiment” or “in one implementation” or “in an implementation”, as well as any other variations, appearing in various places throughout this application are not necessarily all referring to the same embodiment.
Additionally, this application may refer to “determining” various pieces of information. Determining the information can include one or more of, for example, estimating the information, calculating the information, predicting the information, retrieving the information from memory, or obtaining the information, for example, from another device, from a module, or from a user.
Further, this application may refer to “accessing” various pieces of information. Accessing the information can include one or more of, for example, receiving the information, retrieving the information (for example, from memory), storing the information, moving the information, copying the information, calculating the information, determining the information, predicting the information, or estimating the information.
Additionally, this application may refer to “receiving” various pieces of information. Receiving is, as with “accessing”, intended to be a broad term. Receiving the information can include one or more of, for example, accessing the information, or retrieving the information (for example, from memory).
Further, “receiving” is typically involved, in one way or another, during operations such as, for example, storing the information, processing the information, transmitting the information, moving the information, copying the information, erasing the information, calculating the information, determining the information, predicting the information, or estimating the information.
It is to be appreciated that the use of any of the following “/”, “and/or”, “at least one of”, and “one or more of”, for example, in the cases of “A and/or B”, “at least one of A and B”, and “one or more of A and B”, is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of “A, B, and/or C”, “at least one of A, B, and C”, and “one or more of A, B and C”, such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended, as is clear to one of ordinary skill in this and related arts, for as many items as are listed.
Also, as used herein, the word “signal” refers to, among other things, indicating something to a corresponding decoder. For example, in certain embodiments the encoder signals a use of some coding tools. In this way, in an embodiment the same parameters can be used at both the encoder side and the decoder side. Thus, for example, an encoder can transmit (explicit signaling) a particular parameter to the decoder so that the decoder can use the same particular parameter. Conversely, if the decoder already has the particular parameter as well as others, then signaling can be used without transmitting (implicit signaling) to simply allow the decoder to know and select the particular parameter. By avoiding transmission of any actual functions, a bit savings is realized in various embodiments. It is to be appreciated that signaling can be accomplished in a variety of ways. For example, one or more syntax elements, flags, and so forth are used to signal information to a corresponding decoder in various embodiments.
While the preceding relates to the verb form of the word “signal”, the word “signal” can also be used herein as a noun. As will be evident to one of ordinary skill in the art, implementations can produce a variety of signals formatted to carry information that can be, for example, stored or transmitted. The information can include, for example, instructions for performing a method, or data produced by one of the described implementations. For example, a signal can be formatted to carry the encoded video stream and SEI messages of a described embodiment. Such a signal can be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal. The formatting can include, for example, encoding an encoded video stream and modulating a carrier with the encoded video stream. The information that the signal carries can be, for example, analog or digital information. The signal can be transmitted over a variety of different wired or wireless links, as is known. The signal can be stored on a processor-readable medium.
The following embodiments try to take advantage of the redundant (or residual) information existing in the encoding of the motion information in the form of an index on an MVP and an MVD. In particular, the proposed embodiments are based on a computation of a validity checking criterion to detect a non-valid combination of motion vector predictor (MVP) and motion vector difference (MVD), and, according to the computed criterion, to infer the motion vector predictor used to reconstruct a motion vector of a current block. At the encoder side, given a motion vector to encode for the current block, the same criterion is computed to detect a possible non-valid combination of MVP and MVD to code. In case such a non-valid combination exists, the MVP used is simply not signaled in the encoded video bit-stream (i.e., in the encoded video data).
The overall process ensures that the encoded MVD and the decoder-side MVP inference method lead to an identical reconstructed motion vector for the current block on the encoder and decoder sides.
Fig. 8 illustrates an example of a motion vector reconstruction process at the decoder side of an embodiment.
Suppose a motion vector difference $\overrightarrow{MV_d}$ as in Fig. 8A is signaled in video data. In this embodiment, a decoder-side validity checking process is applied to determine the validity of a combination of a motion vector predictor $\overrightarrow{MVP_0}$ and $\overrightarrow{MV_d}$. The validity checking process is based on the following validity check criterion:
$$\overrightarrow{MV_d} \cdot \left(\overrightarrow{MVP_1} - \overrightarrow{MVP_0}\right) > \frac{1}{2}\left\|\overrightarrow{MVP_1} - \overrightarrow{MVP_0}\right\|_2^2 \quad \text{(Eq. 1)}$$
As can be seen, this criterion is calculated from the motion vector difference $\overrightarrow{MV_d}$ and the motion information predictors of the AMVP list (i.e., $\overrightarrow{MVP_0}$ and $\overrightarrow{MVP_1}$). This validity check criterion is easy to compute and is strictly equivalent to the following second validity check criterion (squaring both sides of Eq. 2 and expanding the right-hand side gives Eq. 1):
$$\left\|\overrightarrow{MV_d}\right\|_2 > \left\|\overrightarrow{MV_d} + \overrightarrow{MVP_0} - \overrightarrow{MVP_1}\right\|_2 \quad \text{(Eq. 2)}$$
which can be generalized as:
$$cost\left(\overrightarrow{MV_d}\right) > cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_0} - \overrightarrow{MVP_1}\right) \quad \text{(Eq. 3)}$$
where cost() is defined, for example, as the L1 norm of a vector, as a rate cost associated with coding that motion vector, as a codeword length resulting from a binarization of the motion vector, etc.
Intuitively, the above validity check criterion is fulfilled when the MVD, when summed with $\overrightarrow{MVP_0}$, points to a spatial location in the region of Fig. 8 that consists of the spatial positions closer to the position indicated by $\overrightarrow{MVP_1}$ than to the position indicated by $\overrightarrow{MVP_0}$. Thus, it appears more efficient to use $\overrightarrow{MVP_1}$ than $\overrightarrow{MVP_0}$ as the MV predictor to code the desired motion vector in the present case. Since such a case can be detected by the decoder, various embodiments propose to infer, at the decoder side, the MVP used in combination with the received MVD, rather than parsing a flag indicating the MVP used. Therefore, that flag is no longer signaled, which leads to a bitrate saving over the existing video coding methods.
Fig. 9 illustrates an encoding algorithm according to a first embodiment. The encoding algorithm is for example executed by the processing module 500 of the system 11 when the system 11 implements the encoding module of Fig. 3. The encoding algorithm is for example executed during step 308.
In a step 901, the processing module 500 obtains a motion vector predictor for a motion vector $\overrightarrow{MV}$ of a current block (obtained in step 304). During step 901, the processing module 500 selects the motion vector predictor minimizing the motion vector difference among the motion vector predictor candidates $\overrightarrow{MVP_0}$ and $\overrightarrow{MVP_1}$ of the AMVP list. The selected motion vector predictor candidate, noted $\overrightarrow{MVP_{idx}}$, is represented by its index idx in the AMVP list.
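The validity check criterion and the predictor selection of step 901 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of this application: motion vectors are assumed to be integer (x, y) tuples, the names `criterion_fulfilled`, `criterion_eq1`, `select_predictor`, `cost_l1` and `cost_l2_sq` are hypothetical, and ties in step 901 are assumed to resolve to the lower index.

```python
from typing import Callable, Sequence, Tuple

Vec = Tuple[int, int]  # (x, y) components of a motion vector (assumed integer units)


def cost_l1(v: Vec) -> int:
    """One possible cost(): the L1 norm of the vector."""
    return abs(v[0]) + abs(v[1])


def cost_l2_sq(v: Vec) -> int:
    """Squared L2 norm; it orders vectors exactly like the L2 norm of Eq. 2."""
    return v[0] * v[0] + v[1] * v[1]


def criterion_fulfilled(mvd: Vec, tested: Vec, other: Vec,
                        cost: Callable[[Vec], int] = cost_l1) -> bool:
    """Eq. 3 for the pair (tested, mvd).

    True means the pair is an invalid combination: the vector reached with
    `tested` would have been cheaper to code from `other`.
    """
    shifted = (mvd[0] + tested[0] - other[0], mvd[1] + tested[1] - other[1])
    return cost(mvd) > cost(shifted)


def criterion_eq1(mvd: Vec, tested: Vec, other: Vec) -> bool:
    """Eq. 1 for the pair (tested, mvd), multiplied by 2 to stay in integers."""
    dx, dy = other[0] - tested[0], other[1] - tested[1]
    return 2 * (mvd[0] * dx + mvd[1] * dy) > dx * dx + dy * dy


def select_predictor(mv: Vec, amvp: Sequence[Vec],
                     cost: Callable[[Vec], int] = cost_l1) -> int:
    """Step 901: index of the candidate giving the cheapest difference.

    Assumption of this sketch: ties resolve to the lower index.
    """
    return min(range(len(amvp)),
               key=lambda i: cost((mv[0] - amvp[i][0], mv[1] - amvp[i][1])))
```

For instance, with hypothetical predictors (0, 0) and (8, 0), the difference (6, 0) cannot sensibly be paired with (0, 0), because the resulting vector (6, 0) is only (-2, 0) away from (8, 0); `criterion_fulfilled((6, 0), (0, 0), (8, 0))` therefore reports an invalid combination.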
In a step 902, the processing module 500 computes the motion vector difference $\overrightarrow{MV_d}$:
$$\overrightarrow{MV_d} = \overrightarrow{MV} - \overrightarrow{MVP_{idx}}$$
In a step 903, the processing module 500 encodes the motion vector difference $\overrightarrow{MV_d}$ in the encoded video stream (i.e., in the video data) 311.
In steps 904 and 905, the processing module 500 applies a validity checking process. The validity checking process comprises determining from the validity checking criterion of equations Eq. 1, Eq. 2 or Eq. 3 that the non-selected motion vector predictor of the AMVP list $\overrightarrow{MVP_{1-idx}}$ and the motion information difference form a valid combination. For instance, applying equation Eq. 3:
$$cost\left(\overrightarrow{MV_d}\right) > cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_{1-idx}} - \overrightarrow{MVP_{idx}}\right)$$
In step 904, the processing module 500 computes the validity checking criterion. In step 905, the processing module 500 determines if the combination of $\overrightarrow{MVP_{1-idx}}$ and the motion information difference $\overrightarrow{MV_d}$ is valid, i.e., if the above inequality is not fulfilled.
If the combination of $\overrightarrow{MVP_{1-idx}}$ and the motion information difference $\overrightarrow{MV_d}$ is valid, step 905 is followed by a step 906. In step 906, the processing module 500 encodes the motion vector predictor index idx in the video data 311. The encoding algorithm ends in a step 907.
If the combination of $\overrightarrow{MVP_{1-idx}}$ and the motion information difference $\overrightarrow{MV_d}$ is not valid, step 905 is followed directly by step 907. In other words, the encoding of the index idx is omitted.
As can be seen, the processing module 500 applies the validity checking process in steps 904 and 905 comprising determining from the validity checking criterion that the motion vector predictor $\overrightarrow{MVP_{1-idx}}$ and the motion vector difference $\overrightarrow{MV_d}$ form a valid combination.
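The encoder-side steps 901 to 907 can be summarized by the following Python sketch. It is illustrative only: it assumes a two-candidate AMVP list of integer (x, y) tuples, uses the L1 norm as cost(), and returns the index as `None` when its encoding is omitted. The function name `encode_motion_vector` and the return convention are hypothetical; entropy coding of the values is not modeled.

```python
from typing import List, Optional, Tuple

Vec = Tuple[int, int]  # (x, y) components of a motion vector


def cost(v: Vec) -> int:
    """Assumed cost(): L1 norm of the vector."""
    return abs(v[0]) + abs(v[1])


def encode_motion_vector(mv: Vec, amvp: List[Vec]) -> Tuple[Vec, Optional[int]]:
    """Return (mvd, idx_to_write); idx_to_write is None when idx is not coded."""
    # Step 901: select the candidate giving the cheapest difference
    # (assumption of this sketch: a tie keeps index 0).
    d0 = (mv[0] - amvp[0][0], mv[1] - amvp[0][1])
    d1 = (mv[0] - amvp[1][0], mv[1] - amvp[1][1])
    idx = 0 if cost(d0) <= cost(d1) else 1
    selected, other = amvp[idx], amvp[1 - idx]

    # Step 902: motion vector difference against the selected predictor.
    mvd = d0 if idx == 0 else d1

    # Step 903 would write mvd to the video data here.

    # Steps 904-905: Eq. 3 evaluated for the non-selected predictor.
    shifted = (mvd[0] + other[0] - selected[0], mvd[1] + other[1] - selected[1])
    other_is_valid = not (cost(mvd) > cost(shifted))

    # Step 906 only if the non-selected predictor also forms a valid pair;
    # otherwise the decoder can infer idx and nothing is written.
    return mvd, (idx if other_is_valid else None)
```

With the hypothetical list [(0, 0), (8, 0)], the vector (14, 0) is coded as the difference (6, 0) with no index, whereas (7, 1) is coded as (-1, 1) together with index 1, since both predictors remain plausible for such a small difference.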
The index idx of the motion vector predictor $\overrightarrow{MVP_{idx}}$ is encoded in the video data 311 only responsive to the motion vector predictor $\overrightarrow{MVP_{1-idx}}$ and the motion vector difference $\overrightarrow{MV_d}$ forming a valid combination.
Note that, for a bi-predicted block in AMVP mode, the process of Fig. 9 is applied with regard to both reference pictures (i.e., is applied successively to each reference picture) used for the temporal prediction of the current block.
Fig. 10 illustrates a decoding algorithm according to the first embodiment. The decoding algorithm is for example executed by the processing module 500 of the system 13 when the system 13 implements the decoding module of Fig. 4. The decoding algorithm is for example executed during step 408.
In a step 1000, the processing module 500 decodes the motion vector difference $\overrightarrow{MV_d}$ from the video data 311.
In a step 1001, the processing module 500 applies the validity checking process on each MVP of the AMVP list (i.e., $\overrightarrow{MVP_0}$ and $\overrightarrow{MVP_1}$) of a current block. To do so, the processing module 500 calculates the validity checking criterion on the MVP $\overrightarrow{MVP_0}$ using for instance equation Eq. 3:
$$cost\left(\overrightarrow{MV_d}\right) > cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_0} - \overrightarrow{MVP_1}\right)$$
and then on the MVP $\overrightarrow{MVP_1}$:
$$cost\left(\overrightarrow{MV_d}\right) > cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_1} - \overrightarrow{MVP_0}\right)$$
In a step 1002, the processing module 500 determines whether one of the MVPs ($\overrightarrow{MVP_0}$ and $\overrightarrow{MVP_1}$) of the AMVP list and the motion vector difference $\overrightarrow{MV_d}$ form an invalid combination.
If the validity checking criterion is fulfilled for one combination, i.e., if
$$cost\left(\overrightarrow{MV_d}\right) > cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_0} - \overrightarrow{MVP_1}\right)$$
or
$$cost\left(\overrightarrow{MV_d}\right) > cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_1} - \overrightarrow{MVP_0}\right)$$
then step 1002 is followed by a step 1004. In step 1004, the processing module 500 infers the value of the index idx of the motion vector predictor $\overrightarrow{MVP_{idx}}$ of the AMVP list to apply to predict the motion vector of the current block as the index of the single valid MVP. If $cost\left(\overrightarrow{MV_d}\right) > cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_0} - \overrightarrow{MVP_1}\right)$, the index idx is inferred to be “1”, and if $cost\left(\overrightarrow{MV_d}\right) > cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_1} - \overrightarrow{MVP_0}\right)$, the index idx is inferred to be “0”.
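The decoder-side check of steps 1001, 1002 and 1004 can be sketched as below. This is a simplified illustration under stated assumptions: motion vectors are integer (x, y) tuples, cost() is the L1 norm, and the hypothetical function `infer_index` returns `None` when both combinations are valid, meaning that the index has to be read from the video data instead.

```python
from typing import Optional, Tuple

Vec = Tuple[int, int]  # (x, y) components of a motion vector


def cost(v: Vec) -> int:
    """Assumed cost(): L1 norm of the vector."""
    return abs(v[0]) + abs(v[1])


def infer_index(mvd: Vec, mvp0: Vec, mvp1: Vec) -> Optional[int]:
    """Return the inferred idx, or None when idx must be parsed."""
    # Step 1001: Eq. 3 evaluated on MVP0, then on MVP1.
    invalid0 = cost(mvd) > cost((mvd[0] + mvp0[0] - mvp1[0],
                                 mvd[1] + mvp0[1] - mvp1[1]))
    invalid1 = cost(mvd) > cost((mvd[0] + mvp1[0] - mvp0[0],
                                 mvd[1] + mvp1[1] - mvp0[1]))

    # Steps 1002 and 1004: keep the single valid predictor.
    # When cost() is a norm, both tests cannot be fulfilled at once
    # (it would contradict the triangle inequality).
    if invalid0:
        return 1
    if invalid1:
        return 0
    return None  # both combinations valid: the index is explicitly coded
```

With the hypothetical predictors (0, 0) and (8, 0), a received difference of (6, 0) yields index 1, a difference of (-5, 0) yields index 0, and a small difference such as (-1, 1) leaves both predictors valid.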
Step 1004 is followed by a step 1005. During step 1005, the processing module 500 reconstructs the motion vector of the current block $\overrightarrow{MV}$ as the sum of the MVP of the AMVP list identified by the index idx and the motion vector difference $\overrightarrow{MV_d}$:
$$\overrightarrow{MV} = \overrightarrow{MVP_{idx}} + \overrightarrow{MV_d}$$
If each MVP ($\overrightarrow{MVP_0}$ and $\overrightarrow{MVP_1}$) of the AMVP list and the motion vector difference $\overrightarrow{MV_d}$ form a valid combination (i.e., if $cost\left(\overrightarrow{MV_d}\right) \le cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_0} - \overrightarrow{MVP_1}\right)$ and $cost\left(\overrightarrow{MV_d}\right) \le cost\left(\overrightarrow{MV_d} + \overrightarrow{MVP_1} - \overrightarrow{MVP_0}\right)$), step 1002 is followed by a step 1003. During step 1003, the processing module 500 decodes the index idx from the video data 311. Step 1003 is followed by step 1005 already explained.
Note that, for a bi-predicted block in AMVP mode, the process of Fig. 10 is applied with regard to both reference pictures (i.e., is applied successively to each reference picture) used for the temporal prediction of the current block.
As can be seen from the method of Fig. 10, the parsing of the motion information of the current block requires the complete reconstruction of the motion information of the blocks providing the MVP of the AMVP list of the current block. Such dependencies may reduce the possibilities of parallelizing the processing of successive blocks in a picture. In the following, two variants of the first embodiment of Figs. 9 and 10 allow making the parsing of the motion information of a current block less dependent on the motion vector reconstruction of successive blocks.
In a first variant, to reduce the dependency of the parsing of the motion information of the current block on the availability of motion vectors of past blocks, only MVP candidates issued from the decoding of blocks sufficiently far from the current block are considered to code and decode the motion information of the current block in AMVP mode. For example, pictures are divided into decoding entities. The decoding entity could be, for instance, a CTU or a Virtual Processing Decoding Unit (VPDU), defined for instance as a 64x64 luma sample area. Such a VPDU is used as the decoding process unit, that is, the elementary picture area that can be decoded in on-chip memory at a time. Only blocks outside the decoding entity comprising the current block are considered to compute the list of MVP candidates for the current block.
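The candidate restriction of the first variant can be illustrated with the following Python sketch. It assumes that each candidate is described by the top-left luma position of the block it comes from together with its motion vector, and that decoding entities form a regular 64x64 grid; the names `decoding_entity` and `filter_candidates` and this data layout are hypothetical.

```python
from typing import List, Tuple

Pos = Tuple[int, int]  # top-left luma sample position (x, y) of a block
Vec = Tuple[int, int]  # (x, y) components of a motion vector

VPDU_SIZE = 64  # 64x64 luma samples, as in the example of the first variant


def decoding_entity(pos: Pos, size: int = VPDU_SIZE) -> Tuple[int, int]:
    """Grid coordinates of the decoding entity containing the position."""
    return (pos[0] // size, pos[1] // size)


def filter_candidates(current: Pos,
                      candidates: List[Tuple[Pos, Vec]],
                      size: int = VPDU_SIZE) -> List[Vec]:
    """Keep only the predictors coming from another decoding entity."""
    current_entity = decoding_entity(current, size)
    return [mv for (pos, mv) in candidates
            if decoding_entity(pos, size) != current_entity]
```

For a current block at (70, 10), a candidate from a block at (60, 10) is kept because it lies in the neighboring 64x64 area, while a candidate from (66, 8) is discarded because it belongs to the same area as the current block. A CTU-based entity would use the CTU size instead of 64.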
In a second variant, the MVP flag information representing the MVP index may be hidden in some other coded data of the video data 311, in a way that the parsing process is the same whether the MVP index information is transmitted to the decoder or not. This information, when present, is thus transmitted, for example, in the parity bit of the coded syntax element representing the motion vector difference $\overrightarrow{MV_d}$. The advantage of this approach is that the decoder-side inference of the MVP index idx is made possible without making the bit-stream parsing dependent on the motion vector decoding process.
In a third variant, the method of the first embodiment is applied in case of use of the MMVD (merge with motion vector difference) mode. As a reminder, the MMVD mode can be viewed as a kind of merge mode in which a merge candidate is refined by a motion vector difference. The motion vector difference is signaled and is additively applied to the selected MVP. In the third variant, the same principle as in the first embodiment is applied to the first two merge candidates of the list of merge candidates when MMVD is used. According to that variant, only the MMVD motion vector difference is signaled in the video data 311 in case one of the first two merge candidates is non-valid according to the validity checking criterion of equations Eq. 1, Eq. 2 or Eq. 3. Otherwise, if both merge candidates that can use the MMVD motion vector difference are valid, then some signaling is done to indicate the actual MVP used to derive the motion information of the current block.
According to a further variant, the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated sequence parameter set (SPS) signaling flag.
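The parity-based hiding of the second variant can be pictured, on the decoder side only, with the sketch below. The text does not specify which parity is used nor how the encoder arranges it, so the definition in `mvd_parity` (parity of the sum of the absolute components) is purely an assumption of this example, as are the function names and the use of the L1 norm as cost().

```python
from typing import Tuple

Vec = Tuple[int, int]  # (x, y) components of a motion vector


def cost(v: Vec) -> int:
    """Assumed cost(): L1 norm of the vector."""
    return abs(v[0]) + abs(v[1])


def mvd_parity(mvd: Vec) -> int:
    """Hypothetical parity definition: parity of |x| + |y|."""
    return (abs(mvd[0]) + abs(mvd[1])) & 1


def resolve_index(mvd: Vec, mvp0: Vec, mvp1: Vec) -> int:
    """Derive idx after parsing; no dedicated index syntax element is read."""
    if cost(mvd) > cost((mvd[0] + mvp0[0] - mvp1[0], mvd[1] + mvp0[1] - mvp1[1])):
        return 1  # (MVP0, MVD) is invalid, so MVP1 is inferred
    if cost(mvd) > cost((mvd[0] + mvp1[0] - mvp0[0], mvd[1] + mvp1[1] - mvp0[1])):
        return 0  # (MVP1, MVD) is invalid, so MVP0 is inferred
    # Both combinations valid: the index is carried by the parity of the MVD.
    return mvd_parity(mvd)
```

The point of the design is that the set of parsed syntax elements is identical in both cases, so the predictor list is only needed once the motion vector is reconstructed, not while the bit-stream is parsed.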
According to a further variant, the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated picture parameter set (PPS) signaling flag.
According to a further variant, the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated picture header syntax element.
According to a further variant, the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated slice header syntax element.
According to a further variant, the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated sub-picture level syntax element.
According to a further variant, the proposed embodiment and variants allowing modifying motion information coding are normatively activated/deactivated by means of a dedicated CTU level syntax element.
We described above a number of embodiments. Features of these embodiments can be provided alone or in any combination. Further, embodiments can include one or more of the following features, devices, or aspects, alone or in any combination, across various claim categories and types:
• A bitstream or signal that includes one or more of the described syntax elements, or variations thereof.
• Creating and/or transmitting and/or receiving and/or decoding a bitstream or signal that includes one or more of the described syntax elements, or variations thereof.
• A TV, set-top box, cell phone, tablet, or other electronic device that performs at least one of the embodiments described.
• A TV, set-top box, cell phone, tablet, or other electronic device that performs at least one of the embodiments described, and that displays (e.g. using a monitor, screen, or other type of display) a resulting picture.
• A TV, set-top box, cell phone, tablet, or other electronic device that tunes (e.g. using a tuner) a channel to receive a signal including an encoded video stream, and performs at least one of the embodiments described.
• A TV, set-top box, cell phone, tablet, or other electronic device that receives (e.g. using an antenna) a signal over the air that includes an encoded video stream, and performs at least one of the embodiments described.
• A server, camera, cell phone, tablet or other electronic device that transmits (e.g. using an antenna) a signal over the air that includes an encoded video stream, and performs at least one of the embodiments described.
• A server, camera, cell phone, tablet or other electronic device that tunes (e.g. using a tuner) a channel to transmit a signal including an encoded video stream, and performs at least one of the embodiments described.

Claims

1. A method comprising: decoding (1000) a motion information difference for a current block from video data; applying (1001, 1002) a validity checking process comprising determining from a criterion calculated from the motion information difference and motion information predictors of a list of motion information predictors of the current block that a motion information predictor of the list and the motion information difference form an invalid combination; inferring (1004) a value representing the motion information predictor of the list to apply to predict a motion information of the current block responsive to the motion information difference and one motion information predictor of the list form an invalid combination; decoding (1003) the value representing the motion information predictor of the list to apply to predict the motion information of the current block from the video data responsive to the combination of each motion information predictor of the list and the motion information is valid; and, reconstructing (1005) the motion information of the current block as a sum of the motion information difference and the motion information predictor of the list identified by the value.
2. A method comprising: obtaining (901) a first motion information predictor of a list of motion information predictors of a current block of a picture for predicting motion information of a current block; calculating (902) a motion information difference between the motion information of the current block and the first motion information predictor; encoding (903) the motion information difference in video data; and, applying (904, 905) a validity checking process comprising determining from a criterion calculated from the motion information difference and the motion information predictors of the list that a second motion information predictor of the list and the motion information difference form a valid combination; and, encoding (906) a value representing the first motion information predictor only responsive to the second motion information predictor and the motion information difference form a valid combination.
3. The method of claim 1 or 2 wherein the list is a list of AMVP motion information predictors or a list of motion information predictors according to the MMVD mode.
4. The method of claim 3 wherein, responsive to the list is a list of AMVP motion information predictors, the method is applied successively to each reference picture used for a temporal prediction of the current block.
5. The method of claim 3 wherein, responsive to the list is a list of motion information predictors according to the MMVD mode, the method is applied to the two first motion information predictors of the list.
6. The method of any previous claims wherein the motion information predictors of the list are derived from a coding tree unit or a virtual processing decoding unit different respectively from the coding tree unit or a virtual processing decoding unit comprising the current block.
7. The method of any previous claim wherein, responsive to the combination of each motion information predictor of the list and the motion information difference is valid, the value representing the motion information predictor of the list to apply to predict the motion information of the current block is a parity bit of a syntax element representing the motion information difference.
8. A device comprising electronic circuitry configured for applying a process comprising: decoding (1000) a motion information difference for a current block from video data; applying (1001, 1002) a validity checking process comprising determining from a criterion calculated from the motion information difference and motion information predictors of a list of motion information predictors of the current block that a motion information predictor of the list and the motion information difference form an invalid combination; inferring (1004) a value representing the motion information predictor of the list to apply to predict a motion information of the current block responsive to the motion information difference and one motion information predictor of the list form an invalid combination; decoding (1003) the value representing the motion information predictor of the list to apply to predict the motion information of the current block from the video data responsive to the combination of each motion information predictor of the list and the motion information is valid; and, reconstructing (1005) the motion information of the current block as a sum of the motion information difference and the motion information predictor of the list identified by the value.
9. A device comprising electronic circuitry configured for applying a process comprising: obtaining (901) a first motion information predictor of a list of motion information predictors of a current block of a picture for predicting motion information of a current block; calculating (902) a motion information difference between the motion information of the current block and the first motion information predictor; encoding (903) the motion information difference in video data; applying (904, 905) a validity checking process comprising determining from a criterion calculated from the motion information difference and the motion information predictors of the list that a second motion information predictor of the list and the motion information difference form a valid combination; and, encoding (906) a value representing the first motion information predictor only responsive to the second motion information predictor and the motion information difference form a valid combination.
10. The device of claim 8 or 9 wherein the list is a list of AMVP motion information predictors or a list of motion information predictors according to the MMVD mode.
11. The device of claim 10 wherein, responsive to the list is a list of AMVP motion information predictors, the electronic circuitry is configured to apply the process successively to each reference picture used for a temporal prediction of the current block.
12. The device of claim 10 wherein, responsive to the list is a list of motion information predictors according to the MMVD mode, the electronic circuitry is configured to apply the process to the two first motion information predictors of the list.
13. The device of any previous claim from claim 8 to 12 wherein the motion information predictors of the list are derived from a coding tree unit or a virtual processing decoding unit different respectively from a coding tree unit or a virtual processing decoding unit comprising the current block.
14. The device of any previous claim from claim 8 to 12 wherein, responsive to the combination of each motion information predictor of the list and the motion information difference is valid, the value representing the motion information predictor of the list to apply to predict the motion information of the current block is a parity bit of a syntax element representing the motion information difference.
15. A computer program comprising program code instructions for implementing the method according to any previous claims from claim 1 to 7.
16. Non-transitory information storage medium storing program code instructions for implementing the method according to any previous claims from claim 1 to 7.
PCT/EP2024/061521 2023-05-15 2024-04-26 Inferring motion vector predictor based on motion vector difference WO2024235610A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP23305766.0 2023-05-15
EP23305766 2023-05-15

Publications (1)

Publication Number Publication Date
WO2024235610A1 true WO2024235610A1 (en) 2024-11-21

Family

ID=86605006

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2024/061521 WO2024235610A1 (en) 2023-05-15 2024-04-26 Inferring motion vector predictor based on motion vector difference

Country Status (1)

Country Link
WO (1) WO2024235610A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220417521A1 (en) * 2021-06-25 2022-12-29 Qualcomm Incorporated Hybrid inter bi-prediction in video coding

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIANLE CHENYAN YESEUNG HWAN KIM: "Algorithm description for Versatile Video Coding and Test Model 11 (VTM 11), Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29", 20TH MEETING, BY TELECONFERENCE, 7 October 2020 (2020-10-07)
YANG W ET AL: "Efficient Motion Vector Coding Algorithms Based on Adaptive Template Matching Techniques", 39. VCEG MEETING; 16-1-2010 - 22-1-2010; KYOTO; (VIDEO CODING EXPERTSGROUP OF ITU-T SG.16) URL: HTTP://WFTP3.ITU.INT/AV-ARCH/VIDEO-SITE/,, no. VCEG-AM16; VCEG-AM16, 16 January 2010 (2010-01-16), XP030003736 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24721677

Country of ref document: EP

Kind code of ref document: A1