WO2019234001A1 - Video coding and decoding - Google Patents
Video coding and decoding
- Publication number
- WO2019234001A1 (PCT/EP2019/064458)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- sao
- bitstream
- filtering
- group
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/174—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
- H04N19/196—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
- H04N19/198—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters including smoothing of a sequence of encoding parameters, e.g. by averaging, by choice of the maximum, minimum or median value
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
- H04N19/82—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/86—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
Definitions
- VVC Versatile Video Coding
- the goal of VVC is to provide significant improvements in compression performance over the existing HEVC standard (i.e., typically twice as much as before) and to be completed in 2020.
- the main target applications and services include, but are not limited to, 360-degree and high-dynamic-range (HDR) videos.
- HDR high-dynamic-range
- JVET evaluated responses from 32 organizations using formal subjective tests conducted by independent test labs.
- Some proposals demonstrated compression efficiency gains of typically 40% or more when compared to using HEVC. Particular effectiveness was shown on ultra-high definition (UHD) video test material. Thus, we may expect compression efficiency gains well beyond the targeted 50% for the final standard.
- UHD ultra-high definition
- JEM JVET exploration model
- SAO sample adaptive offset
- US 9769450 discloses an SAO filter for three dimensional or 3D Video Coding or 3DVC such as implemented by the HEVC standard.
- the filter directly re-uses SAO filter parameters of an independent view or a coded dependent view to encode another dependent view, or re-uses only part of the SAO filter parameters of the independent view or a coded dependent view to encode another dependent view.
- the SAO parameters are re-used by copying them from the independent view or coded dependent view.
- US 2014/0192860 A1 relates to the scalable extension of HEVC.
- HEVC scalable extension aims at allowing coding/decoding of a video made of multiple scalability layers, each layer being made up of a series of frames. Coding efficiency is improved by inferring, or deriving, SAO parameters to be used at an upper layer (e.g. an enhancement layer) from the SAO parameters actually used at a lower (e.g. base) layer. This is because inferring some SAO parameters makes it possible to avoid transmitting them.
- SAO parameters for Coding Tree Units are grouped in order to reduce the encoder delay. It is proposed to provide SAO parameters for a region (for example a row) that enable the determination of SAO parameters for the filtering regions to be performed in parallel.
- the HEVC standard proposes to signal whether the SAO parameter set is derived or not from the left or above Coding Tree Unit (sao_merge_left_flag, sao_merge_up_flag).
- In a first aspect, a method and corresponding device for signalling Sample Adaptive Offset (SAO) filtering are proposed.
- the first aspect also concerns a method and a corresponding device for performing Sample Adaptive Offset (SAO) filtering.
- the first aspect also describes corresponding encoding and decoding methods and associated devices.
- a method and corresponding device for encoding an image comprising a plurality of image parts, one or more image parts being predicted from one or more other image parts according to a first or a second prediction mode.
- a method of signalling, in a bitstream, Sample Adaptive Offset (SAO) filtering parameters for use in performing SAO filtering on an image comprising a plurality of image parts, the image parts being groupable into groups of image parts using two or more different available groupings, comprising: determining which of said available groupings applies to an image part to be filtered; if the determined grouping is a predetermined one of the different available groupings, including in the bitstream an inferring-permitted syntax element, which indicates that it is permitted to infer the SAO parameters for performing SAO filtering on the image part to be filtered from the SAO parameters used for filtering another image part; and if the determined grouping is another one of the different available groupings, not including the syntax element in the bitstream.
- SAO Sample Adaptive Offset
- Figure 1 is a diagram for use in explaining a coding structure used in HEVC
- Figure 2 is a block diagram schematically illustrating a data communication system in which one or more embodiments of the invention may be implemented;
- Figure 3 is a block diagram illustrating components of a processing device in which one or more embodiments of the invention may be implemented;
- Figure 4 is a flow chart illustrating steps of an encoding method according to embodiments of the invention;
- Figure 5 is a flow chart illustrating steps of a loop filtering process in accordance with one or more embodiments of the invention;
- Figure 6 is a flow chart illustrating steps of a decoding method according to embodiments of the invention.
- Figures 7A and 7B are diagrams for use in explaining edge-type SAO filtering in HEVC;
- Figure 8 is a diagram for use in explaining band-type SAO filtering in HEVC.
- Figure 9 is a flow chart illustrating the steps of a process to decode SAO parameters according to the HEVC specifications
- Figure 10 is a flow chart illustrating in more detail one of the steps of the Figure 9 process
- Figure 11 is a flow chart illustrating how SAO filtering is performed on an image part according to the HEVC specifications
- Figure 12 is a flow chart illustrating steps carried out by an encoder to determine SAO parameters for the CTUs of a group (frame or slice) in a CTU-level derivation of SAO parameters;
- Figure 13 shows one of the steps of Figure 12 in more detail
- Figure 14 shows another one of the steps of Figure 12 in more detail
- Figure 15 shows yet another one of the steps of Figure 12 in more detail
- Figure 16 shows various different groupings 1201-1206 of CTUs in a slice
- Figure 17 is a diagram showing image parts of a frame in a derivation of SAO parameters in which a first method of sharing SAO parameters is used;
- Figure 18 is a flowchart of an example of a process for setting SAO parameters in the derivation of Figure 17;
- Figure 19 is a flowchart of an example of a process for setting of SAO parameters in derivation using the first sharing method to share SAO parameters among a column of CTUs;
- Figure 20 is a flowchart of an example of a process for setting of SAO parameters in derivation using the first sharing method to share SAO parameters among a group of NxN CTUs;
- Figure 21 is a diagram showing image parts of one NxN group in the derivation of Figure 20;
- Figure 22 illustrates an example of how to select the SAO parameter derivation according to the sixth embodiment of the invention.
- Figure 23 is a flow chart illustrating a decoding process suitable for a second method of sharing SAO parameters among image parts of a group
- Figure 24 is a diagram showing image parts of multiple 2x2 groups
- Figure 25 is a flow chart illustrating an encoding process according to the first embodiment of the invention
- Figure 26 is a flow chart illustrating a decoding process according to the third embodiment of the invention
- Figure 27 is a flow chart illustrating a decoding process according to the fourth embodiment of the invention.
- Figure 28 is a diagram showing a system comprising an encoder or a decoder and a communication network according to embodiments of the present invention,
- Figure 1 relates to a coding structure used in the High Efficiency Video Coding (HEVC) video standard.
- a video sequence 1 is made up of a succession of digital images i. Each such digital image is represented by one or more matrices. The matrix coefficients represent pixels.
- HEVC High Efficiency Video Coding
- An image 2 of the sequence may be divided into slices 3.
- a slice may in some instances constitute an entire image.
- These slices are divided into non-overlapping Coding Tree Units (CTUs) 4.
- a Coding Tree Unit (CTU) is the basic processing unit of the High Efficiency Video Coding (HEVC) video standard and conceptually corresponds in structure to macroblock units that were used in several previous video standards.
- a CTU is also sometimes referred to as a Largest Coding Unit (LCU).
- a CTU is generally of size 64 pixels x 64 pixels.
- Each CTU may in turn be iteratively divided into smaller variable-size Coding Units (CUs) 5 using a quadtree decomposition.
- CUs variable-size Coding Units
- Coding units are the elementary coding elements and are constituted by two kinds of sub-unit called a Prediction Unit (PU) and a Transform Unit (TU).
- the maximum size of a PU or TU is equal to the CU size.
- a Prediction Unit corresponds to the partition of the CU for prediction of pixels values.
- Various different partitions of a CU into PUs are possible as shown by 6 including a partition into 4 square PUs and two different partitions into 2 rectangular PUs.
- a Transform Unit is an elementary unit that is subjected to spatial transformation using DCT.
- a CU can be partitioned into TUs based on a quadtree representation 7.
- NAL Network Abstraction Layer
- coding parameters of the video sequence are stored in dedicated NAL units called parameter sets.
- SPS Sequence Parameter Set
- PPS Picture Parameter Set
- HEVC also includes a Video Parameter Set (VPS) NAL unit which contains parameters describing the overall structure of the bitstream.
- the VPS is a new type of parameter set defined in HEVC, and applies to all of the layers of a bitstream.
- a layer may contain multiple temporal sub-layers, and all version 1 bitstreams are restricted to a single layer.
- HEVC has certain layered extensions for scalability and multiview and these will enable multiple layers, with a backwards compatible version 1 base layer.
- FIG. 2 illustrates a data communication system in which one or more embodiments of the invention may be implemented.
- the data communication system comprises a transmission device, in this case a server 201, which is operable to transmit data packets of a data stream to a receiving device, in this case a client terminal 202, via a data communication network 200.
- the data communication network 200 may be a Wide Area Network (WAN) or a Local Area Network (LAN).
- WAN Wide Area Network
- LAN Local Area Network
- Such a network may be for example a wireless network (WiFi / 802.11a or b or g), an Ethernet network, an Internet network or a mixed network composed of several different networks.
- the communication system may be a digital television broadcast system in which the server 201 sends the same data content to multiple clients.
- the data stream 204 provided by the server 201 may be composed of multimedia data representing video and audio data. Audio and video data streams may, in some embodiments of the invention, be captured by the server 201 using a microphone and a camera
- data streams may be stored on the server 201 or received by the server 201 from another data provider, or generated at the server 201.
- the server 201 is provided with an encoder for encoding video and audio streams in particular to provide a compressed bitstream for transmission that is a more compact representation of the data presented as input to the encoder.
- the compression of the video data may be for example in accordance with the HEVC format or H.264/AVC format.
- the client 202 receives the transmitted bitstream and decodes the reconstructed bitstream to reproduce video images on a display device and the audio data on a loudspeaker.
- Although a streaming scenario is considered in the example of Figure 2, it will be appreciated that in some embodiments of the invention the data communication between an encoder and a decoder may be performed using for example a media storage device such as an optical disc.
- a video image is transmitted with data representative of compensation offsets for application to reconstructed pixels of the image to provide filtered pixels in a final image.
- FIG. 3 schematically illustrates a processing device 300 configured to implement at least one embodiment of the present invention.
- the processing device 300 may be a device such as a micro-computer, a workstation or a light portable device.
- the device 300 comprises a communication bus 313 connected to:
- a central processing unit 311, such as a microprocessor, denoted CPU;
- a read only memory 307, denoted ROM;
- a random access memory 312, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method of encoding a sequence of digital images and/or the method of decoding a bitstream according to embodiments of the invention;
- the apparatus 300 may also include the following components:
- a data storage means 304 such as a hard disk, for storing computer programs for implementing methods of one or more embodiments of the invention and data used or produced during the implementation of one or more embodiments of the invention;
- a disk drive for a disk 306, the disk drive being adapted to read data from the disk 306 or to write data onto said disk;
- a screen 309 for displaying data and/or serving as a graphical interface with the user, by means of a keyboard 310 or any other pointing means.
- the apparatus 300 can be connected to various peripherals, such as for example a digital camera 320 or a microphone 308, each being connected to an input/output card (not shown) so as to supply multimedia data to the apparatus 300.
- the communication bus provides communication and interoperability between the various elements included in the apparatus 300 or connected to it.
- the representation of the bus is not limiting and in particular the central processing unit is operable to communicate instructions to any element of the apparatus 300 directly or by means of another element of the apparatus 300.
- the disk 306 can be replaced by any information medium such as for example a compact disk (CD-ROM), rewritable or not, a ZIP disk or a memory card and, in general terms, by an information storage means that can be read by a microcomputer or by a microprocessor, integrated or not into the apparatus, possibly removable and adapted to store one or more programs whose execution enables the method of encoding a sequence of digital images and/or the method of decoding a bitstream according to the invention to be implemented.
- CD-ROM compact disk
- ZIP disk or a memory card
- the executable code may be stored either in read only memory 307, on the hard disk 304 or on a removable digital medium such as for example a disk 306 as described previously.
- the executable code of the programs can be received by means of the communication network 303, via the interface 302, in order to be stored in one of the storage means of the apparatus 300 before being executed, such as the hard disk 304.
- the central processing unit 311 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to the invention, instructions that are stored in one of the aforementioned storage means.
- the program or programs that are stored in a non-volatile memory for example on the hard disk 304 or in the read only memory 307, are transferred into the random access memory 312, which then contains the executable code of the program or programs, as well as registers for storing the variables and parameters necessary for implementing the invention.
- the apparatus is a programmable apparatus which uses software to implement the invention.
- the present invention may be implemented in hardware, for example in the form of an Application Specific Integrated Circuit or ASIC.
- ASIC Application Specific Integrated Circuit
- Figure 4 illustrates a block diagram of an encoder according to at least one embodiment of the invention.
- the encoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 311 of device 300, at least one corresponding step of a method implementing at least one embodiment of encoding an image of a sequence of images according to one or more embodiments of the invention.
- An original sequence of digital images i0 to in 401 is received as an input by the encoder 40.
- Each digital image is represented by a set of samples, known as pixels.
- a bitstream 410 is output by the encoder 40 after implementation of the encoding process.
- the bitstream 410 comprises a plurality of encoding units or slices, each slice comprising a slice header for transmitting encoding values of encoding parameters used to encode the slice and a slice body, comprising encoded video data.
- the input digital images i0 to in 401 are divided into blocks of pixels by module 402.
- the blocks correspond to image portions and may be of variable sizes (e.g. 4x4, 8x8, 16x16, 32x32, 64x64 pixels).
- a coding mode is selected for each input block. Two families of coding modes are provided: coding modes based on spatial prediction coding (Intra prediction), and coding modes based on temporal prediction (Inter coding, Merge, SKIP).
- Module 403 implements an Intra prediction process, in which the given block to be encoded is predicted by a predictor computed from pixels of the neighbourhood of said block to be encoded. An indication of the selected Intra predictor and the difference between the given block and its predictor is encoded to provide a residual if the Intra coding is selected.
- Temporal prediction is implemented by motion estimation module 404 and motion compensation module 405. Firstly a reference image from among a set of reference images/pictures 416 is selected, and a portion of the reference image, also called reference area or image portion, which is the closest area to the given block to be encoded, is selected by the motion estimation module 404. Motion compensation module 405 then predicts the block to be encoded using the selected area. The difference between the selected reference area and the given block, also called a residual block, is computed by the motion compensation module 405.
- the selected reference area is indicated by a motion vector.
- a prediction direction is encoded.
- at least one motion vector is encoded.
- Motion vector predictors of a set of motion information predictors are obtained from the motion vectors field 418 by a motion vector prediction and coding module 417.
- the encoder 40 further comprises a selection module 406 for selection of the coding mode by applying an encoding cost criterion, such as a rate-distortion criterion.
- an encoding cost criterion such as a rate-distortion criterion.
- a transform, such as a DCT, is then applied to the residual block by a transform module.
- the transformed data obtained is then quantized by quantization module 408 and entropy encoded by entropy encoding module 409.
- the encoded residual block of the current block being encoded is inserted into the bitstream 410.
- the encoder 40 also performs decoding of the encoded image in order to produce a reference image for the motion estimation of the subsequent images. This enables the encoder and the decoder receiving the bitstream to have the same reference frames.
- the inverse quantization module 411 performs inverse quantization of the quantized data, followed by an inverse transform by inverse transform module 412.
- the intra prediction module 413 uses the prediction information to determine which predictor to use for a given block and the motion compensation module 414 actually adds the residual obtained by module 412 to the reference area obtained from the set of reference images 416.
- Post filtering is then applied by module 415 to filter the reconstructed frame of pixels.
- an SAO loop filter is used in which compensation offsets are added to the pixel values of the reconstructed pixels of the reconstructed image
- Figure 5 is a flow chart illustrating steps of loop filtering process according to at least one embodiment of the invention.
- In an initial step 51, the encoder generates a first reconstruction of the full frame.
- a deblocking filter is applied on this first reconstruction in order to generate a deblocked reconstruction 53.
- the aim of the deblocking filter is to remove block artifacts generated by residual quantization and block motion compensation or block Intra prediction. These artifacts are visually important at low bitrates.
- the deblocking filter operates to smooth the block boundaries according to the characteristics of two neighboring blocks. The encoding mode of each block, the quantization parameters used for the residual coding, and the neighboring pixel differences in the boundary are taken into account. The same criterion/classification is applied for all frames and no additional data is transmitted.
- the deblocking filter improves the visual quality of the current frame by removing blocking artifacts and it also improves the motion estimation and motion compensation for subsequent frames. Indeed, high frequencies of the block artifact are removed, and so these high frequencies do not need to be compensated for with the texture residual of the following frames.
- the deblocked reconstruction is filtered by a sample adaptive offset (SAO) loop filter in step 54 using SAO parameters 58 determined in accordance with embodiments of the invention.
- the resulting frame 55 may then be filtered with an adaptive loop filter (ALF) in step 56 to generate the reconstructed frame 57 which will be displayed and used as a reference frame for the following Inter frames.
- ALF adaptive loop filter
- each pixel of the frame region is classified into a class or group. The same offset value is added to every pixel value which belongs to a certain class or group.
- FIG. 6 illustrates a block diagram of a decoder 60 which may be used to receive data from an encoder according an embodiment of the invention.
- the decoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 311 of device 300, a corresponding step of a method implemented by the decoder 60.
- the decoder 60 receives a bitstream 61 comprising encoding units, each one being composed of a header containing information on encoding parameters and a body containing the encoded video data.
- the encoded video data is entropy encoded, and the motion vector predictors’ indexes are encoded, for a given block, on a predetermined number of bits.
- the received encoded video data is entropy decoded by module 62.
- the residual data are then dequantized by module 63 and then an inverse transform is applied by module 64 to obtain pixel values.
- the mode data indicating the coding mode are also entropy decoded and based on the mode, an INTRA type decoding or an INTER type decoding is performed on the encoded blocks of image data.
- an INTRA predictor is determined by intra prediction module 65 based on the intra prediction mode specified in the bitstream.
- the motion prediction information is extracted from the bitstream so as to find the reference area used by the encoder.
- the motion prediction information is composed of the reference frame index and the motion vector residual.
- the motion vector predictor is added to the motion vector residual in order to obtain the motion vector by motion vector decoding module 70.
- Motion vector decoding module 70 applies motion vector decoding for each current block encoded by motion prediction. Once an index of the motion vector predictor, for the current block has been obtained the actual value of the motion vector associated with the current block can be decoded and used to apply motion compensation by module 66. The reference image portion indicated by the decoded motion vector is extracted from a reference image/picture 68 to apply the motion compensation 66. The motion vector field data 71 is updated with the decoded motion vector in order to be used for the inverse prediction of subsequent decoded motion vectors. Finally, a decoded block is obtained. Post filtering is applied by post filtering module 67 similarly to post filtering module 415 applied at the encoder as described with reference to Figure 4. A decoded video signal 69 is finally provided by the decoder 60.
- the aim of SAO filtering is to improve the quality of the reconstructed frame by sending additional data in the bitstream, in contrast to the deblocking filter where no information is transmitted.
- each pixel is classified into a predetermined class or group and the same offset value is added to every pixel sample of the same class/group.
- One offset is encoded in the bitstream for each class.
- SAO loop filtering has two SAO types: an Edge Offset (EO) type and a Band Offset (BO) type.
- EO Edge Offset
- BO Band Offset
- An example of Edge Offset type is schematically illustrated in Figures 7A and 7B
- an example of Band Offset type is schematically illustrated in Figure 8.
- SAO filtering is applied CTU by CTU.
- the parameters needed to perform the SAO filtering (set of SAO parameters) are selected for each CTU at the encoder side and the necessary parameters are decoded and/or derived for each CTU at the decoder side.
- This offers the possibility of easily encoding and decoding the video sequence by processing each CTU at once without introducing delays in the processing of the whole frame.
- when SAO filtering is enabled, only one SAO type is used: either the Edge Offset type filter or the Band Offset type filter, according to the related parameters transmitted in the bitstream for each classification.
- One of the SAO parameters in HEVC is an SAO type parameter sao_type_idx which indicates for the CTU whether EO type, BO type or no SAO filtering is selected for the CTU concerned.
- the SAO parameters for a given CTU can be copied from the upper or left CTU, for example, instead of transmitting all the SAO data.
- One of the SAO parameters in HEVC is a sao_merge_up_flag, which when set indicates that the SAO parameters for the subject CTU should be copied from the upper CTU.
- Another of the SAO parameters in HEVC is a sao_merge_left_flag, which when set indicates that the SAO parameters for the subject CTU should be copied from the left CTU.
- SAO filtering may be applied independently for different color components (e.g. YUV) of the frame.
- one set of SAO parameters may be provided for the luma component Y and another set of SAO parameters may be provided for both chroma components U and V in common.
- one or more SAO parameters may be used as common filtering parameters for two or more color components, while other SAO parameters are dedicated (per-component) filtering parameters for the color components.
- the SAO type parameter sao_type_idx is common to U and V, and so is an EO class parameter which indicates a class for EO filtering (see below), whereas a BO class parameter which indicates a group of classes for BO filtering has dedicated (per-component) SAO parameters for U and V.
- Edge Offset type involves determining an edge index for each pixel by comparing its pixel value to the values of two neighboring pixels. Moreover, these two neighboring pixels depend on a parameter which indicates the direction of these two neighboring pixels with respect to the current pixel. These directions are the 0-degree (horizontal direction), 45-degree (diagonal direction), 90-degree (vertical direction) and 135-degree (second diagonal direction). These four directions are schematically illustrated in Figure 7A.
- the table of Figure 7B gives the offset value to be applied to the pixel value of a particular pixel "C" according to the values of the two neighboring pixels Cn1 and Cn2 at the decoder side.
- if the value of C is less than the two values of Cn1 and Cn2, the offset to be added to the pixel value of the pixel C is "+O1".
- if the value of C is less than one of the values of Cn1 or Cn2 and equal to the other, the offset to be added to this pixel sample value is "+O2".
- if the value of C is greater than one of the values of Cn1 or Cn2 and equal to the other, the offset to be applied to this pixel sample is "-O3".
- if the value of C is greater than the two values of Cn1 and Cn2, the offset to be applied to this pixel sample is "-O4".
- each offset (O1, O2, O3, O4) is encoded in the bitstream.
- the sign to be applied to each offset depends on the edge index (the Edge Index in the HEVC specifications) to which the current pixel belongs. According to the table represented in Figure 7B, for Edge Index 0 and for Edge Index 1 (O1, O2) a positive offset is applied. For Edge Index 3 and Edge Index 4 (O3, O4), a negative offset is applied to the current pixel.
- the direction for the Edge Offset amongst the four directions of Figure 7A is specified in the bitstream by a "sao_eo_class_luma" field for the luma component and a "sao_eo_class_chroma" field for both chroma components U and V.
- the SAO Edge Index corresponding to the index value is obtained by the following formula:
- EdgeIndex = sign(C - Cn2) - sign(Cn1 - C) + 2
- the computation of the difference between the pixel value of C and the pixel values of both its neighboring pixels Cn1 and Cn2 can be shared between the current pixel C and its neighbors.
- the term sign(Cn1 - C) has already been computed for the previous pixel (to be precise, it was computed as sign(C' - Cn2') at a time when the current pixel C' was the present neighboring pixel Cn1 and its neighboring pixel Cn2' was what is now the current pixel C).
- so this sign(Cn1 - C) does not need to be computed again.
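- As an illustration, the following is a minimal sketch of the edge-index computation and offset application described above (the helper names, the offset-table layout and the clipping are our assumptions, not the patent's):

```cpp
#include <algorithm>

// Sign function used by the SAO Edge Offset classification.
static inline int sign(int v) { return (v > 0) - (v < 0); }

// EdgeIndex = sign(C - Cn2) - sign(Cn1 - C) + 2, as in the formula above.
// Result is 0..4; index 2 means "none of the above" (pixel left unfiltered).
static inline int edgeIndex(int c, int cn1, int cn2) {
    return sign(c - cn2) - sign(cn1 - c) + 2;
}

// Applies Edge Offset to one pixel. offsets[] is indexed by edge index,
// holding {+O1, +O2, 0, -O3, -O4}; offsets[2] stays 0 so edge index 2 is
// a no-op. In a real implementation sign(Cn1 - C) would be reused from
// the previous pixel, as noted above.
static inline int applyEdgeOffset(int c, int cn1, int cn2,
                                  const int offsets[5], int maxVal) {
    return std::min(std::max(c + offsets[edgeIndex(c, cn1, cn2)], 0), maxVal);
}
```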
- Band Offset type in SAO also depends on the pixel value of the sample to be processed.
- a class in SAO Band offset is defined as a range of pixel values. Conventionally, for all pixels within a range, the same offset is added to the pixel value. In the HEVC specifications, the number of offsets for the Band Offset filter is four for each reconstructed block or frame area of pixels (CTU), as schematically illustrated in Figure 8.
- SAO Band offset splits the full range of pixel values into 32 ranges of the same size. These 32 ranges are the classes of SAO Band offset.
- Classifying the pixels into the 32 ranges of the full interval requires checking only 5 bits, which allows a fast implementation: only the first 5 bits (the 5 most significant bits) are checked to classify a pixel into one of the 32 classes/ranges of the full range.
- the bitdepth is 8 bits per pixel
- the maximum value of a pixel can be 255.
- the range of pixel values is between 0 and 255.
- each class contains 8 pixel values.
- the start of the band represented by the grey area (40), that contains four ranges or classes, is signaled in the bitstream to identify the position of the first class of pixels or the first range of pixel values.
- the syntax element representative of this position is the "sao_band_position" field in the HEVC specifications. This corresponds to the start of class 41 in Figure 8.
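- As an illustration, a minimal sketch of this band classification (assumed helper names; the 5 most significant bits select one of the 32 bands, and only the 4 consecutive bands starting at sao_band_position carry offsets):

```cpp
#include <algorithm>

// Classifies a pixel into one of the 32 bands using its 5 MSBs, then
// applies an offset only if it falls within the 4 signalled bands.
static inline int applyBandOffset(int pixel, int bitDepth,
                                  int saoBandPosition, const int offsets[4],
                                  int maxVal) {
    int band = pixel >> (bitDepth - 5);   // 5 MSBs: band index 0..31
    int idx  = band - saoBandPosition;    // position within the 4-band window
    if (idx < 0 || idx >= 4)
        return pixel;                     // outside the window: unfiltered
    return std::min(std::max(pixel + offsets[idx], 0), maxVal);
}
```

For 8-bit content (bitDepth = 8), the shift by 3 gives the 32 bands of 8 pixel values each mentioned above.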
- Figure 9 is a flow chart illustrating the steps of a process to decode SAO parameters according to the HEVC specifications.
- the process of Figure 9 is applied for each CTU to generate a set of SAO parameters for all components.
- a predictive scheme is used for the CTU mode. This predictive mode involves checking if the CTU on the left of the current CTU uses the same SAO parameters (this is specified in the bitstream through a flag named "sao_merge_left_flag").
- In step 503, the "sao_merge_left_flag" is read from the bitstream 502 and decoded. If its value is true, then the process proceeds to step 504 where the SAO parameters of the left CTU are copied for the current CTU. This enables the types for YUV of the SAO filter for the current CTU to be determined in step 508.
- If the outcome is negative in step 503, then the "sao_merge_up_flag" is read from the bitstream and decoded. If its value is true, then the process proceeds to step 505 where the SAO parameters of the above CTU are copied for the current CTU. This enables the types of the SAO filter for the current CTU to be determined in step 508.
- If the outcome is negative in step 505, then the SAO parameters for the current CTU are read and decoded from the bitstream in step 507, for the Luma component Y and for both the U and V components.
- the offsets for Chroma are independent.
- the parameters are thus obtained and the type of SAO filter is determined in step 508.
- In step 511, a check is performed to determine if the three colour components (Y and U & V) for the current CTU have been processed. If the outcome is positive, the determination of the SAO parameters for the three components is complete and the next CTU can be processed in step 510. Otherwise (only Y was processed), U and V are processed together and the process restarts from initial step 512 previously described.
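- A compact sketch of this merge-flag decision flow follows (the reader interface and data layout are our assumptions; in HEVC the flags are CABAC coded, and slice boundaries are ignored here for brevity):

```cpp
#include <vector>

// Minimal stand-in types for this sketch (not the patent's data model).
struct SaoParams { int typeIdx = 0; int offsets[4] = {0, 0, 0, 0}; };
struct CtuGrid {
    int widthInCtus = 0;
    std::vector<SaoParams> params;   // one entry per CTU, raster order
};

// Figure 9 decision flow for one CTU: try merge-left, then merge-up,
// otherwise read a new parameter set. 'read' stands in for the entropy
// decoder (assumed interface).
template <typename Reader>
SaoParams decodeSaoForCtu(Reader& read, const CtuGrid& grid, int ctuAddr) {
    bool hasLeft = (ctuAddr % grid.widthInCtus) != 0;
    bool hasUp   = ctuAddr >= grid.widthInCtus;
    if (hasLeft && read.flag("sao_merge_left_flag"))
        return grid.params[ctuAddr - 1];                  // step 504: copy left
    if (hasUp && read.flag("sao_merge_up_flag"))
        return grid.params[ctuAddr - grid.widthInCtus];   // step 505: copy above
    return read.saoParams();                              // step 507: new parameters
}
```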
- Figure 10 is a flow chart illustrating steps of a process of parsing of SAO parameters in the bitstream 601 at the decoder side.
- In an initial step 602, the "sao_type_idx_X" syntax element is read and decoded.
- the code word representing this syntax element can use a fixed length code or could use any method of arithmetic coding.
- sao_type_idx_X enables determination of the type of SAO applied for the frame area to be processed for the colour component Y or for both Chroma components U & V. For example, for a YUV 4:2:0 sequence, two components are considered: one for Y, and one for U and V.
- The "sao_type_idx_X" can take 3 values as follows, depending on the SAO type encoded in the bitstream: "0" corresponds to no SAO, "1" corresponds to the Band Offset case illustrated in Figure 8, and "2" corresponds to the Edge Offset type filter illustrated in Figures 7A and 7B.
- a test is performed to determine if the "sao_type_idx_X" is strictly positive. If "sao_type_idx_X" is equal to "0", this signifies that there is no SAO for this frame area (CTU) for Y if X is set equal to Y, and that there is no SAO for this frame area for U and V if X is set equal to U and V. The determination of the SAO parameters is then complete and the process proceeds to step 608. Otherwise, if the "sao_type_idx_X" is strictly positive, this signifies that SAO parameters exist for this CTU in the bitstream.
- In step 606, a loop of four iterations is performed.
- At each iteration, in step 607, the absolute value of offset j is read and decoded from the bitstream.
- These four offsets correspond either to the four absolute values of the offsets (01, 02, 03, 04) of the four Edge indexes of SAO Edge Offset (see Figure 7B) or to the four absolute values of the offsets related to the four ranges of the SAO band Offset (see Figure 8).
- MAX_abs_SAO_offset_value = (1 << (Min(bitDepth, 10) - 5)) - 1
- << is the left (bit) shift operator.
- This formula means that the maximum absolute value of an offset is 7 for a pixel value bitdepth of 8 bits, and 31 for a pixel value bitdepth of 10 bits and beyond.
- the current HEVC standard amendment addressing extended-bitdepth video sequences provides a similar formula for a pixel value having a bitdepth of 12 bits and beyond.
- the absolute value decoded may be a quantized value which is dequantized before it is applied to pixel values at the decoder for SAO filtering. An indication of whether or not this quantization is used is transmitted in the slice header.
- the sign is signaled in the bitstream as a second part of the offset if the absolute value of the offset is not equal to 0.
- the sign bit is bypass-coded when CABAC is used.
- After step 607, the process proceeds to step 603 where a test is performed to determine if the type of SAO corresponds to the Band Offset type (sao_type_idx_X equal to 1).
- the signs of the offsets for the Band Offset mode are decoded in steps 609 and 610, except for each offset that has a zero value, before the following step 604 is performed in order to read from the bitstream and decode the position "sao_band_position_X" of the SAO band as illustrated in Figure 8.
- otherwise (Edge Offset type), if X is set equal to Y, the read syntax element is "sao_eo_class_luma", and if X is set equal to U and V, the read syntax element is "sao_eo_class_chroma".
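- The offset range formula above translates directly into code; a small sketch (the function name is ours):

```cpp
#include <algorithm>

// Maximum absolute SAO offset value, as given above:
//   (1 << (min(bitDepth, 10) - 5)) - 1
// i.e. 7 for a bitdepth of 8 bits and 31 for 10 bits and beyond.
static inline int maxAbsSaoOffset(int bitDepth) {
    return (1 << (std::min(bitDepth, 10) - 5)) - 1;
}
```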
- FIG 11 is a flow chart illustrating how SAO filtering is performed on an image part according to the HEVC specifications, for example during the step 907 in Figure 6.
- this image part is a CTU.
- This same process is also applied in the decoding loop (step 715 in Figure 4) at the encoder in order to produce the reference frames used for the motion estimation and compensation of the following frames.
- this process relates to the SAO filtering for one color component (thus the suffix "_X" in the syntax elements has been omitted below).
- An initial step 701 comprises determining the SAO filtering parameters according to processes depicted in Figures 9 and 10.
- the SAO filtering parameters are determined by the encoder and the encoded SAO parameters are included in the bitstream. Accordingly, on the decoder side in step 701 the decoder reads and decodes the parameters from the bitstream.
- Step 701 gives the sao_type_idx and, if it equals 1, the sao_band_position 702, or, if it equals 2, the sao_eo_class_luma or sao_eo_class_chroma (according to the colour component processed). It may be noted that if the element sao_type_idx is equal to 0, the SAO filtering is not applied.
- Step 701 also gives the offsets table of the 4 offsets 703.
- a variable i, used to successively consider each pixel Pi of the current block or frame area (CTU), is set to 0 in step 704.
- in step 706, the pixel Pi is extracted from the frame area 705.
- the decision module 708 tests if Pi is in a class that is to be filtered using the conventional SAO filtering.
- if Pi is in a class to be filtered, the related class number j is identified and the related offset value Offset_j is extracted in step 710 from the offsets table 703. In some embodiments of the invention, steps 710 and 711 are carried out differently, as will be explained later in the description of those embodiments.
- otherwise, Pi is inserted in step 713 into the filtered frame area 716 without filtering.
- After step 713, the variable i is incremented in step 714 in order to filter the subsequent pixels of the current frame area 705 (if any - test 715).
- When test 715 indicates that all pixels have been processed, the filtered frame area 716 is reconstructed and can be added to the SAO reconstructed frame (see frame 908 of Figure 6 or 716 of Figure 4).
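- The per-pixel loop of Figure 11 could look like the following sketch, reusing applyBandOffset() and applyEdgeOffset() from the earlier sketches (the parameter bundle and its field names are our assumptions; Edge Offset is shown for the horizontal direction only, and border pixels of the frame area are left unfiltered for brevity):

```cpp
#include <vector>

// Assumed per-CTU SAO parameter bundle (names not from the patent).
struct CtuSaoParams {
    int typeIdx = 0;          // 0: no SAO, 1: Band Offset, 2: Edge Offset
    int bandPosition = 0;     // sao_band_position
    int bandOffsets[4] = {};  // BO offsets for the 4 signalled bands
    int eoOffsets[5] = {};    // EO offsets indexed by edge index (index 2 is 0)
};

// Applies SAO to one CTU / frame area stored in raster order.
void saoFilterCtu(std::vector<int>& pix, int width, int height,
                  const CtuSaoParams& p, int bitDepth) {
    if (p.typeIdx == 0) return;                    // sao_type_idx == 0: no SAO
    const int maxVal = (1 << bitDepth) - 1;
    const std::vector<int> src = pix;              // classify on unfiltered samples
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int i = y * width + x;
            if (p.typeIdx == 1) {
                pix[i] = applyBandOffset(src[i], bitDepth, p.bandPosition,
                                         p.bandOffsets, maxVal);
            } else if (x > 0 && x + 1 < width) {   // 0-degree EO neighbors
                pix[i] = applyEdgeOffset(src[i], src[i - 1], src[i + 1],
                                         p.eoOffsets, maxVal);
            }
        }
    }
}
```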
- Figure 12 is a flow chart illustrating steps carried out by an encoder to determine SAO parameters for the CTUs of a group (frame or slice) in the CTU-level derivation of SAO parameters.
- the process starts with a current CTU 1101.
- the process of Step 1102 is described below with reference to Figure 13.
- the RD cost for the SAO Merge Left is evaluated if the left CTU is in the current slice 1103, as is the RD cost of the SAO Merge Up (1104).
- based on CTUStats 1102, new SAO parameters are evaluated for Luma 1105 and for both Chroma components 1109.
- Figure 13 is a flow chart illustrating an example of the statistics computation performed at the encoder side, which can be applied for the Edge Offset type filter in the case of the conventional SAO filtering. A similar approach may also be used for the Band Offset type filter.
- Figure 13 illustrates the setting of the variable CTUStats containing all the information needed to derive the best rate-distortion offsets for each class. Moreover, it illustrates the selection of the best SAO parameters set for the current CTU. For each colour component Y, U, V (or RGB) 811, each SAO type is evaluated. For each SAO type 812, the variables Sum_j and SumNbPix_j are set to zero in an initial step 801. The current frame area 803 contains N pixels.
- j is the current range number used to determine the four offsets (related to the four edge indexes shown in Figure 7B for the Edge Offset type, or to the 32 ranges of pixel values shown in Figure 8 for the Band Offset type).
- Sum_j is the sum of the differences between the pixels in the range j and their original pixel values.
- SumNbPix_j is the number of pixels in the frame area whose pixel value belongs to the range j.
- a variable i, used to successively consider each pixel Pi of the current frame area, is set to zero. Then, the first pixel of the frame area 803 is extracted in step 804.
- In step 805, the class of the current pixel is determined by checking the conditions defined in Figure 7B. During this step, a check is also performed to determine if the class of the pixel value corresponds to the value "none of the above" of Figure 7B.
- If the outcome is positive, then the value "i" is incremented in step 808 in order to consider the next pixels of the frame area 803.
- If the outcome is negative (test 806), the next step is 807, where the related SumNbPix_j (i.e. the number of pixels for the class determined in step 805) is incremented and the difference between Pi and its original value Pi_org is added to Sum_j.
- the variable i is incremented in order to consider the next pixels of the frame area 803.
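- A sketch of this statistics accumulation for one Edge Offset direction (the structure and names are our assumptions; 'org' points to the original samples, 'rec' to the deblocked reconstruction, and edgeIndex() is the helper sketched earlier):

```cpp
// Per-class accumulators, mirroring Sum_j and SumNbPix_j above.
struct ClassStats {
    long long sum = 0;     // Sum_j: sum of (original - reconstructed)
    long long count = 0;   // SumNbPix_j: number of pixels in class j
};

// Accumulates the statistics over one frame area for the horizontal EO
// direction. Class 2 ("none of the above") is skipped, as in step 808.
void accumulateEoStats(const int* org, const int* rec,
                       int width, int height, ClassStats stats[5]) {
    for (int y = 0; y < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {   // border pixels skipped
            int i = y * width + x;
            int cls = edgeIndex(rec[i], rec[i - 1], rec[i + 1]);
            if (cls == 2) continue;             // not filtered
            stats[cls].sum   += org[i] - rec[i];
            stats[cls].count += 1;
        }
    }
}
```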
- each offset Offset_j is an optimal offset Oopt_j in terms of distortion, obtained as the average difference Oopt_j = Sum_j / SumNbPix_j.
- the encoder uses the statistics set in table CTUStats.
- the distortion can be obtained by the following formula: Distortion_j(O_j) = (SumNbPix_j × O_j × O_j − Sum_j × O_j × 2) >> Shift
- variable Shift is designed for a distortion adjustment.
- the distortion should be negative as SAO is a post filtering.
- the same computing is applied for Chroma components.
- the Lambda of the rate distortion cost is fixed for the three components.
- the rate is only 1 flag which is CABAC coded.
- the rate distortion value Jj is initialized to the maximum possible value. Then a loop on Oj from Ooptj to 0 is applied in step 902. Note that Oj is modified by 1 at each new iteration of the loop. If Ooptj is negative, the value Oj is incremented and if Ooptj is positive, the value Oj is decremented.
- the rate distortion cost related to Oj is computed in step 903 according to the following formula: J(Oj) = Distortion(Oj) + Lambda × R(Oj)
- R(Oj) is a function which provides the number of bits needed for the code word associated with Oj.
- This algorithm of Figures 13 and 14 provides a best ORDj for each class j. This algorithm is repeated for each of the four directions of Figure 7A. Then the direction that provides the best rate distortion cost (sum of Jj for each direction) is selected as the direction to be used for the current CTU.
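- A sketch of this offset search, using the distortion and cost formulas recalled above (the lambda handling and the rate model are our assumptions):

```cpp
#include <cmath>
#include <limits>

// Fast distortion estimate: (SumNbPix_j * O^2 - 2 * O * Sum_j) >> Shift.
static long long distortion(long long count, long long sum, int o, int shift) {
    return (count * o * o - 2 * o * sum) >> shift;
}

// Assumed simple rate model: codeword length grows with |O|.
static int rateBits(int o) { return std::abs(o) + 1; }

// Walks O from Oopt_j towards 0 (as in Figure 14) and keeps the offset
// with the smallest rate-distortion cost J(O) = D(O) + lambda * R(O).
int bestRdOffset(long long count, long long sum, int oOpt,
                 double lambda, int shift) {
    double bestJ = std::numeric_limits<double>::max();
    int bestO = 0;
    int step = (oOpt > 0) ? -1 : 1;
    for (int o = oOpt; o != step; o += step) {   // iterates down to 0 inclusive
        double j = distortion(count, sum, o, shift) + lambda * rateBits(o);
        if (j < bestJ) { bestJ = j; bestO = o; }
    }
    return bestO;
}
```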
- the next step involves finding the best position of the SAO band position of Figure 8. This is determined with the encoding process set out in Figure 15.
- the RD cost Jj for each range has been computed with the encoding process of Figure 14 with the optimal offset ORDj in terms of rate distortion.
- the rate distortion value J is initialized to the maximum possible value.
- a loop on the 28 positions j of 4 consecutive classes is run in step 1002.
- the variable Jj corresponding to the RD cost of the band (of 4 consecutive classes) is initialized to 0 in step 1003.
- the loop on the four consecutive offsets j is run in step 1004.
- Test 1008 checks whether or not the loop on the 28 positions has ended. If not, the process continues in step 1002, otherwise the encoding process returns the best band position as being the current value of sao_band_position 1009.
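- A sketch of this band-position search (assumed names; the per-class RD costs J[c] are taken as already computed with the process of Figure 14):

```cpp
#include <limits>

// Chooses sao_band_position by summing the per-class RD costs over each
// window of 4 consecutive classes, for the 28 positions of the text above.
int bestBandPosition(const double J[32]) {
    double bestCost = std::numeric_limits<double>::max();
    int bestPos = 0;
    for (int pos = 0; pos < 28; ++pos) {
        double cost = J[pos] + J[pos + 1] + J[pos + 2] + J[pos + 3];
        if (cost < bestCost) { bestCost = cost; bestPos = pos; }
    }
    return bestPos;
}
```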
- the CTUStats table in the case of determining the SAO parameters at the CTU level is created by the process of Figure 12. This corresponds to evaluating the CTU level in terms of the rate-distortion compromise. The evaluation may be performed for the whole image or for just the current slice.
- Figure 16 shows various different groupings 1201-1206 of CTUs in a slice.
- a first grouping 1201 has individual CTUs. This first grouping requires one set of SAO parameters per CTU. It corresponds to the CTU-level previously mentioned.
- a second grouping 1202 makes all CTUs of the entire image one group.
- all CTUs of the frame (and hence of the slice, which is either the entire frame or a part thereof) share the same SAO parameters.
- To make all CTUs of the image share the same SAO parameters one of two methods can be used. In both methods, the encoder first computes a set of SAO parameters to be shared by all CTUs of the image. Then, in the first method, these SAO parameters are set for the first CTU of the slice.
- then, for each subsequent CTU, the sao_merge_left_flag is set equal to 1 if the flag exists (that is, if the current CTU has a left CTU); otherwise, the sao_merge_up_flag is set equal to 1.
- Figure 17 shows an example of CTUs with SAO parameters set according to the first method. This method has the advantage that no signalling of the grouping to the decoder is required. Also, no changes to the decoder are required to introduce the groupings and only the encoder is changed. The groupings could therefore be introduced in an encoder based on HEVC without modifying the HEVC decoder. Surprisingly, the groupings do not increase the rate too much. This is because the merge flags are generally CABAC coded in the same context. Since for the second grouping (entire image) these flags all have the same value (1), the rate consumed by these flags is very low: as they always have the same value, their coded probability approaches 1.
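- As a sketch, this first sharing method reduces to writing the shared parameter set for the first CTU and raising merge flags everywhere else (the signalling structure and names are our assumptions, reusing CtuSaoParams from the earlier sketch):

```cpp
#include <cstddef>
#include <vector>

// Assumed per-CTU SAO signalling state (names not from the patent).
struct CtuSaoSignalling {
    bool mergeLeft = false;   // sao_merge_left_flag
    bool mergeUp = false;     // sao_merge_up_flag
    CtuSaoParams params;      // explicit parameters, used when no merge
};

// First sharing method at frame level: the shared SAO parameters go to
// CTU 0; every other CTU merges left when a left CTU exists, otherwise
// up, so an unmodified HEVC decoder propagates them across the frame.
void shareSaoOverFrame(std::vector<CtuSaoSignalling>& ctus,
                       std::size_t widthInCtus, const CtuSaoParams& shared) {
    ctus[0].params = shared;
    for (std::size_t ctu = 1; ctu < ctus.size(); ++ctu) {
        if (ctu % widthInCtus != 0)
            ctus[ctu].mergeLeft = true;   // has a left CTU in the same row
        else
            ctus[ctu].mergeUp = true;     // first CTU of a row: merge up
    }
}
```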
- the grouping is signalled to the decoder in the bitstream.
- the SAO parameters are also signalled as SAO parameters for the group (whole image), for example in the slice header.
- the signalling of the grouping consumes bandwidth.
- the merge flags can be dispensed with, saving the rate related to the merge flags, so that overall the rate is reduced.
- the first and second groupings 1201 and 1202 provide very different rate-distortion compromises.
- the first grouping 1201 is at one extreme, giving very fine control of the SAO parameters (CTU by CTU), which should lower distortion, but at the expense of a lot of signalling.
- the second grouping is at the other extreme, giving very coarse control of the SAO parameters (one set for the whole image), which raises distortion but has very light signalling.
- the determination is done for a whole image and all CTUs of the slice/frame share the same SAO parameters.
- Figure 18 is an example of the setting of SAO parameters at the frame/slice level using the first method of sharing SAO parameters (i.e. without new SAO classifications at the encoder side).
- This Figure is based on Figure 17.
- the CTUStats table is set for each CTU (in the same way as the CTU level encoding choice).
- This CTUStats can be used for the traditional CTU level 1302.
- the table FrameStats is set by adding each value for all CTUs of the table CTUStats 1303.
- the same process as for CTU level is applied to find the best SAO parameters 1305 to 1315.
- the SAO parameters set selected at step 1315 is set for the first CTU of the slice/frame 1316. Then, for each CTU from the second CTU to the last CTU of the slice/frame, the sao_merge_left_flag is set equal to 1 if it exists; otherwise the sao_merge_up_flag is set equal to 1 (indeed, for the second CTU to the last CTU, a merge Left or Up or both exist) 1317.
- the syntax of the SAO parameters set is unchanged from that presented in Figure 9. At the end of the process the SAO parameters are set for the whole slice/frame.
- CTUStats table in the case of determining the SAO parameters for the whole image (frame level) is created by the process of Figure 18. This corresponds to evaluating the frame level in terms of the rate-distortion compromise.
- the evaluations are then compared and the one with the best performance is selected.
- the example of determining the SAO parameters in Figure 18 corresponds to the first method of sharing SAO parameters as it uses the merge flags to share the SAO parameters among all CTUs of the image (see steps 1316 and 1317). These steps can be omitted if a second method of sharing SAO parameters is used as described in further embodiments below.
- FIG 19 is an example of the setting of SAO parameters sets for a third grouping 1203 at the encoder side.
- This Figure is based on Figure 12.
- the modules 1105 to 1115 have been merged into one step 1405 in this Figure 19.
- the CTUStats table is set for each CTU. This CTUStats can be used for the traditional CTU level 1302 encoding choice.
- the table ColumnStats is set by adding each value 1405 from CTUStats 1402, for each CTU of the current column 1404. Then the new SAO parameters are determined as for the CTU-level 1406 encoding choice (cf. Figure 12).
- the RD cost of sharing the SAO parameters with the previous left column is also evaluated 1407, in the same way as the sharing of the SAO parameters set between the left and up CTUs 1103, 1104 is evaluated. If the sharing of SAO parameters gives a better RD cost 1408 than the RD cost for the new SAO parameters set, the sao_merge_left_flag is set equal to 1 for the first CTU of the column. This CTU has the address number equal to the value "Column". Otherwise, the SAO parameters set for this first CTU of the column is set equal (1409) to the new SAO parameters obtained in step 1406.
- step 1412 can be processed once per frame.
- CTU grouping is another RD compromise between the CTU level encoding choice and the frame level which can be useful for some conditions.
- merge flags are used within the group, which means that the third grouping can be introduced without modifying the decoder (i.e. the grouping can be HEVC-compliant).
- the second method of sharing SAO parameters described in the third embodiment can be used instead. In that case, merge flags are not used in the group (CTU column) and steps 1411 and 1412 are omitted.
- the merge between columns doesn't need to be checked. It means that steps 1407, 1408 and 1410 are removed from the process of Figure 19.
- the advantage of removing this possibility is a simplification of the implementation and the ability to parallelize the process. This has a small impact on coding efficiency.
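The key point that evaluating a grouping costs only additions can be sketched as follows (a hypothetical SaoStats layout; a real implementation would keep such statistics per SAO type and class):

```cpp
#include <array>
#include <vector>

// Hypothetical per-CTU statistics gathered once during the CTU-level pass:
// for each SAO class, the accumulated sample difference and sample count,
// from which offsets and RD costs can be derived without re-classification.
struct SaoStats {
    std::array<long long, 5> sumDiff{};  // sum of (original - reconstructed)
    std::array<long long, 5> count{};    // number of samples in each class

    void add(const SaoStats& other) {
        for (int c = 0; c < 5; ++c) {
            sumDiff[c] += other.sumDiff[c];
            count[c]   += other.count[c];
        }
    }
};

// Step 1405: ColumnStats for column 'col' is just the sum of the CTUStats
// entries of the CTUs in that column; no new classification pass is needed.
SaoStats buildColumnStats(const std::vector<SaoStats>& ctuStats,
                          int ctusPerRow, int numRows, int col)
{
    SaoStats columnStats;
    for (int row = 0; row < numRows; ++row)
        columnStats.add(ctuStats[row * ctusPerRow + col]);
    return columnStats;
}
```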
- Another possible compromise, intermediate between the CTU level and the frame level, can be offered by a fourth grouping 1204 in Figure 16, which makes a line of CTUs a group.
- a similar process to that of Figure 18 can be applied.
- the variable ColumnStats is replaced by LineStats.
- the new SAO parameters and the merge with the up CTU are evaluated based on this LineStats table (steps 1406, 1407).
- step 1410 is replaced by the setting of the sao_merge_up_flag to 1 for the first CTU of the line. And for all CTUs of the slice/frame except the first CTU of each line, the sao_merge_left_flag is set equal to 1.
- the advantage of the line grouping is that it offers another RD compromise between the CTU level and the frame level. Please note that the frame or slice is most of the time a rectangle whose width is larger than its height. So the line CTU grouping 1204 is expected to be an RD compromise closer to the frame CTU grouping 1202 than the column CTU grouping 1203. As for the other CTU groupings 1202 and 1203, the line CTU grouping can be HEVC-compliant if the merge flags are used within the groups.
- RD compromises can be offered by putting two or more columns of CTUs or two or more lines of CTUs together as a group.
- the process of Figure 18 can be adapted to determine SAO parameters to such groups.
- the number N of columns or lines in a group may depend on the number of groups that are targeted.
- the merge between these groups containing two or more columns or two or more lines doesn’t need to be evaluated.
- Another possible grouping includes split columns or split lines, where the split is tailored to the current slice/frame.
- Another possible compromise between the CTU level and the frame level can be offered by square CTU groupings 1205 and 1206 as illustrated in Figure 16.
- the grouping 1205 makes 2x2 CTUs a group.
- the grouping 1206 makes 3x3 CTUs a group.
- Figure 20 shows an example of how to determine the SAO parameters for such groupings.
- NxNStats 1507 is set 1504, 1505, 1506 based on CTUStats. This table is used to determine the new SAO parameters 1508 and its RD cost, in addition to the RD cost for a left 1510 sharing or up 1509 sharing of SAO parameters.
- if the best RD cost is that of the new SAO parameters, the SAO parameters of the first CTU (top-left CTU) of the NxN group are set equal to these new SAO parameters 1514. If the best RD cost is the sharing of SAO parameters with the up NxN group 1512, the sao_merge_up_flag of the first CTU (top-left CTU) of the NxN group is set equal to 1 and the sao_merge_left_flag to 0 1515. If the best RD cost is the sharing of SAO parameters with the left NxN group 1513, the sao_merge_left_flag of the first CTU (top-left CTU) of the NxN group is set equal to 1, 1516. Then the sao_merge_left_flag and sao_merge_up_flag are set correctly for the other CTUs of the NxN group in order to form the SAO parameters for the current NxN group 1517.
- Figure 21 illustrates this setting for a 3x3 SAO group.
- the top-left CTU is set equal to the SAO parameters determined in steps 1508 to 1516.
- for the other CTUs of the first line of the group, the sao_merge_left_flag is set equal to 1.
- the sao_merge_left_flag is the first flag encoded or decoded and, as it is set to 1, there is no need to set the sao_merge_up_flag to 0.
- for the first CTU of each subsequent line of the group, the sao_merge_left_flag is set equal to 0 and the sao_merge_up_flag is set equal to 1.
- for the remaining CTUs of the group, the sao_merge_left_flag is set equal to 1.
- these groupings can be HEVC-compliant if merge flags within the groups are used.
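The merge-flag pattern of Figure 21 can be summarised by the following sketch (hypothetical names; positions are relative to the top-left CTU of the group):

```cpp
// Merge-flag pattern inside one NxN group (step 1517 / Figure 21), as a
// sketch with hypothetical names; (x, y) is the CTU position relative to
// the top-left CTU of the group.
struct MergeFlags { bool left = false; bool up = false; };

MergeFlags flagsInsideGroup(int x, int y)
{
    MergeFlags f;
    if (x == 0 && y == 0) {
        // top-left CTU: carries the explicit SAO parameters set, no merge
    } else if (y == 0) {
        f.left = true;   // other CTUs of the first line: merge left
    } else if (x == 0) {
        f.up = true;     // first CTU of each other line: merge up
    } else {
        f.left = true;   // remaining CTUs: merge left
    }
    return f;
}
```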
- the test of merge left and merge up between groups can be dispensed with in Figure 20. So steps 1509, 1510, 1512, 1513, 1515 and 1516 can be removed, especially when N is high.
- the value N depends on the size of the frame/slice.
- N = 2 and N = 3 are evaluated. This offers an efficient compromise.
- the possible groupings are in competition with one another as candidates for the SAO parameter derivation to be selected for the current slice.
- An example of how to select the SAO parameter derivation using a rate-distortion compromise comparison is described below according to a sixth embodiment of the invention with reference to Figure 22.
- Figure 23 is a flow chart illustrating a decoding process when the CTU grouping is signaled in the slice header according to the second method of sharing SAO parameters among the CTUs of the group.
- the corresponding CTUs grouping index 1804 is used to select the CTUs grouping method 1805. This grouping method will be applied to extract the SAO syntax and to determine the SAO parameters set for each CTU 1806. Then the next slice header syntax element is decoded.
- the CTUs grouping index uses a unary max code in the slice header.
- the CTUs groupings are ordered according to their probabilities of occurrences (highest to lowest).
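For illustration, a truncated unary ("unary max") code can be written and read as in the following sketch (a hypothetical bit-level API; a real codec would go through its entropy coder):

```cpp
#include <cstddef>
#include <vector>

// Truncated unary ("unary max") coding of the CTU grouping index (sketch).
// Index idx is coded as idx '1' bits followed by a terminating '0', except
// that the maximum index needs no terminator, as the decoder knows maxIdx.
void writeUnaryMax(std::vector<int>& bits, int idx, int maxIdx)
{
    for (int i = 0; i < idx; ++i) bits.push_back(1);
    if (idx < maxIdx) bits.push_back(0);
}

int readUnaryMax(const std::vector<int>& bits, std::size_t& pos, int maxIdx)
{
    int idx = 0;
    while (idx < maxIdx && bits[pos++] == 1) ++idx;
    return idx;
}
```

With the groupings ordered from most to least probable, the most probable grouping then costs a single bit in the slice header.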
- at least one SAO parameter derivation is an intermediate level derivation (SAO parameters neither at CTU level nor at frame level).
- Each subdivided part is made up of two or more said image parts (CTUs).
- the advantage of the intermediate level derivation(s) is introduction of one or more effective rate-distortion compromises.
- the intermediate level derivation(s) can be used without the CTU-level derivation or without the frame-level derivation or without either of those two derivations.
- the smallest grouping is the first grouping 1201 in which each CTU is a group and there is one set of SAO parameters per CTU.
- a set of SAO parameters can be applied to a smaller block than the CTU.
- the derivation is not at the CTU level, frame level or an intermediate level between the CTU and frame levels but at a sub-CTU level (a level smaller than an image part).
- index 0 means that each CTU is divided into 16 blocks and each may have its own SAO parameters.
- Index 1 means that each CTU is divided into 4 blocks, again each having its own SAO parameters.
- the selected derivation is then signalled to the decoder in the bitstream.
- the signalling may comprise a depth syntax element (e.g. using the indexing scheme above).
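Assuming 64x64 CTUs and the indexing example above, the depth syntax element could be interpreted as in this sketch (the default branch is an assumption for indices not described here):

```cpp
// Hypothetical interpretation of the depth syntax element for a 64x64 CTU,
// following the indexing example above (index 0: 16 sub-blocks, index 1:
// 4 sub-blocks); the default branch is an assumption for other indices.
int saoBlocksPerCtu(int depthIdx)
{
    switch (depthIdx) {
        case 0:  return 16;  // 4x4 grid of 16x16 blocks, one SAO set each
        case 1:  return 4;   // 2x2 grid of 32x32 blocks, one SAO set each
        default: return 1;   // one SAO parameters set for the CTU (or coarser)
    }
}
```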
- at least one derivation, when applied to a group, causes the group to be subdivided into subdivided parts and derives SAO parameters for each of the subdivided parts, and each image part is made up of two or more said subdivided parts.
- the first derivation, when applied to a group, causes the group to have SAO parameters at a first level and the second derivation, when applied to a group, causes the group to have SAO parameters at a second level different from the first level.
- the levels may be any two levels from the frame level to a sub-CTU level.
- the levels may correspond to the groupings 1201-1206 in Figure 12.
- the derivation of SAO parameters is signalled for a slice, which means that the derivation is used for all CTUs of the slice.
- a derivation may be selected per CTU group (e.g. each column of CTUs) of the slice or frame.
- the SAO merge flags are usable between groups of the CTU grouping. As depicted in Figure 24, for the 2x2 CTU grouping, the SAO merge left and SAO merge up flags are kept for each group of 2x2 CTUs. But they are removed for CTUs inside the group. Please note that only the sao_merge_left_flag is used for the grouping 1203 of a column of CTUs, and only the sao_merge_up_flag is used for the grouping 1204 of a line of CTUs.
- a flag signals if the current CTU group shares its SAO parameters or not. If it is true, a syntax element representing one of the previous groups is signalled. So each group of a slice can be predicted by a previous group except the first one. This improves the coding efficiency by adding several new possible predictors.
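A possible decoder-side reading of this "share" flag plus predictor index is sketched below (hypothetical syntax and entropy-decoding callbacks; not the actual syntax of the disclosure):

```cpp
#include <vector>

// Sketch of the group-prediction signalling described above: each CTU group
// except the first may copy its SAO parameters from any previously decoded
// group of the slice.
struct SaoParams { /* SAO type, class, offsets ... */ };

SaoParams decodeGroupParams(const std::vector<SaoParams>& previousGroups,
                            bool (*readShareFlag)(),
                            int (*readGroupIndex)(int maxVal),
                            SaoParams (*readNewParams)())
{
    if (!previousGroups.empty() && readShareFlag()) {
        int predIdx = readGroupIndex((int)previousGroups.size() - 1);
        return previousGroups[predIdx];   // copy from the chosen predictor
    }
    return readNewParams();               // explicit SAO parameters set
}
```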
- a default set of SAO parameters can be used when the collocated CTU does not use SAO or when none of the collocated CTUs uses SAO.
- the default set depends on the selected grouping. For example, a first default set may be associated with one grouping (or one level of SAO parameters) and a second default set may be associated with another grouping (or another level of SAO parameters).
- the size of the groups (or the level of SAO parameters) is found to have an influence on which SAO parameters work efficiently as the default set.
- the different default sets may be determined by the encoder and transmitted to the decoder in the sequence parameter set. Then, the decoder uses the appropriate default set according to the grouping selected for the current slice.
- a depth of the SAO parameters was selected for a slice, including depths smaller than a CTU, making it possible to have a set of SAO parameters per block in a CTU.
- Embodiments of the present invention described below are intended to improve the coding efficiency of SAO by using various techniques for determining one or more SAO parameters of an image part in a current image.
- In a first group of embodiments it is proposed to improve the use of syntax elements enabling, for one image part (for instance a CTU), the inferring of SAO parameters from another image part (another CTU), through the flags sao_merge_up_flag and sao_merge_left_flag.
- the method comprises:
- if the image part is of the predetermined group, then including in the bitstream a first syntax element, for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part,
- Figure 25 is a flow chart illustrating the steps of a process that may be implemented in an encoder to encode SAO parameters according to the first embodiment. More precisely, this figure illustrates, as an example, the signalling of the inferring or not of SAO parameters provided for CTUs or groups of CTUs within a bitstream 4114, based on the CTU grouping.
- the encoder first checks in a test 4102 whether an information, for example a grouping index 4101, is set equal to the predetermined level, for example the CTU level 4102.
- the SAO merge flags are not included in the bitstream if said image part is of a group comprising at least two image parts.
- if the CTU is of a group made up of at least two CTUs, the SAO parameters for filtering the image part are included in the bitstream.
- the first syntax element may be included in the bitstream. More precisely, if the image part is of a group comprising partitioned image parts, then the first syntax element may be included in the bitstream.
- the first syntax element is included in the bitstream.
- the sao_merge_left_flag is inserted in a step 4104 in the bitstream 4114.
- if the sao_merge_left_flag is equal to false (or the value '0') in test 4105 and if the up CTU (meaning the CTU located above the processed CTU) exists in a test 4109, the sao_merge_up_flag is inserted in a step 4107 in the bitstream 4114.
- following test 4108, a new SAO parameters set is inserted in the bitstream in steps 4111, 4112 and 4113.
- step 4112 may also comprise the insertion of a flag, which is a second syntax element, associated with the group of the image part, for signalling whether the use of the first syntax element(s) is enabled or disabled.
- the sao_merge_flags_enabled flag may be included in the bitstream based on a criterion (for instance the group the CTU belongs to, or the prediction or encoding mode which is used), for signalling whether the use of the SAO merge flags (signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part) is enabled or disabled.
- the index or the first or second syntax element, when included in the bitstream, may be inserted, for example, in a slice header or in a parameter set.
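The branching of Figure 25 can be summarised by the following sketch (hypothetical writer callbacks; step numbers refer to the figure):

```cpp
// Sketch of the signalling choice of Figure 25: the merge flags are written
// only for the predetermined (CTU-level) grouping, otherwise a new SAO
// parameters set is always coded.
enum class Grouping { CtuLevel, Column, Line, NxN, Frame };

void writeCtuSaoSignalling(Grouping grouping, bool leftExists, bool upExists,
                           bool mergeLeft, bool mergeUp,
                           void (*writeFlag)(bool), void (*writeNewParams)())
{
    bool merged = false;
    if (grouping == Grouping::CtuLevel) {          // test 4102
        if (leftExists) writeFlag(mergeLeft);      // step 4104
        merged = leftExists && mergeLeft;
        if (!merged && upExists) {                 // tests 4105 and 4109
            writeFlag(mergeUp);                    // step 4107
            merged = mergeUp;
        }
    }
    if (!merged) writeNewParams();                 // steps 4111 to 4113
}
```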
- Figure 26 is a flow chart illustrating the steps of a process to parse SAO parameters, as an alternative to the process illustrated in Figure 9, and in relation to the encoding steps implemented in an encoder as described in Figure 25. The steps are preferably implemented in a decoder.
- This second embodiment proposes a method of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups.
- the method comprises:
- the process of Figure 26 is applied for each CTU to generate a set of SAO parameters for all components.
- the information may be the grouping index 4013 previously mentioned. It is tested whether it is set equal to a value indicating a predetermined group or a set of predetermined groups, for example corresponding to the CTU level index 4014.
- Said CTU level index may have been previously decoded from the header, for example. In that case, the sao_merge_left_flag (first syntax element) is extracted in a step 4003 from a bitstream 4002 and, if needed, the sao_merge_up_flag (first syntax element) is also extracted in a step 4005 from the bitstream 4002.
- if the grouping index is not set equal to the CTU level index in the test 4014, the flags sao_merge_left_flag and sao_merge_up_flag are not considered and new SAO parameters are extracted in a step 4007 from the bitstream 4002.
- steps 4004, 4006 and 4008-4012 are respectively similar to steps 504, 506, 508-512 in Figure 9 previously described.
- the merge flags are kept for the CTU level but removed for all other CTU groupings, as illustrated in Figure 26.
- the advantage is a flexibility of the CTU level.
- if said image part is of a group comprising at least two image parts, then the first syntax element(s) is (are) not obtained.
- the merge flags are used for a CTU when the SAO signalling is at or below the CTU level (1/16, 1/8 or 1/4 CTU) and removed for other CTU groupings having larger groups.
- if said image part is of a group made up of at least two image parts, the first syntax element(s) is (are) not included in the bitstream.
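For illustration, the decoder-side parsing of the second embodiment could look like the following sketch (hypothetical reader callbacks):

```cpp
// Decoder-side counterpart (sketch): the merge flags are parsed only when
// the grouping index equals the CTU level index (test 4014); otherwise new
// SAO parameters are read directly (step 4007).
struct SaoParamSet { /* SAO type, class, offsets ... */ };

SaoParamSet parseCtuSao(int groupingIdx, int ctuLevelIdx,
                        bool leftExists, bool upExists,
                        bool (*readFlag)(),
                        SaoParamSet (*copyLeft)(), SaoParamSet (*copyUp)(),
                        SaoParamSet (*readNewParams)())
{
    if (groupingIdx == ctuLevelIdx) {                     // test 4014
        if (leftExists && readFlag()) return copyLeft();  // sao_merge_left_flag
        if (upExists && readFlag())   return copyUp();    // sao_merge_up_flag
    }
    return readNewParams();                               // step 4007
}
```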
- Fourth embodiment: Figure 27 is a flow chart illustrating a variant of the second embodiment of Figure 26.
- a new test 4214 evaluates if the value of a syntax element, the flag sao_merge_flags_enabled (second syntax element), is equal to true, to enable the decoding of the flags sao_merge_left_flag and sao_merge_up_flag tested in tests 4203 and 4205, instead of checking the value of the grouping index as illustrated in Figure 26.
- the information is a second syntax element associated with the group of the image part, signalling whether the use of the first syntax elements (sao_merge_left_flag and sao_merge_up_flag) is enabled or disabled.
- In a fifth embodiment it is proposed a method of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups.
- the method comprises:
- the predetermined criterion is the fact that said image part is of a predetermined group or not.
- the first or second syntax elements are parsed from the bitstream.
- a default set of SAO parameters can be used when the collocated CTU does not use SAO or when none of the collocated CTUs uses SAO.
- the default set depends on the selected depth of SAO parameters. For example, a first default set may be associated with one depth (for example 1/16) and a second default set may be associated with another depth (for example 1/4). The depth is found to have an influence on which SAO parameters work efficiently as the default set.
- the different default sets may be determined by the encoder and transmitted to the decoder in the sequence parameter set. Then, the decoder uses the appropriate default set according to the depth selected for the current slice.
- one possibility is to remove the SAO merge flags for all levels. It means that steps 503, 504, 505 and 506 of Figure 9 are removed.
- the advantage is that it reduces significantly the signalling of SAO and consequently it reduces the bitrate. Moreover, it simplifies the design by removing two syntax elements at CTU level.
- the merge flags are important for small block sizes because an SAO parameters set is costly compared to the amount of samples that it can improve. In that case, these syntax elements reduce the cost of SAO parameters signalling. For large groups, the SAO parameters set is less costly, so the usage of merge flags is not efficient. So the advantage of these embodiments is a coding efficiency increase.
- Figure 22 illustrates this embodiment. More precisely Figure 22 illustrates an example of how to select the SAO parameter derivation using a rate-distortion compromise comparison.
- One possibility to increase the coding efficiency at the encoder side is to test all possible SAO groupings, but this should increase the encoding time compared to the example of Figure 22.
- the current slice/frame 1701 is used to set the CTUStats table 1703 for each CTU 1702. This table 1703 is used to evaluate the CTU level 1704, the frame/slice grouping and the other CTU groupings.
- the best CTUs grouping is selected according to the rate distortion criterion computed for each grouping 1710.
- the SAO parameters sets for each CTU are set (1711) according to the grouping selected in step 1710. These SAO parameters 1712 are then used to apply the SAO filtering 1713 in order to obtain the filtered frame/slice.
- the SAO parameters for each CTU 1711 are then inserted into the bitstream as described in Figure 9.
- the advantage of this embodiment is that it doesn't require any modification of HEVC SAO at the decoder side, so this method is HEVC compliant.
- the main advantage is a coding efficiency increase.
- the second advantage is that this competition method doesn't require any additional SAO filtering or classification. Indeed, the main impacts on encoder complexity are step 1702, which needs SAO classification for all possible SAO types, and step 1713, which filters the samples. All other CTU grouping evaluations are only some additions of values already obtained during the CTU level encoding choice (set in the table CTUStats).
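The selection of step 1710 is a standard Lagrangian rate-distortion decision, sketched below (hypothetical types; the per-grouping distortion and rate are assumed to be derived from the CTUStats additions described above):

```cpp
#include <limits>
#include <vector>

// Sketch of the grouping competition of Figure 22: each candidate grouping
// is evaluated from the already computed CTUStats table and the grouping
// with the lowest rate-distortion cost is kept (step 1710).
struct GroupingCandidate {
    int    id;          // e.g. CTU level, column, line, 2x2, 3x3, frame
    double distortion;  // distortion after SAO with this grouping
    double rate;        // signalling cost of its SAO parameters sets
};

int selectBestGrouping(const std::vector<GroupingCandidate>& candidates,
                       double lambda)  // Lagrangian multiplier
{
    int bestId = -1;
    double bestCost = std::numeric_limits<double>::max();
    for (const auto& c : candidates) {
        double cost = c.distortion + lambda * c.rate;  // J = D + lambda * R
        if (cost < bestCost) { bestCost = cost; bestId = c.id; }
    }
    return bestId;
}
```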
- the competition between the different permitted SAO parameters derivations is modified so that only one derivation is permitted in the encoder for any given slice or frame.
- the permitted derivation may be determined in dependence upon one or more characteristics of the slice or frame.
- the permitted derivation may be selected based on the slice type (Intra, Inter P, Inter B), quantization level (QP) of the slice, or position in the hierarchy of a Group of Pictures (GOP).
- the advantage of this embodiment is a complexity reduction. Instead of evaluating two or more competing derivations, just one derivation is selected, which can be useful for a hardware encoder.
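A fixed selection rule of this kind might look like the following sketch (the slice-type, QP and GOP-hierarchy criteria come from the text above, but the particular thresholds and choices here are hypothetical):

```cpp
// Sketch of a fixed, competition-free selection rule; thresholds and the
// chosen derivations are illustrative assumptions only.
enum class SliceType { Intra, InterP, InterB };
enum class Derivation { SubCtu, CtuLevel, Column, Line, Frame };

Derivation pickDerivation(SliceType type, int qp, int gopLayer)
{
    if (type == SliceType::Intra) return Derivation::CtuLevel; // fine control
    if (gopLayer == 0 || qp < 27) return Derivation::Column;   // higher quality
    return Derivation::Frame;                                  // coarse, cheap
}
```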
- a first derivation is associated with first groups of the image (e.g. Intra slices) and a second derivation is associated with second groups of the image (e.g. Inter P slices). It is determined whether a group to be filtered is a first group or a second group. If it is determined that the group to be filtered is a first group, the first derivation is used to filter the image parts of the group, and if it is determined that the group to be filtered is a second group, the second derivation is used to filter the image parts of the group.
- Whether a group to be filtered is determined to be a first group or a second group may depend on one or more of: the slice type, the quantization level (QP) of the slice, or the position in the hierarchy of a Group of Pictures (GOP).
- the first derivation may have fewer image parts per group than the second derivation.
- a particular derivation of the SAO parameters was selected for a given slice or frame.
- if the encoder has the capacity to evaluate a limited number of competing derivations, it is unnecessary to eliminate the competition altogether.
- the competition for a given slice or frame is still permitted but the set of competing derivations is adapted to the slice or frame.
- the set of competing derivations may depend on the slice type.
- the set preferably contains groupings with groups containing small numbers of CTUs (e.g. CTU level, 2x2 CTU, 3x3 CTU, and Column). Also, if depths lower than a CTU are available (as in the tenth embodiment), these depths are preferably also included.
- the set of derivations preferably contains groupings with groups containing large numbers of CTUs, such as Line or Frame level. However, smaller groupings can also be considered, down to the CTU level.
- the advantage of this embodiment is a coding efficiency increase thanks to the use of derivations adapted for a slice or frame.
- the set of derivations can be different for an Inter B slice from that for an Inter P slice.
- the set of competing derivations depends on the characteristics of the frame in the GOP. This is especially beneficial for frames which vary in quality (QP) based on a quality hierarchy. For the frames with the highest quality or highest position in the hierarchy, the set of competing derivations should include groups containing few CTUs or even sub-CTU depths (same as for Intra slices above). For frames with a lower quality or lower position in the hierarchy, the set of competing derivations should include groups with more CTUs.
- the set of competing derivations can be defined in the sequence parameters set.
- a first set of derivations is associated with first groups of the image (e.g. Intra slices) and a second set of derivations is associated with second groups of the image (e.g. Inter P slices). It is determined whether a group to be filtered is a first group or a second group. If it is determined that the group to be filtered is a first group, a derivation is selected from the first set of derivations and used to filter the image parts of the group, and if it is determined that the group to be filtered is a second group, a derivation is selected from the second set of derivations and used to filter the image parts of the group. Evaluation of derivations not in the associated set of derivations is not required.
- Whether a group to be filtered is a first group or a second group may be determined as in the preceding embodiment. For example, when the first groups have a higher quality or higher position in the quality hierarchy than the second groups, the first set of derivations may have at least one derivation with fewer image parts per group than the derivations of the second set of derivations.
- the set of CTUs groupings can be defined in the sequence parameters set.
- the seventh embodiment proposes a method of encoding an image comprising a plurality of image parts.
- the method comprises
- the image part is predicted from another image part within said image, using an intra prediction mode, or from another image part within a reference image other than said image, using an inter prediction mode.
- Figure 28 shows a system 191, 195 comprising at least one of an encoder 150 or a decoder 100 and a communication network 199 according to embodiments of the present invention.
- the system 195 is for processing and providing a content (for example, a video and audio content for displaying/outputting or streaming video/audio content) to a user, who has access to the decoder 100, for example through a user interface of a user terminal comprising the decoder 100 or a user terminal communicating with the decoder 100.
- Such a user terminal may be a computer, a mobile phone, a tablet or any other type of device capable of providing/displaying the content to the user.
- the system 195 obtains/receives a bitstream 101 (in the form of a continuous stream or a signal - e.g. while earlier video/audio are being displayed/output) via the communication network 199.
- the system 191 is for processing a content and storing the processed content, for example a video and audio content processed for displaying/outputting/streaming at a later time.
- the system 191 obtains/receives a content comprising an original sequence of images 151, which is received and processed (including filtering with a deblocking filter according to the present invention) by the encoder 150, and the encoder 150 generates a bitstream 101 that is to be communicated to the decoder 100 via a communication network 199.
- the bitstream 101 is then communicated to the decoder 100 in a number of ways, for example it may be generated in advance by the encoder 150 and stored as data in a storage apparatus in the communication network 199 (e.g. on a server or a cloud storage) until a user requests the content (i.e. the bitstream data) from the storage apparatus, at which point the data is communicated/streamed to the decoder 100 from the storage apparatus.
- the system 191 may also comprise a content providing apparatus for providing/streaming, to the user (e.g. by communicating data for a user interface to be displayed on a user terminal), content information for the content stored in the storage apparatus (e.g. the title of the content and other meta/storage location data for identifying, selecting and requesting the content), and for receiving and processing a user request for a content so that the requested content can be delivered/streamed from the storage apparatus to the user terminal.
- the encoder 150 generates the bitstream 101 and communicates/streams it directly to the decoder 100 as and when the user requests the content.
- the decoder 100 then receives the bitstream 101 (or a signal) and performs filtering with a deblocking filter according to the invention to obtain/generate a video signal 109 and/or audio signal, which is then used by a user terminal to provide the requested content to the user.
- the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit.
- Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol.
- computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a communication medium such as a signal or carrier wave.
- Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure.
- a computer program product may include a computer-readable medium.
- such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
- any connection is properly termed a computer-readable medium.
- For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
- Disk and disc include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
- instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
- the term "processor", as used herein, may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein.
- the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
- the techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set).
- Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
It is proposed a method of signalling, in a bitstream, a Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups. The method comprises: determining whether an image part is of a predetermined group, and if the image part is of the predetermined group, then including in the bitstream a first syntax element, for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part, else not including the first syntax element.
Description
VIDEO CODING AND DECODING
Recently, the Joint Video Experts Team (JVET), a collaborative team formed by MPEG and ITU-T Study Group 16's VCEG, commenced work on a new video coding standard referred to as Versatile Video Coding (VVC). The goal of VVC is to provide significant improvements in compression performance over the existing HEVC standard (i.e., typically twice as much as before) and to be completed in 2020. The main target applications and services include, but are not limited to, 360-degree and high-dynamic-range (HDR) videos. In total, JVET evaluated responses from 32 organizations using formal subjective tests conducted by independent test labs. Some proposals demonstrated compression efficiency gains of typically 40% or more when compared to using HEVC. Particular effectiveness was shown on ultra-high definition (UHD) video test material. Thus, we may expect compression efficiency gains well beyond the targeted 50% for the final standard.
The JVET exploration model (JEM) uses all the HEVC tools. One of these tools is sample adaptive offset (SAO) filtering. However, SAO is less efficient in the JEM reference software than in the HEVC reference software. This arises from fewer evaluations and from signalling inefficiencies compared to other loop filters.
US 9769450 discloses an SAO filter for three dimensional or 3D Video Coding or 3DVC such as implemented by the HEVC standard. The filter directly re-uses SAO filter parameters of an independent view or a coded dependent view to encode another dependent view, or re-uses only part of the SAO filter parameters of the independent view or a coded dependent view to encode another dependent view. The SAO parameters are re-used by copying them from the independent view or coded dependent view.
US 2014/0192860 Al relates to the scalable extension of HEVC. HEVC scalable extension aims at allowing coding/decoding of a video made of multiple scalability layers, each layer being made up of a series of frames. Coding efficiency is improved by inferring, or deriving, SAO parameters to be used at an upper layer (e.g. an enhancement layer) from the SAO parameters actually used at a lower (e.g. base) layer. This is because inferring some SAO parameters makes it possible to avoid transmitting them.
US 2013/0051455 A1 describes the merging process for SAO parameters. SAO parameters for Coding Tree Units are grouped in order to reduce the encoder delay. It is proposed to provide SAO parameters for a region (for example a row) that enable the determination of SAO parameters for the filtering regions to be performed in parallel.
The HEVC standard proposes to signal whether the SAO parameter set (sao_merge_up_flag, sao_merge_left_flag) is derived or not from the above Coding Tree Unit.
It is desirable to improve the coding efficiency of images subjected to the SAO filtering.
Different aspects of the present invention are described below.
In a first aspect, it is also proposed a method and a corresponding device for signalling a Sample Adaptive Offset (SAO) filtering. The first aspect also concerns a method and a corresponding device for performing a Sample Adaptive Offset (SAO) filtering.
The first aspect also describes corresponding encoding and decoding methods and associated devices.
In a second aspect, it is proposed a method and corresponding device for encoding an image comprising a plurality of image parts, one or more image parts being predicted from one or more other image parts according to a first or a second prediction mode.
According to a third aspect of the present invention there is provided a method of signalling, in a bitstream, Sample Adaptive Offset (SAO) filtering parameters for use in performing SAO filtering on an image comprising a plurality of image parts, the image parts being groupable into groups of image parts using two or more different available groupings, the method comprising: determining which of said available groupings applies to an image part to be filtered; if the determined grouping is a predetermined one of the different available groupings, including in the bitstream an inferring-permitted syntax element, which indicates that it is permitted to infer the SAO parameters for performing SAO filtering on the image part to be filtered from the SAO parameters used for filtering another image part; and if the determined grouping is another one of the different available groupings, not including the syntax element in the bitstream.
Reference will now be made, by way of example, to the accompanying drawings, in which:
Figure 1 is a diagram for use in explaining a coding structure used in HEVC;
Figure 2 is a block diagram schematically illustrating a data communication system in which one or more embodiments of the invention may be implemented;
Figure 3 is a block diagram illustrating components of a processing device in which one or more embodiments of the invention may be implemented;
Figure 4 is a flow chart illustrating steps of an encoding method according to embodiments of the invention; Figure 5 is a flow chart illustrating steps of a loop filtering process in accordance with one or more embodiments of the invention;
Figure 6 is a flow chart illustrating steps of a decoding method according to embodiments of the invention;
Figures 7A and 7B are diagrams for use in explaining edge-type SAO filtering in HEVC;
Figure 8 is a diagram for use in explaining band-type SAO filtering in HEVC;
Figure 9 is a flow chart illustrating the steps of a process to decode SAO parameters according to the HEVC specifications;
Figure 10 is a flow chart illustrating in more detail one of the steps of the Figure 9 process;
Figure 11 is a flow chart illustrating how SAO filtering is performed on an image part according to the HEVC specifications; Figure 12 is a flow chart illustrating steps carried out by an encoder to determine SAO parameters for the CTUs of a group (frame or slice) in a CTU-level derivation of SAO parameters;
Figure 13 shows one of the steps of Figure 12 in more detail;
Figure 14 shows another one of the steps of Figure 12 in more detail;
Figure 15 shows yet another one of the steps of Figure 12 in more detail;
Figure 16 shows various different groupings 1201-1206 of CTUs in a slice;
Figure 17 is a diagram showing image parts of a frame in a derivation of SAO parameters in which a first method of sharing SAO parameters is used;
Figure 18 is a flowchart of an example of a process for setting SAO parameters in the derivation of Figure 17;
Figure 19 is a flowchart of an example of a process for setting SAO parameters in a derivation using the first sharing method to share SAO parameters among a column of CTUs;
Figure 20 is a flowchart of an example of a process for setting SAO parameters in a derivation using the first sharing method to share SAO parameters among a group of NxN CTUs;
Figure 21 is a diagram showing image parts of one NxN group in the derivation of Figure 20;
Figure 22 illustrates an example of how to select the SAO parameter derivation according to the sixth embodiment of the invention;
Figure 23 is a flow chart illustrating a decoding process suitable for a second method of sharing SAO parameters among image parts of a group; Figure 24 is a diagram showing image parts of multiple 2x2 groups;
Figure 25 is a flow chart illustrating an encoding process according to the first embodiment of the invention; Figure 26 is a flow chart illustrating a decoding process according to the third embodiment of the invention;
Figure 27 is a flow chart illustrating a decoding process according to the fourth embodiment of the invention; and
Figure 28 is a diagram showing a system comprising an encoder or a decoder and a communication network according to embodiments of the present invention,
Figure 1 relates to a coding structure used in the High Efficiency Video Coding (HEVC) video standard. A video sequence 1 is made up of a succession of digital images i. Each such digital image is represented by one or more matrices. The matrix coefficients represent pixels.
An image 2 of the sequence may be divided into slices 3. A slice may in some instances constitute an entire image. These slices are divided into non-overlapping Coding Tree Units (CTUs) 4. A Coding Tree Unit (CTU) is the basic processing unit of the High Efficiency Video Coding (HEVC) video standard and conceptually corresponds in structure to macroblock units that were used in several previous video standards. A CTU is also sometimes referred to as a Largest Coding Unit (LCU).
A CTU is generally of size 64 pixels x 64 pixels. Each CTU may in turn be iteratively divided into smaller variable-size Coding Units (CUs) 5 using a quadtree decomposition.
Coding units are the elementary coding elements and are constituted by two kinds of sub-unit called a Prediction Unit (PU) and a Transform Unit (TU). The maximum size of a PU or TU is equal to the CU size. A Prediction Unit corresponds to the partition of the CU for prediction of pixels values. Various different partitions of a CU into PUs are possible as shown by 6 including a partition into 4 square PUs and two different partitions into 2 rectangular PUs. A Transform Unit is an elementary unit that is subjected to spatial transformation using DCT. A CU can be partitioned into TUs based on a quadtree representation 7.
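The quadtree decomposition can be sketched as a simple recursion (illustrative only; shouldSplit() stands in for the encoder's actual rate-distortion mode decision):

```cpp
#include <functional>

// Minimal sketch of the quadtree decomposition of a CTU into CUs: a block
// is either kept as one coding unit or split into four half-size blocks,
// recursively, down to a minimum size.
struct Block { int x, y, size; };

void decomposeCtu(const Block& b, int minSize,
                  const std::function<bool(const Block&)>& shouldSplit,
                  const std::function<void(const Block&)>& emitCu)
{
    if (b.size > minSize && shouldSplit(b)) {
        const int h = b.size / 2;
        decomposeCtu({b.x,     b.y,     h}, minSize, shouldSplit, emitCu);
        decomposeCtu({b.x + h, b.y,     h}, minSize, shouldSplit, emitCu);
        decomposeCtu({b.x,     b.y + h, h}, minSize, shouldSplit, emitCu);
        decomposeCtu({b.x + h, b.y + h, h}, minSize, shouldSplit, emitCu);
    } else {
        emitCu(b);  // leaf: one coding unit
    }
}
```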
Each slice is embedded in one Network Abstraction Layer (NAL) unit. In addition, the coding parameters of the video sequence are stored in dedicated NAL units called parameter sets. In HEVC and H.264/AVC two kinds of parameter sets NAL units are employed: first, a Sequence Parameter Set (SPS) NAL unit that gathers all parameters that are unchanged during the whole video sequence. Typically, it handles the coding profile, the size of the video frames and other parameters. Secondly, a Picture Parameter Set (PPS) NAL unit includes parameters that may change from one image (or frame) to another of a sequence. HEVC also includes a Video Parameter Set (VPS) NAL unit which contains
parameters describing the overall structure of the bitstream. The VPS is a new type of parameter set defined in HEVC, and applies to all of the layers of a bitstream. A layer may contain multiple temporal sub-layers, and all version 1 bitstreams are restricted to a single layer. HEVC has certain layered extensions for scalability and multiview and these will enable multiple layers, with a backwards compatible version 1 base layer.
Figure 2 illustrates a data communication system in which one or more embodiments of the invention may be implemented. The data communication system comprises a transmission device, in this case a server 201, which is operable to transmit data packets of a data stream to a receiving device, in this case a client terminal 202, via a data communication network 200. The data communication network 200 may be a Wide Area Network (WAN) or a Local Area Network (LAN). Such a network may be for example a wireless network (WiFi / 802.11a or b or g), an Ethernet network, an Internet network or a mixed network composed of several different networks. In a particular embodiment of the invention the data
communication system may be a digital television broadcast system in which the server 201 sends the same data content to multiple clients.
The data stream 204 provided by the server 201 may be composed of multimedia data representing video and audio data. Audio and video data streams may, in some embodiments of the invention, be captured by the server 201 using a microphone and a camera
respectively. In some embodiments data streams may be stored on the server 201 or received by the server 201 from another data provider, or generated at the server 201. The server 201 is provided with an encoder for encoding video and audio streams in particular to provide a compressed bitstream for transmission that is a more compact representation of the data presented as input to the encoder.
In order to obtain a better ratio of the quality of transmitted data to quantity of transmitted data, the compression of the video data may be for example in accordance with the HEVC format or H.264/AVC format.
The client 202 receives the transmitted bitstream and decodes the reconstructed bitstream to reproduce video images on a display device and the audio data by a loud speaker.
Although a streaming scenario is considered in the example of Figure 2, it will be appreciated that in some embodiments of the invention the data communication between an encoder and a decoder may be performed using for example a media storage device such as an optical disc.
In one or more embodiments of the invention a video image is transmitted with data representative of compensation offsets for application to reconstructed pixels of the image to provide filtered pixels in a final image.
Figure 3 schematically illustrates a processing device 300 configured to implement at least one embodiment of the present invention. The processing device 300 may be a device such as a micro-computer, a workstation or a light portable device. The device 300 comprises a communication bus 313 connected to:
-a central processing unit 311, such as a microprocessor, denoted CPU;
-a read only memory 307, denoted ROM, for storing computer programs for implementing the invention;
-a random access memory 312, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method of encoding a sequence of digital images and/or the method of decoding a bitstream according to embodiments of the invention; and
-a communication interface 302 connected to a communication network 303 over which digital data to be processed are transmitted or received.
Optionally, the apparatus 300 may also include the following components:
-a data storage means 304 such as a hard disk, for storing computer programs for implementing methods of one or more embodiments of the invention and data used or produced during the implementation of one or more embodiments of the invention;
-a disk drive 305 for a disk 306, the disk drive being adapted to read data from the disk 306 or to write data onto said disk;
-a screen 309 for displaying data and/or serving as a graphical interface with the user, by means of a keyboard 310 or any other pointing means.
The apparatus 300 can be connected to various peripherals, such as for example a digital camera 320 or a microphone 308, each being connected to an input/output card (not shown) so as to supply multimedia data to the apparatus 300.
The communication bus provides communication and interoperability between the various elements included in the apparatus 300 or connected to it. The representation of the bus is not limiting and in particular the central processing unit is operable to communicate instructions to any element of the apparatus 300 directly or by means of another element of the apparatus 300.
The disk 306 can be replaced by any information medium such as for example a compact disk (CD-ROM), rewritable or not, a ZIP disk or a memory card and, in general terms, by an information storage means that can be read by a microcomputer or by a microprocessor, integrated or not into the apparatus, possibly removable and adapted to store one or more programs whose execution enables the method of encoding a sequence of digital images and/or the method of decoding a bitstream according to the invention to be implemented.
The executable code may be stored either in read only memory 307, on the hard disk 304 or on a removable digital medium such as for example a disk 306 as described previously. According to a variant, the executable code of the programs can be received by means of the communication network 303, via the interface 302, in order to be stored in one of the storage means of the apparatus 300 before being executed, such as the hard disk 304.
The central processing unit 311 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to the invention, instructions that are stored in one of the aforementioned storage means. On powering up, the program or programs that are stored in a non-volatile memory, for example on the hard disk 304 or in the read only memory 307, are transferred into the random access memory 312, which then contains the executable code of the program or programs, as well as registers for storing the variables and parameters necessary for implementing the invention.
In this embodiment, the apparatus is a programmable apparatus which uses software to implement the invention. However, alternatively, the present invention may be
implemented in hardware (for example, in the form of an Application Specific Integrated Circuit or ASIC).
Figure 4 illustrates a block diagram of an encoder according to at least one embodiment of the invention. The encoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 311 of device 300, at least one corresponding step of a method implementing at least one embodiment of encoding an image of a sequence of images according to one or more embodiments of the invention.
An original sequence of digital images i0 to in 401 is received as an input by the encoder 40. Each digital image is represented by a set of samples, known as pixels.
A bitstream 410 is output by the encoder 40 after implementation of the encoding process. The bitstream 410 comprises a plurality of encoding units or slices, each slice
comprising a slice header for transmitting encoding values of encoding parameters used to encode the slice and a slice body, comprising encoded video data.
The input digital images i0 to in 401 are divided into blocks of pixels by module 402. The blocks correspond to image portions and may be of variable sizes (e.g. 4x4, 8x8, 16x16, 32x32, 64x64 pixels). A coding mode is selected for each input block. Two families of coding modes are provided: coding modes based on spatial prediction coding (Intra prediction), and coding modes based on temporal prediction (Inter coding, Merge, SKIP).
The possible coding modes are tested.
Module 403 implements an Intra prediction process, in which the given block to be encoded is predicted by a predictor computed from pixels of the neighbourhood of said block to be encoded. An indication of the selected Intra predictor and the difference between the given block and its predictor is encoded to provide a residual if the Intra coding is selected.
Temporal prediction is implemented by motion estimation module 404 and motion compensation module 405. Firstly a reference image from among a set of reference images/pictures 416 is selected, and a portion of the reference image, also called reference area or image portion, which is the closest area to the given block to be encoded, is selected by the motion estimation module 404. Motion compensation module 405 then predicts the block to be encoded using the selected area. The difference between the selected reference area and the given block, also called a residual block, is computed by the motion
compensation module 405. The selected reference area is indicated by a motion vector.
Thus in both cases (spatial and temporal prediction), a residual is computed by subtracting the prediction from the original block.
In the INTRA prediction implemented by module 403, a prediction direction is encoded. In the temporal prediction, at least one motion vector is encoded.
Information relative to the motion vector and the residual block is encoded if the Inter prediction is selected. To further reduce the bitrate, assuming that motion is homogeneous, the motion vector is encoded by difference with respect to a motion vector predictor. Motion vector predictors of a set of motion information predictors are obtained from the motion vectors field 418 by a motion vector prediction and coding module 417.
The encoder 40 further comprises a selection module 406 for selection of the coding mode by applying an encoding cost criterion, such as a rate-distortion criterion. In order to further reduce redundancies a transform (such as DCT) is applied by transform module 407 to the residual block, the transformed data obtained is then quantized by quantization module
408 and entropy encoded by entropy encoding module 409. Finally, the encoded residual block of the current block being encoded is inserted into the bitstream 410.
The encoder 40 also performs decoding of the encoded image in order to produce a reference image for the motion estimation of the subsequent images. This enables the encoder and the decoder receiving the bitstream to have the same reference frames. The inverse quantization module 411 performs inverse quantization of the quantized data, followed by an inverse transform by inverse transform module 412. The intra prediction module 413 uses the prediction information to determine which predictor to use for a given block and the motion compensation module 414 actually adds the residual obtained by module 412 to the reference area obtained from the set of reference images 416.
Post filtering is then applied by module 415 to filter the reconstructed frame of pixels. In the embodiments of the invention an SAO loop filter is used in which compensation offsets are added to the pixel values of the reconstructed pixels of the reconstructed image
Figure 5 is a flow chart illustrating steps of a loop filtering process according to at least one embodiment of the invention. In an initial step 51, the encoder generates the
reconstruction of the full frame. Next, in step 52 a deblocking filter is applied on this first reconstruction in order to generate a deblocked reconstruction 53. The aim of the deblocking filter is to remove block artifacts generated by residual quantization and block motion compensation or block Intra prediction. These artifacts are visually important at low bitrates. The deblocking filter operates to smooth the block boundaries according to the characteristics of two neighboring blocks. The encoding mode of each block, the quantization parameters used for the residual coding, and the neighboring pixel differences in the boundary are taken into account. The same criterion/classification is applied for all frames and no additional data is transmitted. The deblocking filter improves the visual quality of the current frame by removing blocking artifacts and it also improves the motion estimation and motion compensation for subsequent frames. Indeed, high frequencies of the block artifact are removed, and so these high frequencies do not need to be compensated for with the texture residual of the following frames.
After the deblocking filter, the deblocked reconstruction is filtered by a sample adaptive offset (SAO) loop filter in step 54 using SAO parameters 58 determined in accordance with embodiments of the invention. The resulting frame 55 may then be filtered with an adaptive loop filter (ALF) in step 56 to generate the reconstructed frame 57 which will be displayed and used as a reference frame for the following Inter frames.
In step 54 each pixel of the frame region is classified into a class or group. The same offset value is added to every pixel value which belongs to a certain class or group.
The determination of the SAO parameters for the sample adaptive offset filtering will be explained in more detail hereafter with reference to Figures 10 and 11.
Figure 6 illustrates a block diagram of a decoder 60 which may be used to receive data from an encoder according to an embodiment of the invention. The decoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 311 of device 300, a corresponding step of a method implemented by the decoder 60.
The decoder 60 receives a bitstream 61 comprising encoding units, each one being composed of a header containing information on encoding parameters and a body containing the encoded video data. As explained with respect to Figure 4, the encoded video data is entropy encoded, and the motion vector predictors’ indexes are encoded, for a given block, on a predetermined number of bits. The received encoded video data is entropy decoded by module 62. The residual data are then dequantized by module 63 and then an inverse transform is applied by module 64 to obtain pixel values.
The mode data indicating the coding mode are also entropy decoded and based on the mode, an INTRA type decoding or an INTER type decoding is performed on the encoded blocks of image data.
In the case of INTRA mode, an INTRA predictor is determined by intra prediction module 65 based on the intra prediction mode specified in the bitstream.
If the mode is INTER, the motion prediction information is extracted from the bitstream so as to find the reference area used by the encoder. The motion prediction information is composed of the reference frame index and the motion vector residual. The motion vector predictor is added to the motion vector residual in order to obtain the motion vector by motion vector decoding module 70.
Motion vector decoding module 70 applies motion vector decoding for each current block encoded by motion prediction. Once an index of the motion vector predictor, for the current block has been obtained the actual value of the motion vector associated with the current block can be decoded and used to apply motion compensation by module 66. The reference image portion indicated by the decoded motion vector is extracted from a reference image/picture 68 to apply the motion compensation 66. The motion vector field data 71 is updated with the decoded motion vector in order to be used for the inverse prediction of subsequent decoded motion vectors.
Finally, a decoded block is obtained. Post filtering is applied by post filtering module 67 similarly to post filtering module 415 applied at the encoder as described with reference to Figure 4. A decoded video signal 69 is finally provided by the decoder 60.
The aim of SAO filtering is to improve the quality of the reconstructed frame by sending additional data in the bitstream in contrast to the deblocking filter where no information is transmitted. As mentioned above, each pixel is classified into a predetermined class or group and the same offset value is added to every pixel sample of the same class/group. One offset is encoded in the bitstream for each class. SAO loop filtering has two SAO types: an Edge Offset (EO) type and a Band Offset (BO) type. An example of Edge Offset type is schematically illustrated in Figures 7A and 7B, and an example of Band Offset type is schematically illustrated in Figure 8.
In HEVC, SAO filtering is applied CTU by CTU. In this case the parameters needed to perform the SAO filtering (set of SAO parameters) are selected for each CTU at the encoder side and the necessary parameters are decoded and/or derived for each CTU at the decoder side. This offers the possibility of easily encoding and decoding the video sequence by processing each CTU directly, without introducing delays in the processing of the whole frame. Moreover, when SAO filtering is enabled, only one SAO type is used: either the Edge Offset type filter or the Band Offset type filter, according to the related parameters transmitted in the bitstream for each classification. One of the SAO parameters in HEVC is an SAO type parameter sao_type_idx which indicates for the CTU whether EO type, BO type or no SAO filtering is selected for the CTU concerned.
The SAO parameters for a given CTU can be copied from the upper or left CTU, for example, instead of transmitting all the SAO data. One of the SAO parameters in HEVC is a sao_merge_up_flag, which when set indicates that the SAO parameters for the subject CTU should be copied from the upper CTU. Another of the SAO parameters in HEVC is a sao_merge_left_flag, which when set indicates that the SAO parameters for the subject CTU should be copied from the left CTU.
SAO filtering may be applied independently for different color components (e.g. YUV) of the frame. For example, one set of SAO parameters may be provided for the luma component Y and another set of SAO parameters may be provided for both chroma components U and V in common. Also, within the set of SAO parameters one or more SAO parameters may be used as common filtering parameters for two or more color components, while other SAO parameters are dedicated (per-component) filtering parameters for the color components. For example, in HEVC, the SAO type parameter sao_type_idx is common to U and V, and so is an EO class parameter which indicates a class for EO filtering (see below), whereas a BO class parameter which indicates a group of classes for BO filtering has dedicated (per-component) SAO parameters for U and V.
A description of the Edge Offset type in HEVC is now provided with reference to Figures 7A and 7B.
Edge Offset type involves determining an edge index for each pixel by comparing its pixel value to the values of two neighboring pixels. Moreover, these two neighboring pixels depend on a parameter which indicates the direction of these two neighboring pixels with respect to the current pixel. These directions are the 0-degree (horizontal direction), 45-degree (diagonal direction), 90-degree (vertical direction) and 135-degree (second diagonal direction). These four directions are schematically illustrated in Figure 7A.
The table of Figure 7B gives the offset value to be applied to the pixel value of a particular pixel "C" according to the values of the two neighboring pixels Cn1 and Cn2 at the decoder side.
When the value of C is less than the two values of the neighboring pixels Cn1 and Cn2, the offset to be added to the pixel value of the pixel C is "+O1". When the pixel value of C is less than one pixel value of its neighboring pixels (either Cn1 or Cn2) and the pixel value of C is equal to the value of the other neighbor, the offset to be added to this pixel sample value is "+O2".
When the pixel value of C is greater than one of the pixel values of its neighbors (Cn1 or Cn2) and the pixel value of C is equal to the value of the other neighbor, the offset to be applied to this pixel sample is "-O3". When the value of C is greater than the two values of Cn1 and Cn2, the offset to be applied to this pixel sample is "-O4".
When none of the above conditions is met for the current sample and its neighbors, no offset value is added to the current pixel C, as depicted by the Edge Index value "2" of the table.
It is important to note that for the particular case of the Edge Offset type, the absolute value of each offset (O1, O2, O3, O4) is encoded in the bitstream. The sign to be applied to each offset depends on the edge index (or the Edge Index in the HEVC specifications) to which the current pixel belongs. According to the table represented in Figure 7B, for Edge Index 0 and for Edge Index 1 (O1, O2) a positive offset is applied. For Edge Index 3 and Edge Index 4 (O3, O4), a negative offset is applied to the current pixel.
In the HEVC specifications, the direction for the Edge Offset amongst the four directions of Figure 7A is specified in the bitstream by a "sao_eo_class_luma" field for the luma component and a "sao_eo_class_chroma" field for both chroma components U and V.
The SAO Edge Index corresponding to the index value is obtained by the following formula:

EdgeIndex = sign(C - Cn2) - sign(Cn1 - C) + 2

where the function sign(.) is defined by the following relationships:

sign(x) = 1, when x > 0
sign(x) = -1, when x < 0
sign(x) = 0, when x = 0.
In order to simplify the Edge Offset determination for each pixel, the difference between the pixel value of C and the pixel values of both its neighboring pixels Cn1 and Cn2 can be shared between the current pixel C and its neighbors. Indeed, when SAO Edge Offset filtering is applied using a raster scan order of the pixels of the current CTU or frame, the term sign(Cn1 - C) has already been computed for the previous pixel (to be precise, it was computed as C' - Cn2' at a time when the current pixel C' at that time was the present neighboring pixel Cn1 and the neighboring pixel Cn2' was what is now the current pixel C). As a consequence this sign(Cn1 - C) does not need to be computed again.
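Purely by way of illustration (the following sketch and its function names are ours, not extracted from the HEVC reference software), this edge classification may be written in C as follows:

```c
/* Minimal illustrative sketch of the SAO Edge Offset classification. */
static int sao_sign(int x)
{
    return (x > 0) - (x < 0);   /* 1 if x > 0, -1 if x < 0, 0 if x == 0 */
}

/* c is the current pixel value; cn1 and cn2 are its two neighbours along
 * the direction given by the EO class (0, 45, 90 or 135 degrees). */
static int sao_edge_index(int c, int cn1, int cn2)
{
    /* EdgeIndex = sign(C - Cn2) - sign(Cn1 - C) + 2, in the range 0..4;
     * index 2 means "no offset". In a raster-scan loop, sao_sign(cn1 - c)
     * can be reused from the previous pixel, as described above. */
    return sao_sign(c - cn2) - sao_sign(cn1 - c) + 2;
}
```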
A description of the Band Offset type is now provided with reference to Figure 8. Band Offset type in SAO also depends on the pixel value of the sample to be processed. A class in SAO Band offset is defined as a range of pixel values. Conventionally, for all pixels within a range, the same offset is added to the pixel value. In the HEVC specifications, the number of offsets for the Band Offset filter is four for each reconstructed block or frame area of pixels (CTU), as schematically illustrated in Figure 8.
One implementation of SAO Band Offset splits the full range of pixel values into 32 ranges of the same size. These 32 ranges are the classes of SAO Band Offset. The minimum value of the range of pixel values is systematically 0 and the maximum value depends on the bit depth of the pixel values according to the relationship Max = 2^Bitdepth - 1. Classifying the pixels into the 32 ranges of the full interval requires only a 5-bit check, allowing a fast implementation: only the first 5 bits (the 5 most significant bits) are checked to classify a pixel into one of the 32 classes/ranges of the full range.
For example, when the bitdepth is 8 bits per pixel, the maximum value of a pixel can be 255. Hence, the range of pixel values is between 0 and 255. For this bitdepth of 8 bits, each class contains 8 pixel values.
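A minimal sketch of this classification, with illustrative function names of our own, is given below; with an 8-bit pixel p the band index is simply p >> 3:

```c
/* Minimal illustrative sketch of the SAO Band Offset classification:
 * the band index of a pixel is its 5 most significant bits, and only
 * the 4 consecutive bands starting at sao_band_position are filtered. */
static int sao_band_index(int pixel, int bit_depth)
{
    return pixel >> (bit_depth - 5);     /* 0..31; e.g. 8-bit: p >> 3 */
}

static int sao_apply_band_offset(int pixel, int bit_depth,
                                 int sao_band_position, const int offsets[4])
{
    int band = sao_band_index(pixel, bit_depth);
    if (band >= sao_band_position && band < sao_band_position + 4)
        pixel += offsets[band - sao_band_position];
    return pixel;   /* clipping to [0, (1 << bit_depth) - 1] omitted */
}
```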
In conventional Band Offset type filtering, the start of the band, represented by the grey area (40), that contains four ranges or classes, is signaled in the bitstream to identify the position of the first class of pixels or the first range of pixel values. The syntax element representative of this position is the "sao_band_position" field in the HEVC specifications. This corresponds to the start of class 41 in Figure 8. According to the HEVC specifications, 4 consecutive classes (41, 42, 43 and 44) of pixel values are used and 4 corresponding offsets are signaled in the bitstream.
Figure 9 is a flow chart illustrating the steps of a process to decode SAO parameters according to the HEVC specifications. The process of Figure 9 is applied for each CTU to generate a set of SAO parameters for all components. In order to avoid encoding one set of SAO parameters per CTU (which is very costly), a predictive scheme is used for the CTU mode. This predictive mode involves checking if the CTU on the left of the current CTU uses the same SAO parameters (this is specified in the bitstream through a flag named "sao_merge_left_flag"). If not, a second check is performed with the CTU above the current CTU (this is specified in the bitstream through a flag named "sao_merge_up_flag"). This predictive technique enables the amount of data representing the SAO parameters for the CTU mode to be reduced. The steps of the process are set out below.
In step 503, the "sao_merge_left_flag" is read from the bitstream 502 and decoded. If its value is true, then the process proceeds to step 504 where the SAO parameters of the left CTU are copied for the current CTU. This enables the types for YUV of the SAO filter for the current CTU to be determined in step 508.
If the outcome is negative in step 503 then the "sao_merge_up_flag" is read from the bitstream and decoded. If its value is true, then the process proceeds to step 505 where the SAO parameters of the above CTU are copied for the current CTU. This enables the types of the SAO filter for the current CTU to be determined in step 508.
If the outcome is negative in step 505, then the SAO parameters for the current CTU are read and decoded from the bitstream in step 507, for the Luma Y component and, as regards the type, for both U and V components (501) (551). The offsets for Chroma are independent.
The details of this step are described later with reference to Figure 10. After this step, the parameters are obtained and the type of SAO filter is determined in step 508.
In subsequent step 511 a check is performed to determine whether the three colour components (Y and U & V) for the current CTU have been processed. If the outcome is positive, the determination of the SAO parameters for the three components is complete and the next CTU can be processed in step 510. Otherwise (only Y was processed), U and V are processed together and the process restarts from initial step 512 previously described.
Figure 10 is a flow chart illustrating steps of a process of parsing the SAO parameters in the bitstream 601 at the decoder side. In initial step 602, the "sao_type_idx_X" syntax element is read and decoded. The code word representing this syntax element can use a fixed length code or could use any method of arithmetic coding. The syntax element sao_type_idx_X enables the type of SAO applied for the frame area to be determined for the colour component Y or for both Chroma components U & V. For example, for a YUV 4:2:0 sequence, two components are considered: one for Y, and one for U and V. The "sao_type_idx_X" can take 3 values as follows, depending on the SAO type encoded in the bitstream: "0" corresponds to no SAO, "1" corresponds to the Band Offset case illustrated in Figure 8 and "2" corresponds to the Edge Offset type filter illustrated in Figures 7A and 7B.
In the same step 602, a test is performed to determine if "sao_type_idx_X" is strictly positive. If "sao_type_idx_X" is equal to "0", this signifies that there is no SAO for this frame area (CTU) for Y if X is set equal to Y, and that there is no SAO for this frame area for U and V if X is set equal to U and V. The determination of the SAO parameters is then complete and the process proceeds to step 608. Otherwise, if "sao_type_idx_X" is strictly positive, this signifies that SAO parameters exist for this CTU in the bitstream.
Then the process proceeds to step 606 where a loop is performed for four iterations.
The four iterations are carried out in step 607 where the absolute value of offset j is read and decoded from the bitstream. These four offsets correspond either to the four absolute values of the offsets (O1, O2, O3, O4) of the four Edge indexes of SAO Edge Offset (see Figure 7B) or to the four absolute values of the offsets related to the four ranges of SAO Band Offset (see Figure 8).
Note that for the coding of an SAO offset, a first part is transmitted in the bitstream corresponding to the absolute value of the offset. This absolute value is coded with a unary code. The maximum value for an absolute value is given by the following formula:
MAX_abs_SAO_offset_value = (1 << (Min(bitDepth, 10) - 5)) - 1

where << is the left (bit) shift operator.
This formula means that the maximum absolute value of an offset is 7 for a pixel value bitdepth of 8 bits, and 31 for a pixel value bitdepth of 10 bits and beyond.
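Expressed as an illustrative C snippet (the function name is ours), this maximum can be computed as follows:

```c
/* Minimal sketch of the maximum SAO offset magnitude formula above. */
static int max_abs_sao_offset(int bit_depth)
{
    int b = bit_depth < 10 ? bit_depth : 10;   /* Min(bitDepth, 10) */
    return (1 << (b - 5)) - 1;                 /* 7 at 8 bits, 31 at 10 bits */
}
```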
The current HEVC standard amendment addressing extended bitdepth video sequences provides a similar formula for pixel values having a bitdepth of 12 bits and beyond.
The absolute value decoded may be a quantized value which is dequantized before it is applied to pixel values at the decoder for SAO filtering. An indication of whether or not this quantization is used is transmitted in the slice header.
For Edge Offset type, only the absolute value is transmitted because the sign can be inferred as explained previously.
For Band Offset type, the sign is signaled in the bitstream as a second part of the offset if the absolute value of the offset is not equal to 0. The sign bit is bypass-coded when CABAC is used.
After step 607, the process proceeds to step 603 where a test is performed to determine if the type of SAO corresponds to the Band Offset type (sao_type_idx_X equal to 1).
If the outcome is positive, the signs of the offsets for the Band Offset mode are decoded in steps 609 and 610, except for each offset that has a zero value, before the following step 604 is performed in order to read from the bitstream and decode the position "sao_band_position_X" of the SAO band as illustrated in Figure 8.
If the outcome is negative in step 603 ("sao_type_idx_X" is set equal to 2), this signifies that the Edge Offset type is used. Consequently, the Edge Offset class (corresponding to the direction 0, 45, 90 or 135 degrees) is extracted from the bitstream 601 in step 605. If X is equal to Y, the read syntax element is "sao_eo_class_luma" and if X is set equal to U and V, the read syntax element is "sao_eo_class_chroma".
When the four offsets have been decoded, the reading of the SAO parameters is complete and the process proceeds to step 608.
Figure 11 is a flow chart illustrating how SAO filtering is performed on an image part according to the HEVC specifications, for example during the step 907 in Figure 6. In HEVC, this image part is a CTU. The same process is also applied in the decoding loop (step 715 in Figure 4) at the encoder in order to produce the reference frames used for the motion estimation and compensation of the following frames. This process relates to the SAO filtering for one color component (thus the suffix "_X" in the syntax elements has been omitted below).
An initial step 701 comprises determining the SAO filtering parameters according to the processes depicted in Figures 9 and 10. The SAO filtering parameters are determined by the encoder and the encoded SAO parameters are included in the bitstream. Accordingly, on the decoder side in step 701 the decoder reads and decodes the parameters from the bitstream. Step 701 gives the sao_type_idx and, if it equals 1, the sao_band_position 702 and, if it equals 2, the sao_eo_class_luma or sao_eo_class_chroma (according to the colour component processed). It may be noted that if the element sao_type_idx is equal to 0 the SAO filtering is not applied. Step 701 also gives the table of the 4 offsets 703.
A variable i, used to successively consider each pixel Pi of the current block or frame area (CTU), is set to 0 in step 704. In step 706, pixel Pi is extracted from the frame area 705 (the current CTU in the HEVC standard) which contains N pixels. This pixel Pi is classified in step 707 according to the Edge Offset classification described with reference to Figures 7A & 7B or the Band Offset classification described with reference to Figure 8. The decision module 708 tests if Pi is in a class that is to be filtered using the conventional SAO filtering.
If Pi is in a filtered class, the related class number j is identified and the related offset value Offset_j is extracted in step 710 from the offsets table 703. In the case of the conventional SAO filtering, this Offset_j is then added to the pixel value Pi in step 711 in order to produce the filtered pixel value Pi' 712. This filtered pixel Pi' is inserted in step 713 into the filtered frame area 716. In embodiments of the invention, steps 710 and 711 are carried out differently, as will be explained later in the description of those embodiments.
If Pi is not in a class to be SAO filtered then Pi (709) is inserted in step 713 into the filtered frame area 716 without filtering.
After step 713, the variable i is incremented in step 714 in order to filter the subsequent pixels of the current frame area 705 (if any - test 715). After all the pixels have been processed (i>=N) in step 715, the filtered frame area 716 is reconstructed and can be added to the SAO reconstructed frame (see frame 908 of Figure 6 or 716 of Figure 4).
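For illustration purposes, the filtering loop of Figure 11 for one colour component may be sketched in C as follows (the names and the classify callback are illustrative assumptions, not the HEVC reference code):

```c
/* Minimal illustrative sketch of the per-CTU SAO filtering loop.
 * classify() stands for the Edge or Band classification above and
 * returns the class number j, or -1 when the pixel is not filtered. */
void sao_filter_frame_area(const int *src, int *dst, int n_pixels,
                           const int offsets[4],
                           int (*classify)(const int *src, int i))
{
    for (int i = 0; i < n_pixels; i++) {      /* steps 704, 714, 715 */
        int j = classify(src, i);             /* step 707 */
        if (j >= 0)
            dst[i] = src[i] + offsets[j];     /* steps 710, 711 */
        else
            dst[i] = src[i];                  /* unfiltered, step 709 */
    }
}
```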
As noted above, the JVET exploration model (JEM) for the future VVC standard uses all the HEVC tools. One of these tools is sample adaptive offset (SAO) filtering. However, SAO is less efficient in the JEM reference software than in the HEVC reference software. This arises from fewer evaluations and from signalling inefficiencies compared to other loop filters.
Figure 12 is a flow chart illustrating steps carried out by an encoder to determine SAO parameters for the CTUs of a group (frame or slice) at the CTU level. The process starts with a current CTU 1101. First the statistics for all possible SAO types and classes are accumulated in a variable CTUStats 1102. The process of step 1102 is described below with reference to Figure 13. According to the values set in the variable CTUStats, the RD cost for the SAO Merge Left is evaluated if the left CTU is in the current slice 1103, as is the RD cost of the SAO Merge Up (1104). Thanks to the statistics in CTUStats 1102, new SAO parameters are evaluated for Luma 1105 and for both Chroma components 1109 (both Chroma components because the Chroma components share the same SAO type in the HEVC standard). For each SAO type 1106, the best RD offsets and other parameters for Band Offset classification are obtained 1107. Steps 1107 and 1110 are explained below for Edge and Band classification with reference to Figure 14 and Figure 15 respectively. All RD costs are computed thanks to their respective SAO parameters (1108). In the same way, for both Chroma components, the optimal RD offsets and parameters are selected 1111. All these RD costs are compared in order to select the best SAO parameters set 1115. These RD costs are also compared in order to disable SAO independently for the Luma and the Chroma components 1113, 1114. The use of a new SAO parameters set 1115 is compared to the "merging" or sharing 1116 of the SAO parameters set from the left and up CTUs.
Figure 13 is a flow chart illustrating steps of an example of statistics computation at the encoder side that can be applied for the Edge Offset type filter, in the case of the conventional SAO filtering. A similar approach may also be used for the Band Offset type filter.
Figure 13 illustrates the setting of the variable CTUStats containing all the information needed to derive each best rate-distortion offset for each class. Moreover, it illustrates the selection of the best SAO parameters set for the current CTU. For each colour component Y, U, V (or RGB) 811, each SAO type is evaluated. For each SAO type 812, the variables Sum_j and SumNbPix_j are set to zero in an initial step 801. The current frame area 803 contains N pixels.
j is the current range number, used to determine the four offsets (related to the four edge indexes shown in Figure 7B for Edge Offset type or to the 32 ranges of pixel values shown in Figure 8 for Band Offset type). Sum_j is the sum of the differences between the pixels in the range j and their original pixels. SumNbPix_j is the number of pixels in the frame area whose pixel value belongs to the range j. In step 802, a variable i, used to successively consider each pixel Pi of the current frame area, is set to zero. Then, the first pixel of the frame area 803 is extracted in step 804.
In step 805, the class of the current pixel is determined by checking the conditions defined in Figure 7B. Then a test is performed in step 806: a check is performed to determine whether the class of the pixel value corresponds to the value "none of the above" of Figure 7B.
If the outcome is positive, then the value“i” is incremented in step 808 in order to consider the next pixels of the frame area 803.
Otherwise, if the outcome is negative in step 806, the next step is 807 where the related SumNbPix_j (i.e. the number of pixels for the class determined in step 805) is incremented and the difference between Pi and its original value Porg_i is added to Sum_j. In the next step 808, the variable i is incremented in order to consider the next pixels of the frame area 803.
Then a test is performed to determine if all pixels have been considered and classified. If the outcome is negative, the process loops back to step 804 described above. Otherwise, if the outcome is positive, the process proceeds to step 810 where the variable CTUStats for the current colour component X, the current SAO type and the current class j is set equal to Sum_j for the first value and SumNbPix_j for the second value. These variables can be used to compute, for example, the optimal offset parameter Offset_j of each class j. This offset Offset_j may be the average of the differences between the pixels of class j and their original values. Thus, Offset_j is given by the following formula:

Offset_j = Sum_j / SumNbPix_j

Note that the offset Offset_j is an integer value. As a consequence, the ratio defined in this formula may be rounded, either to the closest value or using the ceiling or floor function.
Each offset Offset_j is an optimal offset Oopt_j in terms of distortion.
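As an illustration, the statistics accumulation of Figure 13 and the derivation of the distortion-optimal offsets may be sketched in C as follows (the names and the classify callback are ours; the sign convention of the accumulated sums matches the J(Oj) cost formula given further below):

```c
/* Minimal illustrative sketch for one SAO type of one colour component.
 * NB_CLASSES is 32 here (Band Offset); use 4 for Edge Offset.
 * classify() returns the class j of a pixel, or -1 for "none of the above". */
#define NB_CLASSES 32

void sao_collect_stats(const int *rec, const int *org, int n_pixels,
                       int (*classify)(const int *rec, int i),
                       long sum[NB_CLASSES], long sum_nb_pix[NB_CLASSES],
                       int o_opt[NB_CLASSES])
{
    for (int j = 0; j < NB_CLASSES; j++)
        sum[j] = sum_nb_pix[j] = 0;          /* step 801 */

    for (int i = 0; i < n_pixels; i++) {     /* steps 802, 808, 809 */
        int j = classify(rec, i);            /* steps 805, 806 */
        if (j < 0)
            continue;
        sum[j] += org[i] - rec[i];           /* step 807 */
        sum_nb_pix[j]++;
    }

    /* Offset_j = Sum_j / SumNbPix_j (truncating division; rounding
     * to nearest, ceiling or floor are alternative choices). */
    for (int j = 0; j < NB_CLASSES; j++)
        o_opt[j] = sum_nb_pix[j] ? (int)(sum[j] / sum_nb_pix[j]) : 0;
}
```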
To evaluate an RD cost for a merge of SAO parameters, the encoder uses the statistics set in the table CTUStats. According to the following example for the SAO Merge Left, and considering the type for Luma, Left_Type_Y, and the four related offsets O_Left_0, O_Left_1, O_Left_2, O_Left_3, the distortion can be obtained by the following formula:

Distortion_Left_Y =
  (CTUStats[Y][Left_Type_Y][0][1] x O_Left_0 x O_Left_0 - CTUStats[Y][Left_Type_Y][0][0] x O_Left_0 x 2) >> Shift
+ (CTUStats[Y][Left_Type_Y][1][1] x O_Left_1 x O_Left_1 - CTUStats[Y][Left_Type_Y][1][0] x O_Left_1 x 2) >> Shift
+ (CTUStats[Y][Left_Type_Y][2][1] x O_Left_2 x O_Left_2 - CTUStats[Y][Left_Type_Y][2][0] x O_Left_2 x 2) >> Shift
+ (CTUStats[Y][Left_Type_Y][3][1] x O_Left_3 x O_Left_3 - CTUStats[Y][Left_Type_Y][3][0] x O_Left_3 x 2) >> Shift
The variable Shift is designed for a distortion adjustment. The distortion should be negative, as SAO is a post filter.
The same computation is applied for the Chroma components. The lambda of the rate-distortion cost is fixed for the three components. For SAO parameters merged with the left CTU, the rate is only one flag, which is CABAC coded.
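As an illustration, this merge-candidate distortion estimate may be sketched in C as follows (illustrative names; an arithmetic right shift is assumed for the >> Shift operation). The point of the technique is that the candidate is evaluated purely from the accumulated statistics, without re-filtering any samples:

```c
/* Minimal sketch: stats[j][0] holds Sum_j and stats[j][1] holds
 * SumNbPix_j for the merge candidate's SAO type; off[] holds the
 * candidate's four offsets. */
long sao_merge_distortion(const long stats[4][2], const int off[4], int shift)
{
    long d = 0;
    for (int j = 0; j < 4; j++)
        d += (stats[j][1] * off[j] * off[j]
              - stats[j][0] * off[j] * 2) >> shift;
    return d;   /* expected to be negative when the offsets help */
}
```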
The encoding process illustrated in Figure 14 is applied in order to find the best offset in terms of the rate-distortion criterion, the offset referred to as ORDj. This process is applied in steps 1109 to 1112.
In an initial step 901 of the encoding process of Figure 14, the rate-distortion value Jj is initialized to the maximum possible value. Then a loop on Oj from Oopt_j to 0 is applied in step 902. Note that Oj is modified by 1 at each new iteration of the loop. If Oopt_j is negative, the value Oj is incremented and if Oopt_j is positive, the value Oj is decremented. The rate-distortion cost related to Oj is computed in step 903 according to the following formula:

J(Oj) = SumNbPix_j x Oj x Oj - Sum_j x Oj x 2 + λ x R(Oj)

where λ is the Lagrange parameter and R(Oj) is a function which provides the number of bits needed for the code word associated with Oj.
The part 'SumNbPix_j x Oj x Oj - Sum_j x Oj x 2' of the formula gives the improvement in terms of distortion provided by the use of the offset Oj. If J(Oj) is less than Jj then Jj = J(Oj) and ORDj is set equal to Oj in step 904. If Oj is equal to 0 in step 905, the loop ends and the best ORDj for the class j is selected.
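As an illustration, the search of Figure 14 may be sketched in C as follows (illustrative names; rate_bits() stands for R(Oj) and lambda is the Lagrange parameter):

```c
/* Minimal sketch: scan from the distortion-optimal offset Oopt_j
 * towards 0 and keep the offset with the lowest cost J(Oj). */
int sao_best_rd_offset(long sum_j, long sum_nb_pix_j, int o_opt,
                       double lambda, int (*rate_bits)(int o))
{
    double best_cost = 1e300;             /* step 901: Jj = "infinity" */
    int best_o = 0;
    int step = (o_opt > 0) ? -1 : 1;      /* step 902: move towards 0 */

    for (int o = o_opt; ; o += step) {
        double cost = (double)sum_nb_pix_j * o * o     /* step 903 */
                    - 2.0 * (double)sum_j * o
                    + lambda * rate_bits(o);
        if (cost < best_cost) {           /* step 904 */
            best_cost = cost;
            best_o = o;
        }
        if (o == 0)                       /* step 905: end of loop */
            break;
    }
    return best_o;                        /* ORDj */
}
```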
This algorithm of Figures 13 and 14 provides a best ORDj for each class j. This algorithm is repeated for each of the four directions of Figure 7A. Then the direction that provides the best rate distortion cost (sum of Jj for each direction) is selected as the direction to be used for the current CTU.
This algorithm (Figures 13 and 14) for selecting the offset values at the encoder side for the Edge Offset tool can easily be applied to the Band Offset filter to select the best position (sao_band_position), with j in the interval [0,32[ instead of the interval [1,4[ in Figure 13. It involves changing the value 4 to 32 in modules 801, 810, 811. More specifically, for the 32 classes of Figure 8, the parameter Sum_j (j = [0,32[) is computed. This corresponds to computing, for each range j, the difference between the current pixel value (Pi) and its original value (Porg_i), each pixel of the image belonging to a single range j. Then the best offset in terms of rate distortion, ORDj, is computed for the 32 classes, with the same process as described in Figure 14.
The next step involves finding the best position of the SAO band of Figure 8. This is determined with the encoding process set out in Figure 15. The RD cost Jj for each range has been computed with the encoding process of Figure 14 with the optimal offset ORDj in terms of rate distortion. In Figure 15, in an initial step 1001 the rate-distortion value J is initialized to the maximum possible value. Then a loop over the 28 positions i of 4 consecutive classes is run in step 1002. Next, the variable Ji corresponding to the RD cost of the band (of 4 consecutive classes) is initialized to 0 in step 1003. Then the loop over the four consecutive classes is run in step 1004. Ji is incremented by the RD costs Jj of the four classes in step 1005 (j = i to i+3).
If this cost Ji is less than the best RD cost J, J is set to Ji and sao_band_position = i in step 1007, and the next step is step 1008.
Otherwise, the next step is step 1008.
Test 1008 checks whether or not the loop on the 28 positions has ended. If not, the process continues in step 1002, otherwise the encoding process returns the best band position as being the current value of sao_band_position 1009.
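As an illustration only, this band-position search may be sketched in C as follows (illustrative names; rd_cost[] holds the per-class costs Jj obtained as in Figure 14):

```c
/* Minimal sketch of the band-position search of Figure 15: among the
 * candidate start positions, select the window of 4 consecutive
 * classes with the lowest summed RD cost. */
int sao_best_band_position(const double rd_cost[32])
{
    double best_cost = 1e300;                        /* step 1001 */
    int best_pos = 0;

    for (int i = 0; i <= 32 - 4; i++) {              /* step 1002 */
        double ji = rd_cost[i] + rd_cost[i + 1]      /* steps 1003-1005 */
                  + rd_cost[i + 2] + rd_cost[i + 3];
        if (ji < best_cost) {                        /* steps 1006, 1007 */
            best_cost = ji;
            best_pos = i;
        }
    }
    return best_pos;                                 /* sao_band_position */
}
```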
Thus, the CTUStats table in the case of determining the SAO parameters at the CTU level is created by the process of Figure 12. This corresponds to evaluating the CTU level in terms of the rate-distortion compromise. The evaluation may be performed for the whole image or for just the current slice.
Figure 16 shows various different groupings 1201-1206 of CTUs in a slice.
A first grouping 1201 has individual CTUs. This first grouping requires one set of SAO parameters per CTU. It corresponds to the CTU-level previously mentioned.
A second grouping 1202 makes all CTUs of the entire image one group. Thus, in contrast to the CTU-level, all CTUs of the frame (and hence the slice which is either the entire frame or a part thereof) share the same SAO parameters.
To make all CTUs of the image share the same SAO parameters, one of two methods can be used. In both methods, the encoder first computes a set of SAO parameters to be shared by all CTUs of the image. Then, in the first method, these SAO parameters are set for the first CTU of the slice. For each remaining CTU from the second CTU to the last CTU of the slice, the sao_merge_left_flag is set equal to 1 if the flag exists (that is, if the current CTU has a left CTU). Otherwise, the sao_merge_up_flag is set equal to 1. Figure 17 shows an example of CTUs with SAO parameters set according to the first method. This method has the advantage that no signalling of the grouping to the decoder is required. Also, no changes to the decoder are required to introduce the groupings and only the encoder is changed. The groupings could therefore be introduced in an encoder based on HEVC without modifying the HEVC decoder. Surprisingly, the groupings do not increase the rate too much. This is because the merge flags are generally CABAC coded in the same context. Since for the second grouping (entire image) these flags all have the same value (1), the rate consumed by these flags is very low: they always have the same value, so their coding probability converges towards 1.
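As an illustration, the first method may be sketched in C as follows (the SaoCtuParam structure and the function name are hypothetical simplifications of ours):

```c
/* Minimal sketch: the frame-level SAO parameters are written for the
 * first CTU and every other CTU merges left when a left CTU exists,
 * otherwise up. */
typedef struct {
    int merge_left;
    int merge_up;
    /* ... SAO type, offsets, band position, EO class ... */
} SaoCtuParam;

void sao_share_frame_params(SaoCtuParam *ctu, int ctu_cols, int ctu_rows,
                            const SaoCtuParam *frame_params)
{
    for (int r = 0; r < ctu_rows; r++)
        for (int c = 0; c < ctu_cols; c++) {
            SaoCtuParam *p = &ctu[r * ctu_cols + c];
            p->merge_left = p->merge_up = 0;
            if (r == 0 && c == 0)
                *p = *frame_params;   /* first CTU carries the parameters */
            else if (c > 0)
                p->merge_left = 1;    /* a left CTU exists */
            else
                p->merge_up = 1;      /* first column: merge up instead */
        }
}
```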
In the second method of making all CTUs of the image share the same SAO parameters, the grouping is signalled to the decoder in the bitstream. The SAO parameters are also signalled as SAO parameters for the group (whole image), for example in the slice header. In this case, the signalling of the grouping consumes bandwidth. However, the merge flags can be dispensed with, saving the rate related to the merge flags, so that overall the rate is reduced.
The first and second groupings 1201 and 1202 provide very different rate-distortion compromises. The first grouping 1201 is at one extreme, giving very fine control of the SAO parameters (CTU by CTU), which should lower distortion, but at the expense of a lot of signalling. The second grouping is at the other extreme, giving very coarse control of the SAO parameters (one set for the whole image), which raises distortion but has very light signalling.
Next, a description will be given of how to determine in the encoder the SAO parameters for the second grouping 1202. In the second grouping 1202 the determination is done for a whole image and all CTUs of the slice/frame share the same SAO parameters.
Figure 18 is an example of the setting of SAO parameters at the frame/slice level using the first method of sharing SAO parameters (i.e. without new SAO classifications at the encoder side). This figure is based on Figure 17. At the beginning of the process, the CTUStats table is set for each CTU (in the same way as for the CTU-level encoding choice). This CTUStats can be used for the traditional CTU level 1302. Then the table FrameStats is set by adding together the values of the table CTUStats for all CTUs 1303. Then the same process as for the CTU level is applied to find the best SAO parameters 1305 to 1315. To set the SAO parameters for all CTUs of the frame, the SAO parameters set selected at step 1315 is set for the first CTU of the slice/frame. Then for each CTU from the second CTU to the last CTU of the slice/frame, the sao_merge_left_flag is set equal to 1 if it exists, otherwise the sao_merge_up_flag is set equal to 1 (indeed, for the second CTU to the last CTU a merge Left or Up or both exist) 1317. The syntax of the SAO parameters set is unchanged from that presented in Figure 9. At the end of the process the SAO parameters are set for the whole slice/frame.
Thus, the CTUStats table in the case of determining the SAO parameters for the whole image (frame level) is created by the process of Figure 18. This corresponds to evaluating the frame level in terms of the rate-distortion compromise.
The evaluations are then compared and the one with the best performance is selected.
The example of determining the SAO parameters in Figure 18 corresponds to the first method of sharing SAO parameters as it uses the merge flags to share the SAO parameters among all CTUs of the image (see steps 1316 and 1317). These steps can be omitted if a second method of sharing SAO parameters is used as described in further embodiments below.
Figure 19 is an example of the setting of SAO parameters sets for a third grouping 1203 at the encoder side. This figure is based on Figure 12. To reduce the number of steps in the figure, the modules 1105 to 1115 have been merged into one step 1405 in this Figure 19. At the beginning of the process, the CTUStats table is set for each CTU. This CTUStats can be used for the traditional CTU level 1302 encoding choice. For each column 1403 of the current slice/frame, the table ColumnStats is set by adding each value 1405 from CTUStats 1402, for each CTU of the current column 1404. Then the new SAO parameters are determined as for the CTU level 1406 encoding choice (cf. Figure 12). If it is not the first column, the RD cost of sharing the SAO parameters with the previous (left) column is also evaluated 1407, in the same way as the sharing of the SAO parameters set between left and up CTUs 1103, 1104 is evaluated. If the sharing of SAO parameters gives a better RD cost 1408 than the RD cost of the new SAO parameters set, the sao_merge_left_flag is set equal to 1 for the first CTU of the column. This CTU has the address number equal to the value "Column". Otherwise, the SAO parameters set for this first CTU of the column is set equal (1409) to the new SAO parameters obtained in step 1406.
For all other CTUs of the column 1411, their sao_merge_left_flag is set equal to 0 if it exists and their sao_merge_up_flag is set equal to 1. Then the SAO parameters set for the next column can be processed 1403. Please note that, except for the first line of CTUs, all other CTUs of the frame have the sao_merge_left_flag equal to 0 if it exists and the sao_merge_up_flag equal to 1. So, step 1412 can be processed once per frame.
The advantage of this CTU grouping is that it offers another RD compromise, between the CTU-level encoding choice and the frame level, which can be useful under some conditions. Also, in this example, merge flags are used within the group, which means that the third grouping can be introduced without modifying the decoder (i.e. the grouping can be HEVC-compliant). Of course, the second method of sharing SAO parameters described in the third embodiment can be used instead. In that case, merge flags are not used in the group (CTU column) and steps 1411 and 1412 are omitted.
In one variant, the merge between columns does not need to be checked. This means that steps 1407, 1408 and 1410 are removed from the process of Figure 19. The advantage of removing this possibility is a simplification of the implementation and the ability to parallelize the process. This has a small impact on coding efficiency.
Another possible compromise intermediate between the CTU level and the frame level can be offered by a fourth grouping 1204 in Figure 16, which makes a line of CTUs a group. To determine the SAO parameters for this fourth grouping, a process similar to that of Figure 19 can be applied. In that case, the variable ColumnStats is replaced by LineStats. Step 1403 is replaced by "For Line = 0 to Num_CTU_in_Height". Step 1404 is replaced by "For CTU_in_line = 0 to Num_CTU_in_Width". Step 1405 becomes LineStats[][][][] += CTUStats[Line x Num_CTU_in_Width + CTU_in_line][][][][]. The new SAO parameters and the merge with the up CTU are evaluated based on this LineStats table (steps 1406, 1407). Step 1410 is replaced by the setting of the sao_merge_up_flag to 1 for the first CTU of the line. And for all CTUs of the slice/frame except the first CTU of each line, the sao_merge_left_flag is set equal to 1.
The advantage of the line grouping is that it offers another RD compromise between the CTU level and the frame level. Please note that frames and slices are most of the time rectangles whose width is larger than their height, so the line CTU grouping 1204 is expected to be an RD compromise closer to the frame CTU grouping 1202 than the column CTU grouping 1203.
As for the other CTU groupings 1202 and 1203, the line CTU grouping can be HEVC compliant if the merge flags are used within the groups.
As for the column CTU grouping 1203, the evaluation of merging 2 lines can be removed.
Further RD compromises can be offered by putting two or more columns of CTUs or two or more lines of CTUs together as a group. The process of Figure 18 can be adapted to determine SAO parameters for such groups.
In one embodiment, the number N of columns or lines in a group may depend on the number of groups that are targeted.
The use of several columns or lines for the CTU groupings may be particularly advantageous when the slices or frames are large (for HD, 4K or beyond).
As described previously, in one variant, the merge between these groups containing two or more columns or two or more lines doesn’t need to be evaluated.
Another possible grouping includes split columns or split lines, where the split is tailored to the current slice/frame.
Another possible compromise between the CTU level and the frame level can be offered by the square CTU groupings 1205 and 1206 illustrated in Figure 16. The grouping 1205 makes 2x2 CTUs a group. The grouping 1206 makes 3x3 CTUs a group. Figure 20 shows an example of how to determine the SAO parameters for such groupings. For each NxN group 1503, the table NxNStats 1507 is set 1504, 1505, 1506 based on CTUStats. This table is used to determine the new SAO parameters 1508 and their RD cost, in addition to the RD cost of a Left 1510 sharing or Up 1509 sharing of SAO parameters. If the best RD cost is that of the new SAO parameters 1511, the SAO parameters of the first CTU (top-left CTU) of the NxN group are set equal to these new SAO parameters 1514. If the best RD cost is the sharing of SAO parameters with the up NxN group 1512, the sao_merge_up_flag of the first CTU (top-left CTU) of the NxN group is set equal to 1 and the sao_merge_left_flag to 0 1515. If the best RD cost is the sharing of SAO parameters with the left NxN group 1513, the sao_merge_left_flag of the first CTU (top-left CTU) of the NxN group is set equal to 1, 1516. Then the sao_merge_left_flag and sao_merge_up_flag are set accordingly for the other CTUs of the NxN group in order to form the SAO parameters for the current NxN group 1517. Figure 21 illustrates this setting for a 3x3 SAO group. The top-left CTU is set equal to the SAO parameters determined in steps 1508 to 1516. For the two other top CTUs, the sao_merge_left_flag is set equal to 1. As the sao_merge_left_flag is the first flag encoded or decoded, and as it is set to 1, there is no need to set the sao_merge_up_flag to 0. For the two other CTUs in the first column, the sao_merge_left_flag is set equal to 0 and the sao_merge_up_flag is set equal to 1. For the other CTUs, the sao_merge_left_flag is set equal to 1.
The advantage of the NxN CTU groupings is to create several RD compromises for SAO. As for the other groupings, these groupings can be HEVC compliant if the merge flags are used within the groups. As for the other groupings, the test of Merge Left and Merge Up between groups can be dispensed with in Figure 20. So steps 1509, 1510, 1512, 1513, 1515 and 1516 can be removed, especially when N is high.
In one variant, the value N depends on the size of the frame/slice. The advantage of this embodiment is to obtain an efficient RD compromise.
In a preferred variant, only N equal to 2 and 3 are evaluated. This offers an efficient compromise.
The possible groupings are in competition with one another for selection as the SAO parameter derivation of the current slice. An example of how to select the SAO parameter derivation using a rate-distortion compromise comparison is described below, according to a sixth embodiment of the invention, with reference to Figure 22.
Figure 23 is a flow chart illustrating a decoding process when the CTU grouping is signaled in the slice header according to the second method of sharing SAO parameters among the CTUs of the group. First the flag SaoEnabledFlag is extracted from the bitstream 1801. If SAO is not enabled, the next slice header syntax element is decoded 1807 and SAO will not be applied to the current slice. Otherwise the decoder extracts N bits from the slice header 1803. N depends on the number of available CTU groupings. Ideally, the number of CTU groupings should be equal to 2 to the power of N. The corresponding CTU grouping index 1804 is used to select the CTU grouping method 1805. This grouping method will be applied to extract the SAO syntax and to determine the SAO parameters set for each CTU 1806. Then the next slice header syntax element is decoded.
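As an illustration, this slice-header parsing may be sketched in C as follows (the BitReader type and the read_bits() helper are hypothetical, and entropy-coding details are omitted):

```c
/* Minimal sketch of the slice-header parsing of Figure 23: a 1-bit
 * enable flag followed, when SAO is enabled, by an N-bit CTU-grouping
 * index, with N chosen so that 2^N covers the available groupings. */
typedef struct BitReader BitReader;
int read_bits(BitReader *br, int n);          /* assumed helper */

int parse_sao_ctu_grouping(BitReader *br, int nb_groupings)
{
    if (!read_bits(br, 1))                    /* SaoEnabledFlag, step 1801 */
        return -1;                            /* SAO not applied, step 1807 */

    int n = 0;
    while ((1 << n) < nb_groupings)           /* N = ceil(log2(groupings)) */
        n++;
    return read_bits(br, n);                  /* grouping index, steps 1803-1804 */
}
```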
The advantage of signalling the CTU grouping in the slice header is its low impact on the bitrate.
But when the number of slices in a frame is significant, it may be desirable to reduce this signalling. So, in one variant, the CTU grouping index uses a unary max code in the slice header. In that case, the CTU groupings are ordered according to their probabilities of occurrence (highest to lowest).
For example, at least one SAO parameter derivation is an intermediate-level derivation (SAO parameters neither at the CTU level nor at the group level). When applied to a group, it causes the group (e.g. frame or slice) to be subdivided into subdivided parts (CTU groupings 1203-1206, e.g. columns of CTUs, lines of CTUs, NxN CTUs, etc.) and derives SAO parameters for each of the subdivided parts. Each subdivided part is made up of two or more said image parts (CTUs). The advantage of the intermediate-level derivation(s) is the introduction of one or more effective rate-distortion compromises. The intermediate-level derivation(s) can be used without the CTU-level derivation or without the frame-level derivation or without either of those two derivations.
Preferably, the smallest grouping is the first grouping 1201 in which each CTU is a group and there is one set of SAO parameters per CTU. However, a set of SAO parameters can be applied to a smaller block than the CTU. In this case, the derivation is not at the CTU level, the frame level or an intermediate level between the CTU and frame levels but at a sub-CTU level (a level smaller than an image part).
In this case, instead of signalling a grouping it is effective to signal an index representing a depth of the SAO parameters.
The table below shows one example of a possible indexing scheme:

Index    Depth of the SAO parameters
0        each CTU divided into 16 blocks (1/16 CTU)
1        each CTU divided into 4 blocks (1/4 CTU)
The index 0 means that each CTU is divided into 16 blocks and each may have its own SAO parameters. Index 1 means that each CTU is divided into 4 blocks, again each having its own SAO parameters.
The selected derivation is then signalled to the decoder in the bitstream. The signalling may comprise a depth syntax element (e.g. using the indexing scheme above).
In a variant, at least one derivation, when applied to a group, causes the group to be subdivided into subdivided parts and derives SAO parameters for each of the subdivided parts, and each image part is made up of two or more said subdivided parts.
In a variant, the first derivation, when applied to a group, causes the group to have SAO parameters at a first level, and the second derivation, when applied to a group, causes the group to have SAO parameters at a second level different from the first level. The levels may be any two levels from the frame level to a sub-CTU level. The levels may correspond to the groupings 1201-1206 in Figure 16.
Preferably, the SAO parameter derivation is signalled for a slice, which means that the derivation is used for all CTUs of the slice.
Also, when the selected level of the SAO parameters for a slice is an intermediate level between the slice level and the CTU level, a derivation may be selected per CTU group (e.g. each column of CTUs) of the slice or frame.
In Figure 24 the SAO merge flags are usable between groups of the CTU grouping. As depicted in Figure 24, for the 2x2 CTU grouping, the SAO Merge Left and SAO Merge Up are kept for each group of 2x2 CTUs. But they are removed for CTUs inside the group. Please note that only the sao_merge_left_flag is used for the grouping 1203 of a column of CTUs, and only the sao_merge_up_flag is used for the grouping 1204 of a line of CTUs.
In a variant, a flag signals if the current CTU group shares its SAO parameters or not. If it is true, a syntax element representing one of the previous groups is signalled. So each group of a slice can be predicted by a previous group except the first one. This improves the coding efficiency by adding several new possible predictors.
Referring back to a previous example, it was mentioned that a default set of SAO parameters can be used when the collocated CTU does not use SAO or when none of the collocated CTUs uses SAO. In a variant, the default set depends on the selected grouping. For example, a first default set may be associated with one grouping (or one level of SAO parameters) and a second default set may be associated with another grouping (or another level of SAO parameters). The size of the groups (or the level of the SAO parameters) is found to have an influence on which SAO parameters work efficiently as the default set. The different default sets may be determined by the encoder and transmitted to the decoder in the sequence parameter set. Then, the decoder uses the appropriate default set according to the grouping selected for the current slice.
In a variant, a depth of the SAO parameters is selected for a slice, including depths smaller than a CTU, making it possible to have a set of SAO parameters per block in a CTU.
Embodiments of the present invention described below are intended to improve the coding efficiency of SAO by using various techniques for determining one or more SAO parameters of an image part in a current image.
First group of embodiments
In the first group of embodiments, it is proposed to improve the use of the syntax elements enabling, for one image part (for instance a CTU), the inferring of SAO parameters from another image part (another CTU), namely the flags "sao_merge_up_flag" and "sao_merge_left_flag" described with reference to Figure 9.
First embodiment
In a first embodiment, a method is proposed of signalling, in a bitstream, a Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups. The method comprises:
determining whether an image part is of a predetermined group, and
if the image part is of the predetermined group, then including in the bitstream a first syntax element, for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part,
else not including the first syntax element.
An example of the first embodiment is illustrated in Figure 25, which is a flow chart illustrating the steps of a process that may be implemented in an encoder to encode SAO parameters according to the first embodiment. More precisely, this figure illustrates, as an example, the signalling of the inferring or not of SAO parameters provided for CTUs or groups of CTUs within a bitstream 4114, based on the CTU grouping.
Preferably, when there is no signalling for inferring the SAO parameters, for example, if the image part is not of the predetermined group, then SAO parameters for filtering the image part are included in the bitstream.
More precisely, the encoder first checks, in a test 4102, whether an information item, for example a grouping index 4101, is set equal to the predetermined level, for example the CTU level 4102. As a variant, the SAO merge flags (first syntax elements) are not included in the bitstream if said image part is of a group comprising at least two image parts. As another variant, if the CTU is of a group made up of
- 2x2 CTUs, or
- 3x3 CTUs, or
- a line of CTUs (or partial line), or
- a column of CTUs (or partial column), or
- all the CTUs of the image,
then the SAO parameters for filtering the image part are included in the bitstream.
Moreover, if the image part is of a group comprising partitioned image parts, then the first syntax element may be included in the bitstream. More precisely, if the image part is of a group comprising
- image parts partitioned into 16 portions, or
- image parts partitioned into 8 portions, or
- image parts partitioned into 4 portions,
then the first syntax element is included in the bitstream.
If the test result is false, then for the three components Y, U, V (4111), a new set of SAO parameters is inserted 4112 in the bitstream 4114. The steps 4111 and 4112 are repeated until the last CTU is processed.
If the grouping index is set equal to the CTU level value ("Yes" for the test 4102), and if the left CTU (meaning the CTU located at the left side of the processed CTU) exists (test 4103), the sao_merge_left_flag is inserted in a step 4104 in the bitstream 4114.
If the sao_merge_left_flag is equal to false (or the value '0') in test 4105 and if the up CTU (meaning the CTU located above the processed CTU) exists in a test 4109, the sao_merge_up_flag is inserted in a step 4107 in the bitstream 4114.
If the flag sao_merge_up_flag is set equal to false (or the value '0'), test 4108, a new SAO parameters set is inserted in the bitstream in steps 4111, 4112 and 4113.
In a variant, step 4112 may also comprise the insertion of a flag sao_merge_flags_enabled. This flag is a second syntax element, associated with the group of the image part, for signalling whether the use of the first syntax element(s) is enabled or disabled.
Second embodiment
In the second embodiment, the sao_merge_flags_enabled flag may be included in the bitstream based on a criterion (for instance the group the CTU belongs to, or the prediction or encoding mode which is used), for signalling whether the use of the SAO merge flags (for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part) is enabled or disabled.
These are new steps of the process that may be implemented in an encoder.
For example, the index or the first or second syntax elements, when included in the bitstream, are inserted at:
the image level, or
the sequence level, when the image is of a video sequence.

Third embodiment
Figure 25 is a flow chart illustrating the steps of a process to parse SAO parameters, as an alternative to the process illustrated in Figure 9, and in relation to the encoding steps implemented in an encoder described in Figure 22. The steps are preferably implemented in a decoder.
This third embodiment proposes a method of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups. The method comprises:
obtaining information indicating whether the group of an image part is one of predetermined groups, and
if the group of the image part is one of the predetermined groups, then obtaining a first syntax element signalling the inferring of SAO parameters for performing SAO filtering on said image part from SAO parameters used for filtering another image part,
else not obtaining the first syntax element.
Preferably, the process of Figure 25 is applied for each CTU to generate a set of SAO parameters for all components. Before decoding SAO parameters for a CTU or a group of CTUs (when there is a grouping of CTUs), it is tested in a test 4014 whether the information (for instance, the grouping index 4013 previously mentioned) is set equal to a value indicating a predetermined group or a set of predetermined groups, for example corresponding to the CTU level index. Said CTU level index may have been previously decoded from the header, for example. In that case, the sao_merge_left_flag (first syntax element) is extracted in a step 4003 from a bitstream 4002 and, if needed, the sao_merge_up_flag (first syntax element) is also extracted in a step 4005 from the bitstream 4002.
If the grouping index is not set equal to the CTU level index in the test 4014, the flags sao_merge_left_flag and sao_merge_up_flag are not considered and new SAO parameters are extracted in a step 4007 from the bitstream 4002.
The other steps 4004, 4006 and 4008-4012 are respectively similar to steps 504, 506, 508-512 in Figure 9 previously described.
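As an illustration, this conditional parsing may be sketched in C as follows (the types and helper functions are hypothetical; a plain bit read stands in for the CABAC coding of the flags, and the left/up CTU existence checks are omitted for brevity):

```c
/* Minimal sketch: the merge flags are only read when the grouping
 * index signals the CTU level; otherwise a new SAO parameter set
 * is read directly. */
typedef struct BitReader BitReader;
int read_bits(BitReader *br, int n);                       /* assumed helper */
typedef struct { int merge_left, merge_up; } SaoCtuParam;  /* simplified */
void read_new_sao_params(BitReader *br, SaoCtuParam *p);   /* assumed helper */

void parse_ctu_sao(BitReader *br, int grouping_idx, int ctu_level_idx,
                   SaoCtuParam *p)
{
    p->merge_left = p->merge_up = 0;
    if (grouping_idx == ctu_level_idx) {          /* test 4014 */
        p->merge_left = read_bits(br, 1);         /* step 4003 */
        if (!p->merge_left)
            p->merge_up = read_bits(br, 1);       /* step 4005 */
        if (p->merge_left || p->merge_up)
            return;                               /* parameters inferred */
    }
    read_new_sao_params(br, p);                   /* step 4007 */
}
```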
In one variant, the merge flags are kept for CTU level but removed for all other CTU groupings, as illustrated in Figure 25. The advantage is a flexibility of the CTU level.
In one variant, if said image part is of a group comprising at least two image parts, then the first syntax element(s) is (are) not obtained.
In another variant, the merge flags are used for a CTU when the SAO signalling is at a level lower than or equal to the CTU level (1/16 CTU or 1/4 CTU or 1/8 CTU) and removed for the other CTU groupings having larger groups.
The following table illustrates this embodiment:

SAO signalling level    SAO merge flags
1/16 CTU                used
1/8 CTU                 used
1/4 CTU                 used
CTU                     used
2x2 CTUs                not used
3x3 CTUs                not used
line of CTUs            not used
column of CTUs          not used
whole image             not used
In other words, if said image part is of a group made up of
- 2x2 image parts, or
- 3x3 image parts, or
- a line of image parts (or partial line), or
- a column of image parts (or partial column), or
- all the image parts of the image,
then the first syntax element(s) is (are) not included in the bitstream.
Fourth embodiment

Figure 26 is a flow chart illustrating a fourth embodiment, which is a variant of the third embodiment of Figure 25. Here a new test 4214 evaluates whether the value of a syntax element, the flag sao_merge_flags_enabled (second syntax element), is equal to true, in order to enable the decoding of the flags sao_merge_left_flag and sao_merge_up_flag tested in tests 4203 and 4205, instead of checking the value of the grouping index as illustrated in Figure 25.
In other words, the information is a second syntax element associated with the group of the image part, signalling whether the use of the first syntax elements (sao_merge_left_flag and sao_merge_up_flag) is enabled or disabled.
Fifth embodiment
In a fifth embodiment, a method is proposed of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups.
The method comprises:
obtaining a syntax element from the bitstream based on a predetermined criterion, the syntax element signalling whether the use of one or more other syntax elements (which signal that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part) is enabled or disabled.
For instance, the predetermined criterion is the fact that said image part is of a predetermined group or not.
In the third, fourth and fifth embodiments, when the steps are implemented in a decoder for decoding an image from a bitstream (respectively 4002 or 4202), said information or SAO parameters, or first or second syntax elements, are parsed from the bitstream.
As for the encoder side (Figure 25), said information or SAO parameters, or first or second syntax elements, when parsed from the bitstream, concern:
the image level, or
the sequence level, when the image is of a video sequence.
In the previous embodiments, a default set of SAO parameters can be used when the collocated CTU does not use SAO or when none of the collocated CTUs uses SAO. In a variant, the default set depends on the selected depth of the SAO parameters. For example, a first default set may be associated with one depth (for example 1/16) and a second default set may be associated with another depth (for example 1/4). The depth is found to have an influence on which SAO parameters work efficiently as the default set. The different default sets may be determined by the encoder and transmitted to the decoder in the sequence parameter set. Then, the decoder uses the appropriate default set according to the depth selected for the current slice.
In a variant, one possibility is to remove the SAO merge flags for all levels. This means that steps 503, 504, 505 and 506 of Figure 9 are removed. The advantage is that it significantly reduces the signalling of SAO and consequently reduces the bitrate. Moreover, it simplifies the design by removing two syntax elements at the CTU level.
The merge flags are important for small block sizes because an SAO parameters set is costly compared to the number of samples that it can improve. In that case, these syntax elements reduce the cost of the SAO parameters signalling. For large groups, the SAO parameters set is less costly, so the usage of the merge flags is not efficient. So the advantage of these embodiments is a coding efficiency increase.
Sixth embodiment

In the sixth embodiment, several SAO derivations are evaluated at the encoder and the corresponding SAO syntax at CTU level is inserted in the bitstream. Consequently, the decoder is not modified. One of the advantages is that this embodiment can be used with an HEVC compliant decoder.
All these different groupings, defined in the previous embodiments, can be compared at the encoder side to select the one which gives the best RD compromise for the current frame. Figure 22 illustrates this embodiment. More precisely, Figure 22 illustrates an example of how to select the SAO parameter derivation using a rate-distortion compromise comparison.
One possibility to increase the coding efficiency at the encoder side is to test all possible SAO groupings, but this would increase the encoding time compared to the example of Figure 22, where a small subset of groupings is evaluated.
The current slice/frame 1701 is used to fill the CTUStats table 1703 for each CTU 1702. This table 1703 is used to evaluate the CTU level 1704, the frame/slice grouping 1705, the column grouping 1706, the line grouping 1707, the 2x2 CTUs grouping 1708, the 3x3 CTUs grouping 1709, or any of the other CTU groupings described previously (in a non-limitative way). The best CTU grouping is selected according to the rate-distortion criterion computed for each grouping 1710. The SAO parameter sets for each CTU are set (1711) according to the grouping selected in step 1710. These SAO parameters 1712 are then used to apply the SAO filtering 1713 in order to obtain the filtered frame/slice. The SAO parameters for each CTU 1711 are then inserted into the bitstream as described in Figure 9.
The advantage of this embodiment is that it does not require any modification of HEVC SAO at the decoder side, so this method is HEVC compliant.
The main advantage is a coding efficiency increase. The second advantage is that this competition method does not require any additional SAO filtering or classification. Indeed, the main impacts on encoder complexity are step 1702, which needs SAO classification for all possible SAO types, and step 1713, which filters the samples. All other CTU grouping evaluations are only additions of values already obtained during the CTU-level encoding choice (stored in the CTUStats table).
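Step 1710 can be sketched as the selection of the lowest Lagrangian cost among the candidate groupings. The GroupingCandidate structure and the way distortion and rate are aggregated from CTUStats are assumptions for illustration.

```cpp
#include <vector>
#include <limits>
#include <cstddef>

// Per-grouping rate-distortion figures, assumed to be computed purely by
// summing values already stored in the CTUStats table (no extra
// classification or filtering pass is needed, as noted above).
struct GroupingCandidate {
    const char* name;   // "CTU", "Frame", "Column", "Line", "2x2", "3x3", ...
    double distortion;  // aggregated from CTUStats
    double rate;        // signalling cost of the SAO parameter sets
};

// Step 1710: keep the grouping with the lowest Lagrangian cost D + lambda*R.
std::size_t selectBestGrouping(const std::vector<GroupingCandidate>& candidates,
                               double lambda) {
    std::size_t best = 0;
    double bestCost = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const double cost = candidates[i].distortion + lambda * candidates[i].rate;
        if (cost < bestCost) { bestCost = cost; best = i; }
    }
    return best;  // index of the grouping used in step 1711
}
```

Only additions over the already-filled CTUStats table are needed to evaluate each candidate, consistent with the low encoder overhead noted above.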
Second group of embodiments
Seventh embodiment
Accordingly, in the seventh embodiment, the competition between the different permitted SAO parameters derivations is modified so that only one derivation is permitted in the encoder for any given slice or frame. The permitted derivation may be determined in dependence upon one or more characteristics of the slice or frame. For example, the permitted derivation may be selected based on the slice type (Intra, Inter P, Inter B), quantization level (QP) of the slice, or position in the hierarchy of a Group of Pictures (GOP).
The advantage of this embodiment is a complexity reduction. Instead of evaluating two or more competing derivations, just one derivation is selected, which can be useful for a hardware encoder.
Thus, in a variant, a first derivation is associated with first groups of the image (e.g. Intra slices) and a second derivation is associated with second groups of the image (e.g. Inter P slices). It is determined whether a group to be filtered is a first group or a second group. If it is determined that the group to be filtered is a first group, the first derivation is used to filter the image parts of the group, and if it is determined that the group to be filtered is a second group, the second derivation is used to filter the image parts of the group.
Evaluation of the two derivations is not required.
Whether a group to be filtered is determined to be a first group or a second group may depend on one or more of:
a slice type;
a frame type of the image to which the group to be filtered belongs;
a position in a quality hierarchy of a Group of Pictures of the image to which the group to be filtered belongs;
a quality of the image to which the group to be filtered belongs; and
a quantisation parameter applicable to the group to be filtered.
For example, when the first groups have a higher quality or higher position in the quality hierarchy than the second groups, the first derivation may have fewer image parts per group than the second derivation.
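For illustration only, the single-derivation rule could be written as a simple dispatch on the slice type and GOP hierarchy level. The enums, the grouping labels and the convention that level 0 is the highest quality layer are all assumptions.

```cpp
enum class SliceType { Intra, InterP, InterB };

// Illustrative grouping granularities; finer groupings carry more SAO
// parameter sets per frame and suit higher-quality pictures.
enum class SaoDerivation { CtuLevel, Grouping2x2, LineLevel, FrameLevel };

// Only one derivation is evaluated per slice, removing the competition.
SaoDerivation permittedDerivation(SliceType type, int gopHierarchyLevel) {
    if (type == SliceType::Intra)
        return SaoDerivation::CtuLevel;   // first groups: fine derivation
    if (gopHierarchyLevel == 0)           // highest quality layer (assumed convention)
        return SaoDerivation::Grouping2x2;
    return SaoDerivation::FrameLevel;     // second groups: coarse derivation
}
```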
In the preceding variant, a particular derivation of the SAO parameters is selected for a given slice or frame. However, if the encoder has the capacity to evaluate a limited number of competing derivations, it is unnecessary to eliminate the competition altogether. In another variant, the competition for a given slice or frame is still permitted, but the set of competing derivations is adapted to the slice or frame.
The set of competing derivations may depend on the slice type.
For Intra slices, the set preferably contains groupings with groups containing small numbers of CTUs (e.g. CTU level, 2x2 CTUs, 3x3 CTUs, and Column). Also, if depths lower than a CTU are available (as in the tenth embodiment), these depths are preferably also included.
For Inter slices, the set of derivations preferably contains groupings with groups containing large numbers of CTUs, such as Line or Frame level. However, smaller groupings can also be considered, down to the CTU level.
The advantage of this embodiment is a coding efficiency increase thanks to the use of derivations adapted for a slice or frame.
In one variant, the set of derivations can be different for an Inter B slice from that for an Inter P slice.
In another variant, the set of competing derivations depends on the characteristics of the frame in the GOP. This is especially beneficial for frames which vary in quality (QP) based on a quality hierarchy. For the frames with the highest quality or highest position in the hierarchy, the set of competing derivations should include groups containing few CTUs or even sub-CTU depths (same as for Intra slices above). For frames with a lower quality or lower position in the hierarchy, the set of competing derivations should include groups with more CTUs.
The set of competing derivations can be defined in the sequence parameter set.
Thus, in the seventh embodiment a first set of derivations is associated with first groups of the image (e.g. Intra slices) and a second set of derivations is associated with second groups of the image (e.g. Inter P slices). It is determined whether a group to be filtered is a first group or a second group. If it is determined that the group to be filtered is a first group, a derivation is selected from the first set of derivations and used to filter the image parts of the group, and if it is determined that the group to be filtered is a second group, a derivation is selected from the second set of derivations and used to filter the image parts of the group. Evaluation of derivations not in the associated set of derivations is not required.
Whether a group to be filtered is a first group or a second group may be determined as in the preceding variant. For example, when the first groups have a higher quality or higher position in the quality hierarchy than the second groups, the first set of derivations may have at least one derivation with fewer image parts per group than the derivations of the second set.
The set of CTU groupings can be defined in the sequence parameter set.
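A sketch of this adapted competition returns a set of derivations to evaluate rather than a single one. The set contents follow the Intra/Inter preferences stated above, but the enum values and exact membership are assumptions.

```cpp
#include <vector>

enum class SliceType { Intra, InterP, InterB };
enum class SaoDerivation { SubCtu, CtuLevel, Grouping2x2, Grouping3x3,
                           ColumnLevel, LineLevel, FrameLevel };

// Competing derivations adapted to the slice; only members of the returned
// set are then compared with the rate-distortion criterion.
std::vector<SaoDerivation> competingDerivations(SliceType type,
                                                bool subCtuDepthsAvailable) {
    if (type == SliceType::Intra) {
        std::vector<SaoDerivation> s = { SaoDerivation::CtuLevel,
                                         SaoDerivation::Grouping2x2,
                                         SaoDerivation::Grouping3x3,
                                         SaoDerivation::ColumnLevel };
        if (subCtuDepthsAvailable)        // as in the tenth embodiment
            s.push_back(SaoDerivation::SubCtu);
        return s;
    }
    // Inter slices: favour large groups, but the CTU level stays possible.
    return { SaoDerivation::FrameLevel, SaoDerivation::LineLevel,
             SaoDerivation::CtuLevel };
}
```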
In other words, the seventh embodiment proposes a method of encoding an image comprising a plurality of image parts. The method comprises:
predicting one or more image parts from one or more other image parts according to a first or a second prediction mode,
grouping the predicted image parts into one or more groups of a plurality of groups, according to the used prediction mode, and
performing sample adaptive offset (SAO) filtering on predicted image parts based on the grouping.
For example, an image part is predicted from another image part within said image, using an intra prediction mode, or from an image part within a reference image other than said image, using an inter prediction mode.
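In code form, the grouping step of this method might look like the following sketch; the PredMode tag and the two group containers are illustrative assumptions.

```cpp
#include <vector>

enum class PredMode { Intra, Inter };

struct ImagePart { PredMode mode; /* samples, position, ... */ };

struct Groups {
    std::vector<const ImagePart*> intraGroup;  // e.g. filtered with a fine derivation
    std::vector<const ImagePart*> interGroup;  // e.g. filtered with a coarse derivation
};

// Group predicted image parts by the prediction mode used, so that SAO
// filtering can then be performed per group as described above.
Groups groupByPredictionMode(const std::vector<ImagePart>& parts) {
    Groups g;
    for (const ImagePart& p : parts)
        (p.mode == PredMode::Intra ? g.intraGroup : g.interGroup).push_back(&p);
    return g;
}
```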
Figure 28 shows a system 191, 195 comprising at least one of an encoder 150 or a decoder 100, and a communication network 199, according to embodiments of the present invention. According to an embodiment, the system 195 is for processing and providing a content (for example, a video and audio content for displaying/outputting or streaming video/audio content) to a user, who has access to the decoder 100, for example through a user interface of a user terminal comprising the decoder 100 or a user terminal that is communicable with the decoder 100. Such a user terminal may be a computer, a mobile phone, a tablet or any other type of device capable of providing/displaying the (provided/streamed) content to the user. The system 195 obtains/receives a bitstream 101 (in the form of a continuous stream or a signal, e.g. while earlier video/audio are being displayed/output) via the communication network 199.

According to an embodiment, the system 191 is for processing a content and storing the processed content, for example a video and audio content processed for displaying/outputting/streaming at a later time. The system 191 obtains/receives a content comprising an original sequence of images 151, which is received and processed (including filtering with a deblocking filter according to the present invention) by the encoder 150, and the encoder 150 generates a bitstream 101 that is to be communicated to the decoder 100 via the communication network 199. The bitstream 101 is then communicated to the decoder 100 in a number of ways; for example, it may be generated in advance by the encoder 150 and stored as data in a storage apparatus in the communication network 199 (e.g. on a server or a cloud storage) until a user requests the content (i.e. the bitstream data) from the storage apparatus, at which point the data is communicated/streamed to the decoder 100 from the storage apparatus. The system 191 may also comprise a content providing apparatus for providing/streaming, to the user (e.g. by communicating data for a user interface to be displayed on a user terminal), content information for the content stored in the storage apparatus (e.g. the title of the content and other meta/storage location data for identifying, selecting and requesting the content), and for receiving and processing a user request for a content so that the requested content can be delivered/streamed from the storage apparatus to the user terminal. Alternatively, the encoder 150 generates the bitstream 101 and communicates/streams it directly to the decoder 100 as and when the user requests the content. The decoder 100 then receives the bitstream 101 (or a signal) and performs filtering with a deblocking filter according to the invention to obtain/generate a video signal 109 and/or audio signal, which is then used by a user terminal to provide the requested content to the user.
In the preceding embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit.
Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transient media, but are instead directed to non-transient, tangible storage media. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
Claims
1. A method of signalling in a bitstream, a Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups,
the method comprising:
determining whether an image part is of a predetermined group, and
if the image part is of the predetermined group, then including in the bitstream a first syntax element, for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part,
else not including the first syntax element.
2. The method of claim 1, wherein if the image part is not of the predetermined group, then including SAO parameters for filtering the image part in the bitstream.
3. The method of claim 1 or 2, wherein if said image part is of a group comprising one image part only, then the first syntax element is included in the bitstream.
4. The method of claim 1 or 2, wherein if said image part is of a group comprising at least two image parts, then the first syntax element is not included in the bitstream.
5. The method of claim 1 or 2, wherein the image is made up of lines and columns of image parts, and if said image part is of a group made up of
2*2 image parts, or
3*3 image parts, or
a line of image parts, or
a column of image parts, or
all the image parts of the image,
then not including the first syntax element in the bitstream.
6. The method of claim 1, 2 or 4, wherein if the image part is of a group comprising partitioned image parts, then the first syntax element is included in the bitstream.
7. The method of claim 6, wherein if the image part is of a group comprising image parts partitioned in 16 portions, or
image parts partitioned in 8 portions, or
image parts partitioned in 4 portions,
then the first syntax element is included in the bitstream.
8. The method of claim 1, further comprising including in the bitstream an index for indicating the group of the image part.
9. The method of claim 1, further comprising, including in the bitstream a second syntax element associated with the group of the image part, for signalling whether the use of the first syntax element is enabled or disabled.
10. The method of claim 1 or 8 or 9, wherein the index or the first or second syntax element, when included in the bitstream, is inserted at:
the image level, or
the sequence level, when the image is of a video sequence.
11. A method of encoding an image comprising signalling sample adaptive offset (SAO) filtering in a bitstream, using the method of any one of claims 1 to 10.
12. A method of signalling in a bitstream, a Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups,
the method comprising:
determining whether an image part satisfies a predetermined criterion, and including a syntax element in the bitstream based on said criterion, for signalling whether the use of one or more other syntax elements for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part or not, is enabled or disabled.
13. The method of claim 12, wherein the image part satisfies a predetermined criterion if said image part is of a predetermined group.
14. A method of encoding an image comprising signalling sample adaptive offset (SAO) filtering in a bitstream, using the method of any one of claims 12 to 13.
15. A method of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups,
the method comprising:
obtaining information indicating whether the group of an image part is of predetermined groups, and
if the group of the image part is of the predetermined groups, then obtaining a first syntax element signalling inferring SAO parameters for performing SAO filtering on said image part, from SAO parameters used for filtering another image part,
else not obtaining the first syntax element.
16. The method of claim 15, wherein if the group of the image part is not of the predetermined groups, then SAO parameters for filtering the image part are obtained.
17. The method of claim 15 or 16, wherein if said image part is of a group comprising one image part only, then the first syntax element is obtained.
18. The method of claim 15 or 16, wherein if said image part is of a group comprising at least two image parts, then the first syntax element is not obtained.
19. The method of claim 15 or 16, wherein the image is made up of lines and columns of image parts, and if said image part is of a group made up of
2*2 image parts, or
3*3 image parts, or
a line of image parts, or
a column of image parts, or
all the image parts of the image,
then the first syntax element is not obtained.
20. The method of claim 15 or 16, wherein if the image part is of a group comprising partitioned image parts, then the first syntax element is obtained.
21. The method of claim 20, wherein if the image part is of a group comprising image parts partitioned in 16 portions, or
image parts partitioned in 8 portions, or
image parts partitioned in 4 portions,
then the first syntax element is obtained.
22. The method of claim 15 or 16, wherein said information is an index for indicating the group of the image part.
23. The method of claim 15 or 16, wherein said information is a second syntax element associated with the group of the image part, signalling whether the use of the first syntax element is enabled or disabled.
24. A method of decoding from a bitstream, an image comprising performing sample adaptive offset (SAO) filtering using the method of any one of claims 15 to 23, wherein said information or SAO parameters, or first or second syntax elements are parsed from the bitstream.
25. The method of claim 24, wherein said information or SAO parameters, or first or second syntax elements when parsed from the bitstream, concerns:
the image level, or
the sequence level, when the image is of a video sequence.
26. A method of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups,
the method comprising:
obtaining a syntax element in the bitstream based on a predetermined criterion, for signalling whether the use of one or more other syntax elements for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part or not, is enabled or disabled.
27. The method of claim 26, wherein the predetermined criterion is the fact that said image part is of a predetermined group or not.
28. A method of decoding from a bitstream, an image comprising performing sample adaptive offset (SAO) filtering using the method of any one of claims 26 to 27, wherein said syntax elements or other syntax elements are parsed from the bitstream.
29. A method of applying a Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups,
the method comprising for an image part:
determining statistics about the image part,
evaluating based on the determined statistics of the image part, values of a predetermined criterion, when the image part is grouped according to at least two different groups,
selecting based on said values a best group for the image part,
filtering the image part by applying SAO parameters to the image part, based on the selected group, and
providing the filtered image part.
30. A method of encoding an image made up of image part, comprising a step for applying a Sample Adaptive Offset (SAO) filtering according to claim 29.
31. A method of encoding an image comprising a plurality of image parts, the method comprising
predicting one or more image parts from one or more other image parts according to a first or a second prediction mode,
grouping the predicted image parts into one or more groups of a plurality of groups, according to the used prediction mode, and
performing sample adaptive offset (SAO) filtering on predicted image parts based on the grouping.
32. The method of claim 31, wherein the image part is predicted from another image part within said image, using an intra prediction mode or from another image part within another reference image than said image, using an inter prediction mode.
33. A computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing each of the steps of the method according to any one of claims 1 to 32 when loaded into and executed by the programmable apparatus.
34. A non-transitory computer-readable storage medium storing instructions of a computer program for implementing each of the steps of the method according to any one of claims 1 to 32.
35. A device for signalling in a bitstream, a Sample Adaptive Offset (SAO) filtering on an image, according to any one of claims 1 to 32.
36. A device for encoding an image or a sequence of images according to claim 14 or any one of claims 30 to 32.
37. A device for performing sample adaptive offset (SAO) filtering on an image, according to any one of claims 26 to 27.
38. A device for applying a Sample Adaptive Offset (SAO) filtering on an image, according to claim 29.
39. A device for decoding an image or a sequence of images according to claim 28.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1809236.1A GB2574425A (en) | 2018-06-05 | 2018-06-05 | Video coding and decoding |
GB1809236.1 | 2018-06-05 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019234001A1 true WO2019234001A1 (en) | 2019-12-12 |
Family
ID=62975664
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2019/064458 WO2019234001A1 (en) | 2018-06-05 | 2019-06-04 | Video coding and decoding |
Country Status (3)
Country | Link |
---|---|
GB (1) | GB2574425A (en) |
TW (1) | TW202005370A (en) |
WO (1) | WO2019234001A1 (en) |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6176614B2 (en) * | 2012-05-25 | 2017-08-09 | サン パテント トラスト | Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding / decoding device |
WO2013177975A1 (en) * | 2012-05-29 | 2013-12-05 | Mediatek Inc. | Method and apparatus for coding of sample adaptive offset information |
TWI618404B (en) * | 2012-06-27 | 2018-03-11 | Sony Corp | Image processing device and method |
CN108235030B (en) * | 2012-07-16 | 2020-10-09 | 三星电子株式会社 | SAO encoding method and apparatus and SAO decoding method and apparatus |
US20150350650A1 (en) * | 2014-05-29 | 2015-12-03 | Apple Inc. | Efficient sao signaling |
US10623737B2 (en) * | 2016-10-04 | 2020-04-14 | Qualcomm Incorporated | Peak sample adaptive offset |
WO2018068263A1 (en) * | 2016-10-13 | 2018-04-19 | 富士通株式会社 | Image coding method and device, and image processing apparatus |
2018
- 2018-06-05 GB GB1809236.1A patent/GB2574425A/en not_active Withdrawn

2019
- 2019-05-28 TW TW108118389A patent/TW202005370A/en unknown
- 2019-06-04 WO PCT/EP2019/064458 patent/WO2019234001A1/en active Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130051455A1 (en) | 2011-08-24 | 2013-02-28 | Vivienne Sze | Flexible Region Based Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) |
US9769450B2 (en) | 2012-07-04 | 2017-09-19 | Intel Corporation | Inter-view filter parameters re-use for three dimensional video coding |
US20140192860A1 (en) | 2013-01-04 | 2014-07-10 | Canon Kabushiki Kaisha | Method, device, computer program, and information storage means for encoding or decoding a scalable video sequence |
US20140192869A1 (en) * | 2013-01-04 | 2014-07-10 | Canon Kabushiki Kaisha | Method, device, computer program, and information storage means for encoding or decoding a video sequence |
US20170223352A1 (en) * | 2014-07-31 | 2017-08-03 | Samsung Electronics Co., Ltd. | Video encoding method using in-loop filter parameter prediction and apparatus therefor, and video decoding method and apparatus therefor |
WO2018054286A1 (en) * | 2016-09-20 | 2018-03-29 | Mediatek Inc. | Methods and apparatuses of sample adaptive offset processing for video coding |
Non-Patent Citations (4)
Title |
---|
ESENLIK S ET AL: "Refinement for SAO and ALF syntax in the APS", 100. MPEG MEETING; 30-4-2012 - 4-5-2012; GENEVA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m24410, 28 April 2012 (2012-04-28), XP030052755 * |
LAROCHE (CANON) G ET AL: "Non-CE2: On SAO parameter signalling", no. JVET-K0201, 8 July 2018 (2018-07-08), XP030199094, Retrieved from the Internet <URL:http://phenix.int-evry.fr/jvet/doc_end_user/documents/11_Ljubljana/wg11/JVET-K0201-v2.zip JVET-K0201-v2.docx> [retrieved on 20180708] * |
WENGER S ET AL: "Adaptation Parameter Set (APS)", 97. MPEG MEETING; 18-7-2011 - 22-7-2011; TORINO; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m21385, 21 July 2011 (2011-07-21), XP030049948 * |
Y-W CHEN ET AL: "Description of SDR, HDR and 360° video coding technology proposal by Qualcomm and Technicolor "" low and high complexity versions", 10. JVET MEETING; 10-4-2018 - 20-4-2018; SAN DIEGO; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ); URL: HTTP://PHENIX.INT-EVRY.FR/JVET/,, no. JVET-J0021-v5, 14 April 2018 (2018-04-14), XP030151184 * |
Also Published As
Publication number | Publication date |
---|---|
GB201809236D0 (en) | 2018-07-25 |
GB2574425A (en) | 2019-12-11 |
TW202005370A (en) | 2020-01-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11601687B2 (en) | Method and device for providing compensation offsets for a set of reconstructed samples of an image | |
WO2020002117A2 (en) | Methods and devices for performing sample adaptive offset (sao) filtering | |
WO2019233997A1 (en) | Prediction of sao parameters | |
WO2019234000A1 (en) | Prediction of sao parameters | |
WO2019233999A1 (en) | Video coding and decoding | |
WO2019234001A1 (en) | Video coding and decoding | |
WO2019234002A1 (en) | Video coding and decoding | |
WO2019233998A1 (en) | Video coding and decoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19728942; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 19728942; Country of ref document: EP; Kind code of ref document: A1 |