US20130051472A1 - Quality Scalable Video Data Stream - Google Patents
- Publication number
- US20130051472A1 (U.S. application Ser. No. 12/523,308)
- Authority
- US
- United States
- Prior art keywords
- sub
- transform
- quality
- transformation coefficient
- scan
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/129—Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/34—Scalability techniques involving progressive bit-plane based encoding of the enhancement layer, e.g. fine granular scalability [FGS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/154—Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- the Joint Video Team (JVT) of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group (MPEG) is currently specifying a scalable extension of the H.264/MPEG4-AVC video coding standard.
- the key feature of the scalable video coding (SVC) in comparison to conventional single layer encoding is that various representations of a video source with different resolutions, frame rates and/or bit-rates are provided inside a single bit stream.
- a video representation with a specific spatio-temporal resolution and bit-rate can be extracted from a global SVC bit-stream by simple stream manipulations such as packet dropping.
- most components of H.264/MPEG4-AVC are used as specified in the standard.
- the temporal prediction structures of the SNR layers should be temporally aligned for an efficient use of the inter-layer prediction. It should be noted that all NAL units for a time instant form an access unit and thus have to follow each other inside an SVC bit-stream. The following three inter-layer prediction techniques are included in the SVC design.
- the first one is called inter-layer motion prediction.
- an additional macroblock mode has been introduced into SNR enhancement layers.
- the macroblock partitioning is obtained by copying the partitioning of the co-located macroblock in the base layer.
- the reference picture indices as well as the associated motion vectors are copied from the co-located base layer blocks.
- a motion vector of the base layer can be used as a motion vector predictor for the conventional macroblock modes.
- both modes are signaled via a single syntax element base_mode_flag on a macroblock level.
- When the base_mode_flag is equal to 1, inter-layer intra-prediction is chosen if the base layer macroblock is intra-coded. Otherwise, the macroblock mode as well as the reference indices and motion vectors are copied from the base layer macroblock.
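A minimal sketch of this base_mode_flag behaviour may help. The function and dictionary fields below are hypothetical illustrations, not SVC syntax elements:

```python
# Illustrative sketch of the base_mode_flag == 1 case described above:
# intra-coded base macroblocks trigger inter-layer intra-prediction, while
# inter-coded ones lend their mode, reference indices and motion vectors
# to the enhancement layer. Field names are invented for this sketch.

def predict_from_base_layer(base_mb):
    """Return the enhancement-layer prediction settings implied by
    base_mode_flag == 1 for the given co-located base layer macroblock."""
    if base_mb["intra_coded"]:
        # predict from the reconstructed base layer picture content
        return {"mode": "inter_layer_intra"}
    # otherwise copy the motion description of the base layer macroblock
    return {
        "mode": base_mb["mb_mode"],
        "ref_indices": list(base_mb["ref_indices"]),
        "motion_vectors": list(base_mb["motion_vectors"]),
    }

inter_mb = {"intra_coded": False, "mb_mode": "16x8",
            "ref_indices": [0, 1], "motion_vectors": [(3, -1), (0, 2)]}
print(predict_from_base_layer(inter_mb))
```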
- an apparatus for generating a quality-scalable video data stream may have: a coder for coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and a generator for forming, for each of a plurality of quality layers, a video sub-data stream having scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers has at least one possible scan position not included by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is included by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, having a contribution value per possible scan position of the sub-set of possible scan positions of the respective quality layer, such that the transform coefficient value of the one possible scan position is
- an apparatus for reconstructing a video signal from a quality-scalable video data stream having, for each of a plurality of quality layers, a video sub-data stream may have: a parser for parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions; a constructor for, using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and a reconstructor for reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein the parser is configured such that the transform coefficient information of more than one of the quality layers may have a contribution value relating to
- a method for generating a quality-scalable video data stream may have the steps of: coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and forming, for each of a plurality of quality layers, a video sub-data stream having scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers has at least one possible scan position not included by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is included by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, having a contribution value per possible scan position of the sub-set of possible scan positions of the respective quality layer such that the transform coefficient value of the one possible scan position is derivable based on
- a method for reconstructing a video signal from a quality-scalable video data stream having, for each of a plurality of quality layers, a video sub-data stream may have the steps of: parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions; using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein the parsing the video sub-data streams is performed such that the transform coefficient information of more than one of the quality layers may have a contribution value relating to one transformation coefficient value, and the
- a computer-program may have a program code for performing, when running on a computer, a method for generating a quality-scalable video data stream, the method having the steps of: coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and forming, for each of a plurality of quality layers, a video sub-data stream having scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers has at least one possible scan position not included by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is included by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, having a contribution value per possible scan position of the sub-set of possible scan positions of
- a computer-program may have a program code for performing, when running on a computer, a method for reconstructing a video signal from a quality-scalable video data stream having, for each of a plurality of quality layers, a video sub-data stream, the method having the steps of: parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions; using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein the parsing the video sub-data streams is performed such that the transform coefficient information of
- an apparatus for reconstructing a video signal from a quality-scalable video data stream comprising, for each of a plurality of quality layers, a video sub-data stream, comprises means for parsing the video sub-data streams of the plurality of quality layers, to obtain, for each quality layer, a scan range information and transform coefficient information on transformation coefficient values of different transform blocks, a predetermined scan order with possible scan positions being defined among the transformation coefficient values within the transform blocks so that in each transform block, for each possible scan position, at least one of the transformation coefficient values within the respective transform block belongs to the respective possible scan position, and the scan range information indicating a sub-set of the possible scan positions; means for, using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and means for reconstructing a picture of the video signal by a back-transformation of the transform blocks.
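The decoder-side behaviour described in the claims above can be sketched as follows. Each quality layer's sub-data stream carries a scan range and one contribution value per covered scan position; a position shared by more than one layer is recovered from the combined contributions, here simply by summation. All names and the summation rule are illustrative assumptions, not claim language:

```python
# Hedged sketch of layered coefficient reconstruction: accumulate each
# layer's contribution values into a linear sequence of transformation
# coefficient values ordered by scan position. A scan position covered by
# several layers receives the sum of the contributions (an assumed rule).

def assemble_coefficients(sub_streams, num_positions=64):
    """Accumulate per-layer contribution values into a linear coefficient
    sequence ordered by scan position."""
    coeffs = [0] * num_positions
    for layer in sub_streams:
        first, last = layer["scan_range"]          # inclusive sub-set bounds
        for pos, contribution in zip(range(first, last + 1), layer["values"]):
            coeffs[pos] += contribution            # shared positions add up
    return coeffs

# Example: layer 0 covers scan positions 0..3, layer 1 refines 3..6, so
# position 3 is the one included by more than one sub-set.
layers = [
    {"scan_range": (0, 3), "values": [9, 4, -2, 1]},
    {"scan_range": (3, 6), "values": [1, 3, 0, -1]},
]
print(assemble_coefficients(layers, num_positions=8))
```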
- FIG. 1 shows a block diagram of an encoder generating a quality-scalable video data stream according to an embodiment
- FIG. 2 shows a block diagram of a higher-layer hybrid coder of FIG. 1 according to an embodiment
- FIG. 3 shows a block diagram of a base-layer hybrid coder of FIG. 1 according to an embodiment
- FIG. 4 shows a block diagram of a layer coding unit of the higher quality layer of FIG. 1 according to an embodiment
- FIG. 5 shows a schematic diagram illustrating the structure of a picture as well as its bock-wise transformation according to an embodiment
- FIGS. 6 a - 6 g show schematic diagrams of a scanned portion of a transform block and its partitioning into sub-layers according to several embodiments
- FIG. 7 shows a schematic diagram illustrating the construction of sub-data streams according to an embodiment
- FIG. 8 shows a pseudo-code illustrating the coding of the transform coefficient levels belonging to a specific sub-data stream according to an embodiment
- FIG. 9 shows a pseudo-code illustrating another example for coding the transform coefficient levels belonging to a specific sub-data stream
- FIG. 10 shows a block diagram of a decoder according to another embodiment
- FIG. 11 shows a block diagram of an embodiment for the decoding unit of FIG. 10
- FIG. 1 shows an encoder for generating a quality-scalable bit-stream.
- the encoder 10 of FIG. 1 is dedicated for generating a scalable bit-stream supporting two different spatial layers and N+1 SNR layers.
- the encoder 10 is structured into a base layer part 12 and a spatial enhancement layer part 14 .
- a quality reduction unit 16 of encoder 10 receives the original or higher quality video 18 representing a sequence of pictures and reduces its quality—in the sense of spatial resolution in the example of FIG. 1 —to obtain a lower quality version 22 of the original video 18 consisting of a sequence of pictures 24 , the lower quality version 22 being input into the base layer part 12 .
- the quality reduction unit 16 performs, for example, a sub-sampling of the pictures by a sub-sampling factor of 2.
- FIG. 1 shows an example supporting two spatial layers 12 , 14
- the embodiment of FIG. 1 may readily be applied to applications where the quality reduction performed between the original video 18 and the lower quality video 22 does not comprise a sub-sampling, but for example, a reduction in the bit-depth of the representation of the pixel values, or the quality reduction unit simply copies the input signal to the output signal.
- the original video 18 is input into the higher quality part 14 , with both parts 12 , 14 performing a hybrid coding on the video respectively input.
- the base layer part 12 receives the lower quality video 22 and generates a base layer bit-stream 26 .
- the higher quality layer part 14 receives at its input the original video 18 and generates, besides a spatial enhancement layer bit-stream 28 , N SNR refinement layer bit-streams 30 .
- the generation and the interrelationship between bit-streams 28 and 26 will be described in more detail below.
- the base layer part 12 could also accompany the base layer bit-stream 26 with several SNR refinement layer bit-streams 32.
- All bit-streams 26 to 32 are input into a multiplexer 34 which generates a scalable bit-stream 36 from the data streams at its input, possibly arranged in packets, as will be described in more detail below.
- base layer part 12 comprises a hybrid coder 38 and a layer coding unit 40 connected in series, in the order mentioned, between the input to which the low quality video 24 is applied, on the one hand and the multiplexer 34 , on the other hand.
- the higher quality layer part 14 comprises a hybrid coder 42 and a layer coding unit 44 connected between the input to which the original video 18 is applied and the multiplexer 34.
- Each hybrid coder 38 and 42, respectively, codes its video input signal by hybrid coding, i.e. motion-compensated prediction is used along with a block-wise transformation of the prediction residual.
- each hybrid coder 38 and 42 respectively, outputs motion information data 46 and 48 , respectively, as well as residual data 50 and 52 , respectively, into the input of the subsequent layer coding unit 40 and 44 , respectively.
- hybrid coder 42 can choose between several interlayer prediction options. For example, the hybrid coder 42 can decide to use or adopt the base layer motion data 46 as the motion data 48 for the higher quality layer. Alternatively, the hybrid coder 42 may decide to use the base layer motion data 46 as predictor for the motion data 48 . As a further alternative, the hybrid coder 42 may code the motion data 48 completely anew, i.e. independent from the base layer motion data.
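The three motion coding options just listed can be sketched as a per-vector decision. The function, the toy bit-cost model, and the option names below are assumptions made for illustration; the real encoder decision is rate-distortion driven and signaled via dedicated syntax elements:

```python
# Sketch of the three inter-layer motion coding options described above:
# adopt the base layer motion vector outright, code only a refinement
# offset against it, or code the vector independently of the base layer.
# The choice is made here by a toy bit-cost model (sum of absolute
# components), which stands in for a real rate-distortion decision.

def code_motion_vector(mv, base_mv, bits=lambda v: abs(v[0]) + abs(v[1])):
    options = {
        "adopt_base": 0 if mv == base_mv else None,  # only valid if identical
        "predict_from_base": bits((mv[0] - base_mv[0], mv[1] - base_mv[1])),
        "code_anew": bits(mv),
    }
    valid = {k: cost for k, cost in options.items() if cost is not None}
    return min(valid, key=valid.get)

# A vector close to the base layer vector is cheapest to code as an offset.
print(code_motion_vector((5, 2), (4, 2)))
```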
- the hybrid coder 42 may code the residual data for the higher quality layer predictively as the prediction residual relative to the base layer residual data 50 as a predictor.
- the hybrid coder 42 may also use a reconstruction of the picture content of the base layer as a predictor for the picture content of the original video data so that in this case motion data and/or residual data 48 and 52 , respectively, merely code the residual relative to the reconstructed base layer data.
- the reconstructed base layer picture information may be received by the base layer hybrid coder 38 or a dedicated reconstruction unit 54 coupled between the base-layer coding unit 40 and a higher quality layer hybrid coder 42 .
- regarding layer coding unit 40, it is assumed in the following that it merely generates the base layer data-stream 26.
- the generation of the SNR refinement layer data-streams 32 is readily derivable from the following description of the layer coding unit 44.
- the base layer hybrid coder 38 comprises an input 56 for receiving the lower quality video signals 24 , an output 58 for the motion data 46 , an output 60 for the residual data 50 , an output 62 for coupling the motion data 58 to hybrid coder 42 , an output 64 for coupling reconstructed base layer picture data to hybrid coder 42 , and an output 66 for coupling residual data 50 to hybrid coder 42 .
- hybrid coder 38 comprises a transformation unit 68 , a back-transformation unit 70 , a subtractor 72 , an adder 74 , and a motion prediction unit 76 .
- the subtractor 72 and the transformation unit 68 are coupled, in the order mentioned, between the input 56 and the output 60 .
- the subtractor 72 subtracts the motion-predicted video content received from the motion prediction unit 76 from the input video signal and forwards the difference signal to transformation unit 68.
- the transformation unit 68 performs a block-wise transformation on the difference/residual signal along with, optionally, a quantization of the transform coefficients.
- the transformation result is output by the transformation unit 68 to output 60 as well as an input of back-transformation unit 70 .
- the back-transformation unit 70 performs an inverse transformation on the transform blocks of transformation coefficients with, possibly, a preceding dequantization.
- the result is a reconstructed residual signal which is, by adder 74 , additively combined with the motion-predicted video content output by motion prediction unit 76 .
- the result of the addition performed by adder 74 is a reconstructed video in base quality.
- the output of adder 74 is coupled to an input of motion prediction unit 76 as well as output 64 .
- the motion prediction unit 76 performs a motion-compensated prediction based on the reconstructed pictures in order to predict other pictures of the video input to input 56 .
- the motion prediction unit 76 produces, while performing motion-prediction, motion data including, for example, motion vectors and motion picture reference indices, and outputs this motion data to output 62 as well as output 58.
- the output of the transformation unit 68 is also coupled to the output 66 in order to forward the transform residual data to the hybrid coder 42 of the higher quality layer.
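The closed prediction loop formed by elements 68 to 76 can be sketched in a few lines. A trivial 1-D sample-wise "transform" (identity) and a uniform quantizer stand in for the real block transform and quantization, so this is a simplified model of the loop structure only, not of H.264 coding:

```python
# Minimal sketch of the hybrid coding loop described above: subtract the
# motion-compensated prediction (subtractor 72), transform and quantize the
# residual (unit 68), then dequantize and inverse-transform (unit 70) and
# add the prediction back (adder 74), so that the encoder's reference
# pictures match what a decoder will reconstruct. Identity transform and
# uniform quantizer are simplifying assumptions.

def encode_block(samples, prediction, qstep=2):
    residual = [s - p for s, p in zip(samples, prediction)]        # subtractor 72
    levels = [round(r / qstep) for r in residual]                  # transform + quantize (68)
    recon_residual = [l * qstep for l in levels]                   # dequantize + inverse (70)
    reconstruction = [r + p for r, p in zip(recon_residual, prediction)]  # adder 74
    return levels, reconstruction

levels, recon = encode_block([10, 22, 7, 5], [8, 18, 8, 4], qstep=2)
print(levels, recon)
```

Note that the reconstruction, not the original input, would feed the motion prediction unit, which is exactly why the loop contains the inverse stage.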
- the functionalities of hybrid coders 38 and 42 of FIG. 1 are similar to each other.
- the hybrid coder 42 of the higher quality layer also uses inter layer prediction.
- the structure of the hybrid coder 42 shown in FIG. 2 is similar to the structure of hybrid coder 38 shown in FIG. 3 .
- hybrid coder 42 comprises an input 86 for the original video signal 18 , an output 88 for the motion data 48 , an output 90 for the residual data 52 , and three inputs 92 , 94 and 96 for being coupled with the respective outputs 62 , 64 and 66 of base layer hybrid coder 38 .
- hybrid coder 42 comprises two switches or selectors 98 and 100 for connecting one of two paths 102 and 104 between input 86 and output 90 .
- path 104 comprises a subtractor 106 , a transformation unit 108 and a residual predictive coder 110 being connected, in the order mentioned, between input 86 and output 90 via switches 98 and 100 .
- Subtractor 106 and transformation unit 108 form, along with a back-transformation unit 112, an adder 114 and a motion prediction unit 116, a prediction loop such as that formed by elements 68 to 76 in hybrid coder 38 of FIG. 3. Accordingly, a transformed version of the motion-predicted residual results at the output of the transformation unit 108 and is input into residual predictive coder 110.
- the residual predictive coder 110 is also connected to the input 96 in order to receive the base layer residual data. By use of this base layer residual data as a predictor, the residual predictive coder 110 codes a part of the residual data output by transformation unit 108 as a prediction residual relative to the residual data at input 96 .
- the residual predictive coder 110 up-samples the base layer residual data and subtracts the upsampled residual data from the residual data output by transformation unit 108 .
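This up-sample-and-subtract operation can be sketched directly. The nearest-neighbour up-sampling filter used here is a placeholder assumption; the actual scheme may use a more refined interpolation filter:

```python
# Sketch of inter-layer residual prediction as described above: the base
# layer residual is up-sampled by a factor of 2 and subtracted from the
# enhancement layer residual, so that only the difference needs to be
# coded. Nearest-neighbour repetition stands in for the real up-sampling
# filter (a simplifying assumption).

def upsample2(row):
    return [v for v in row for _ in (0, 1)]   # repeat each sample twice

def residual_prediction(enh_residual, base_residual):
    predictor = upsample2(base_residual)
    return [e - p for e, p in zip(enh_residual, predictor)]

base = [3, -1]
enh = [4, 3, -1, -2]
print(residual_prediction(enh, base))   # only this difference is coded
```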
- the residual predictor coder 110 may perform the prediction only for a part of the residual data output by transformation unit 108 .
- Other parts of the residual data pass residual predictive coder 110 unchanged.
- the granularity of these parts may be macro blocks.
- the decision as to whether the residual data at input 96 is to be used as a predictor or not may be made on a macroblock basis, and the result of the decision may be indicated by a respective syntax element residual_prediction_flag.
- the hybrid coder 42 comprises a motion parameter predictive coder 118 which receives the motion data at input 92 from the base layer as well as the motion information obtained from motion prediction unit 116 and switches, on a macroblock basis, between passing the motion data from motion prediction unit 116 unchanged to output 88 and predictively coding this motion data by use of the motion information from the base layer at input 92 as a predictor.
- motion parameter predictive coder 118 may code motion vectors from motion prediction unit 116 as offset vectors relative to motion vectors contained in the base layer motion data at input 92 .
- motion parameter predictive coder 118 passes the base layer information from input 92 to motion prediction unit 116 to be used for the motion prediction in the higher quality layer.
- the motion parameter predictive coder 118 ignores the existence of the motion data at input 92 and codes the motion data from the motion prediction unit 116 directly to output 88 . The decision among these possibilities is coded into the resulting quality scalability bit-stream.
- the predictive coder 120 is provided in path 102 and coupled with input 94 .
- Predictive coder 120 predicts portions of the higher quality layer based on respective portions of the reconstructed base layer video signal so that at the output of predictive coder 120 merely the respective residual or difference is forwarded.
- Predictive coder 120 also operates on a macroblock-wise basis in cooperation with switches 98 and 100.
- the layer coding unit 44 of the higher quality layer comprises an input 122 for receiving the transform coefficients of residual data from output 90 and an input 124 for receiving the motion data from output 88 .
- a distributing unit 126 receives the transformation coefficients and distributes them to several enhancement layers. The transformation coefficients thus distributed are output to a formation unit 128 .
- the formation unit 128 receives the motion data from input 124 .
- the formation unit 128 combines both data and forms, based on these data inputs, the zero-order enhancement layer data stream 28 as well as refinement-layer data streams 30 .
- FIG. 5 represents a picture 140 .
- Picture 140 is, for example, part of the high quality video data 18 ( FIG. 1 ).
- the pixels are, for example, arranged in lines and columns.
- the picture 140 is, for example, partitioned into macroblocks 142 , which may also be arranged regularly in lines and columns.
- Each macroblock 142 may, for example, spatially cover a rectangular picture area in order to comprise, for example, 16 ⁇ 16 samples of, for example, the luma component of the picture.
- the macroblocks 142 may be organized in pairs of macroblocks.
- two vertically adjacent macroblocks 142 may form such a pair of macroblocks and may occupy, spatially, a macroblock pair region 144 of picture 140.
- the video 18 is assumed to contain two interleaved fields, a top and a bottom field, where the top field contains the even numbered rows of pixels, and the bottom field contains the odd numbered rows starting with the second line of the picture 140 .
- the top macroblock of region 144 relates to the pixel values of the top field lines within region 144 whereas the bottom macroblock of region 144 relates to the content of the remaining lines.
- both macroblocks spatially assume substantially the whole area of region 144 with a reduced vertical resolution.
- the top macroblock is defined to spatially encompass the upper half of the rows within region 144 whereas the bottom macroblock comprises the remaining picture samples in region 144 .
- the transformation unit 108 performs a block-wise transformation of the residual signal output by subtractor 106 .
- the block basis for the transformation within transformation unit 108 may differ from the macroblock size of the macroblocks 142 .
- each of the macroblocks 142 may be partitioned into four, i.e. 2 ⁇ 2, transform blocks 146 or 16, i.e. 4 ⁇ 4, transform blocks 148 .
- the transformation unit 108 would transform the macroblocks 142 of picture 140 block-wise in blocks of size of 4 ⁇ 4 pixel samples or 8 ⁇ 8 pixel samples.
- the transformation unit 108 outputs, for a certain macroblock 142, several transform blocks 146 and 148, respectively, namely sixteen 4×4 transform coefficient blocks 148 or four 8×8 transform coefficient blocks 146.
- each transform coefficient is assigned to and represented by a scan position number, these numbers ranging from 0 to 63.
- the respective transformation coefficients are associated with a different spatial frequency component.
- the frequency associated with a respective one of the transform coefficients increases in magnitude from the upper left-hand corner to the lower right-hand corner of the transform block 150.
- the scan order defined by the scan positions among the transform coefficients of transform block 150 scans the transform coefficients from the upper left hand corner in a zig-zag manner to the lower right-hand corner, this zig-zag scan being illustrated by arrows 154 .
- the scan among the transform coefficients may be differently defined among the transform coefficients of a transform block of a field-coded macroblock.
- the transform coefficient scan 158 scans the transform coefficients from the upper left-hand corner to the lower right-hand corner in a zig-zag manner with a reciprocating or zig-zag direction which is steeper than the 45° zig-zag direction used in case of the frame-coded macroblock at 150 .
- a coefficient scan 158 scans the transform coefficients in column direction twice as fast as in line direction in order to take into account the fact that field-coded macroblocks encompass picture samples having a column pitch twice the horizontal or line pitch.
- coefficient scan 158 scans the transform coefficients in a way so that the frequency increases as the position scan number increases.
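The two scans discussed above can be sketched as follows. The frame scan is the familiar 45° zig-zag; for the field scan, the weighted ordering below only approximates the "twice as fast vertically" property and is not the fixed scan table actually specified in H.264/AVC:

```python
# Sketch of the two coefficient scans described above. zigzag_scan walks
# the anti-diagonals of an n x n block with alternating direction (the
# frame-macroblock scan 154); field_scan_approx merely approximates the
# field scan 158 by making vertical steps cost half as much, so the scan
# advances down the columns twice as fast. The latter is an assumption for
# illustration, not the standardized field scan table.

def zigzag_scan(n=8):
    """Frame-macroblock zig-zag: (row, col) pairs in scan-position order."""
    order = []
    for d in range(2 * n - 1):
        cells = [(y, d - y) for y in range(n) if 0 <= d - y < n]
        order.extend(cells if d % 2 else reversed(cells))
    return order

def field_scan_approx(n=8):
    """Approximate field scan: horizontal steps weigh double, so vertical
    frequencies are visited roughly twice as fast."""
    return sorted(((y, x) for y in range(n) for x in range(n)),
                  key=lambda p: (p[0] + 2 * p[1], p[1]))

print(zigzag_scan(4)[:6])
```

Both functions return a permutation of all n² coefficient positions, which is exactly the property that lets the two-dimensional block be serialized into a linear sequence of scan positions.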
- the picture 140 may be subdivided, on a macroblock basis, into several slices 168.
- One such slice 168 is exemplarily shown in FIG. 5 .
- a slice 168 is a sequence of macroblocks 142 .
- the picture 140 may be split into one or several slices 168 .
- the functionality of the distributing unit 126 and the formation unit 128 is described in the following in more detail.
- the scan order defined among the transform coefficients enables the two-dimensionally arranged transform coefficients to be ordered into a linear sequence of transform coefficients with monotonically increasing frequency content.
- the distributing unit 126 operates to distribute the transform coefficients of several macroblocks 142 to different quality layers, i.e. to the zero-order layer associated with data stream 28 or to one of the refinement layers 30.
- the distributing unit 126 tries to distribute the transform coefficients to the data streams 28 and 30 in such a way that, with an increasing number of contributing layers from the zero-order layer 28 to the highest quality refinement layer 30, the SNR quality of the video reconstructable from the respective data streams increases. In general, this will lead to a distribution where the lower frequency transform coefficients corresponding to lower scan positions are distributed to lower quality layers whereas higher frequency transform coefficients are distributed to higher quality layers.
- distributing unit 126 will tend to distribute transform coefficients with higher transform coefficient values to lower quality layers and transform coefficients with lower transform coefficient values or energies to higher quality layers.
- the distribution formed by distributing unit 126 may be performed in such a way that each of the transform coefficients is distributed to one single layer.
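As a hypothetical sketch of the distribution just described (the coefficient values and the layer assignment below are invented, not taken from the figures): each scan position is assigned to exactly one quality layer, and every other layer carries a zero contribution at that position.

```python
# Hypothetical sketch of distributing unit 126: each scan position is
# assigned to exactly one quality layer; a layer's sub-data stream carries
# the coefficient value at its own positions and zero contributions
# elsewhere, so the layers sum back to the original transform coefficients.

def distribute(coeffs, layer_of):
    """coeffs: values in scan order; layer_of[i]: quality layer of position i."""
    n_layers = max(layer_of) + 1
    return [[c if layer_of[i] == lay else 0 for i, c in enumerate(coeffs)]
            for lay in range(n_layers)]

coeffs   = [122, -31, 7, 0, 2, 1, 0, -1]       # invented coefficient values
layer_of = [0, 0, 0, 1, 1, 1, 2, 2]            # low positions -> low layers
layers = distribute(coeffs, layer_of)
assert [sum(col) for col in zip(*layers)] == coeffs
```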
- the formation unit 128 uses the distribution resulting from distributing unit 126 in order to form respective sub-data streams 28 and 30 .
- the sub-data stream 28 forms the lowest quality layer refinement sub-data stream and contains, for example, the motion data input at input 124 .
- This zero order sub-data stream 28 may also be provided with a first distributed portion of the transform coefficient values.
- sub-data stream 28 allows for a refinement of the base-quality layer data stream 26 to a higher quality—in the instance of FIG. 1 to a higher spatial quality—but a further SNR quality enhancement may be obtained by accompanying the sub-data stream 28 with any of the further higher quality refinement sub-data streams 30 .
- the number of these refinement quality sub-data streams 30 is N, where N may be one or more than one.
- the transform coefficients are thereby—for example, in the order of increasing importance for the SNR quality—“distributed” to these sub-data streams 28 and 30 .
- FIG. 6 a shows an example for a distribution of the first 26 transform coefficient values of an 8×8 transform block.
- FIG. 6 a shows a table where the first line of the table lists the respective scan positions according to the scan order 154 and 158 , respectively ( FIG. 5 ). It can be seen that the scan positions shown extend, exemplarily from 0 to 25.
- the following three lines show the corresponding contribution values incorporated into the respective sub-data streams 28 and 30 , respectively, for the individual transform coefficient values.
- the second line corresponds, for example, to the zero order sub-data stream 28 , whereas the penultimate line belongs to the next higher refinement layer 30 and the last line refers to the second-next quality layer refinement data-stream.
- a “ 122 ” is coded into the sub-data stream 28 for the DC component, i.e. the transform coefficient value belonging to scan position 0.
- the contribution values for this transform coefficient having scan position 0 within the following two sub-data streams 30 are set to zero as indicated by the hashing of the respective table entries.
- the zero order enhancement layer sub-data stream 28 comprises a contribution value for each of the transform coefficient values.
- the transform coefficient values of scan positions 0 to 6, 8 and 9 belong to the zero order quality layer; the further contribution values are set to zero.
- the transform coefficient values belonging to the zero order quality layer may belong to other scan positions.
- the transform coefficient values of scan positions 7, 10 to 12, 15 to 18 and 21 belong to the next higher quality layer.
- the remaining transform coefficient values are set to zero.
- the remaining coefficient values of the remaining scan positions are included in the next higher quality layer sub-data stream.
- a certain transform coefficient value may actually be zero. In the example of FIG. 6 a , this is the case for scan position 23.
- the corresponding contribution values within the preceding quality layers are set to zero, and the transform coefficient value for scan position 23 in the last quality layer (last line) is zero itself.
- the contribution values included in the various quality layers sub-bit streams 28 and 30 sum up to the actual transform coefficient value so that, at decoder side, the actual transform block may be reconstructed by summing up the contribution values for the individual scan positions of the different quality layers.
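The decoder-side summation described above can be sketched as follows; the contribution values are invented for illustration.

```python
# Sketch of the decoder-side reconstruction: starting from the lowest
# quality layer, the contribution values of every received layer are summed
# per scan position; dropping the highest packets still yields a valid,
# merely coarser, transform block.

def accumulate(layer_contribs):
    """layer_contribs: per-layer contribution values in scan order."""
    recon = [0] * len(layer_contribs[0])
    stages = []
    for contribs in layer_contribs:            # zero order layer first
        recon = [r + c for r, c in zip(recon, contribs)]
        stages.append(list(recon))             # reconstruction after this layer
    return stages

contribs = [[122, 40, 0, 0],                   # invented contribution values
            [0, 0, 5, 0],
            [0, 0, 0, -2]]
assert accumulate(contribs)[-1] == [122, 40, 5, -2]
```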
- each of the sub-data streams 28 and 30 comprises a contribution value for all the transform coefficients and for all the scan positions, respectively.
- this is not necessarily the case.
- it may even be that the zero order sub-data stream 28 does not contain any transform coefficient or contribution values. In the latter case, the last three lines of the table of FIG. 6 a could be seen as belonging to the first refinement layer sub-data streams 30 , with the zero order sub-data stream 28 merely comprising the motion information from input 124 .
- in FIG. 6 a , contribution values having been set to zero and transform coefficient values actually being zero have been distinguished by use of hashed table entries merely for the sake of an easier understanding of the functionality of formation unit 128 .
- the sub-data streams 28 and 30 may be constructed such that the just-mentioned distinction between contribution values having been set to zero and contribution values naturally being zero is transparent to the decoder.
- the sum of the respective contribution values for a respective scan position, i.e. the numbers from the second to the fourth line below a respective scan position in the first line of FIG. 6 a , reveals the transform coefficient value independently of whether individual contribution values in the sum have been set to zero or are naturally zero.
- the formation unit 128 codes into a respective one of the sub-data streams 28 and 30 , respectively, a contribution value for each of the scan positions. This is not necessary.
- the consecutive quality layer sub-data streams comprise merely those transform coefficient values belonging to the respective quality layer.
- the order, in which the contribution values and transform coefficient values are coded into the sub-data streams 28 and 30 respectively, may vary in the embodiments of FIG. 6 a and FIG. 6 b , respectively.
- the sub-data streams 28 and 30 may be packetized data streams where each packet corresponds to one slice 168 .
- the transform coefficient values may be coded into the respective packets macroblock-wise. That is, a scan order may be defined among the macroblocks 142 within a slice 168 , with the transform coefficient values for a predetermined macroblock 142 being completely coded into the respective packet before the first transform coefficient value of the macroblock following in macroblock scan order is coded.
- a scan order may be defined among the respective transform blocks 146 and 148 , respectively, within the respective macroblock.
- the transform coefficient values may be coded into a respective one of the sub-data streams 28 and 30 , respectively by formation unit 128 such that the transform coefficient values of a respective one of the transform blocks are all coded into the respective sub-data stream before the first transform coefficient value of a next transform block is coded into the same.
- a coding of the transform coefficient values and contribution values, respectively may be conducted in a way explained in the following with respect to FIG. 8 or 9 .
- for different transform blocks, the set of scan positions belonging to this layer may be different.
- the distributing unit 126 distributes the transform coefficient values of the different transform blocks within a slice 168 such that for all transform blocks, the transform coefficient values of the same set of scan positions belongs to the same quality layer. For example, in FIG. 6 c the transform coefficient values of the scan positions from 0 to 11 belong to the zero order sub-data stream 28 with this being true for all transform blocks within slice 168 .
- the transform coefficient values belonging to a specific one of the quality layers extend over a continuous sequence of consecutive scan positions. This, however, need not be the case.
- transform coefficient values belonging to a scan position between the first and the last scan position belonging to a specific quality layer may belong to one of the other quality layers such as shown in FIG. 6 b .
- FIG. 6 d shows an embodiment where the distributing unit 126 has distributed the transform coefficients over the quality layers as it was shown with respect to FIG. 6 a .
- This distribution differs from transform block to transform block.
- each of the quality layers is assigned a specific portion of the scan positions in common for all transform blocks.
- the lowest quality layer is assigned the full set of scan positions from scan position 0 to scan position 63.
- the lowest quality layer comprises 64 contribution values.
- the next higher quality layer sub-data stream comprises contribution or transform coefficient values for all transform blocks in a specific scan position range which extends from scan position 6 to 63.
- the scan position range of the next quality layer extends from scan position 13 to 63.
- the decoder does not need to know as to whether a specific one of the contribution values is a contribution value that has been set to 0 (hashed entry) or is actually indicating a 0 transform coefficient value or insignificant transform coefficient value.
- the syntax element scan_idx_start indicates, for the respective slice 168 , from which scan position on the transform coefficient or contribution values contained in the respective sub-data stream are to be used.
- To be more precise, in the embodiment of FIG. 6 d , the sub-data stream corresponding to the penultimate line comprises, for an individual transform block, 58 transform coefficient or contribution values, the first of which, in case of the transform block of FIG. 6 d , is 0, while the second one is 22.
- by use of the syntax element scan_idx_start, it is known at the decoder side that the first transform coefficient value of the respective quality layer corresponds to scan position 6, while the remaining transform coefficient values of this quality layer refer to the following scan positions.
- FIG. 6 e shows an embodiment where a syntax element scan_idx_end indicates for the individual sub-data streams the last scan position up to which the respective quality layer sub-data stream comprises sub-coefficients or contribution values.
- A combination of the embodiments of FIGS. 6 d and 6 e is shown in FIG. 6 f .
- the respective set of scan positions belonging to a specific one of the quality layers extends from a first scan position indicated by the syntax element scan_idx_start to a last scan position indicated by the syntax element scan_idx_end.
- the respective set of scan position extends from scan position 6 to scan position 21.
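A sketch of how scan_idx_start and scan_idx_end could delimit a layer's packet, under the assumption that the packet simply carries one value per scan position inside the signaled range; the function names are hypothetical.

```python
# Sketch of the scan range signalling of FIG. 6 f: per quality layer, the
# slice header carries scan_idx_start and scan_idx_end, and the packet is
# assumed to hold exactly one value per scan position inside that range.

def pack(coeffs, scan_idx_start, scan_idx_end):
    return coeffs[scan_idx_start:scan_idx_end + 1]

def unpack(packet, scan_idx_start, scan_idx_end, block_size=64):
    block = [0] * block_size                   # positions outside the range
    block[scan_idx_start:scan_idx_end + 1] = packet
    return block                               # values re-anchored at start

coeffs = list(range(64))                       # dummy 8x8 block in scan order
packet = pack(coeffs, 6, 21)                   # the range used in the text
assert len(packet) == 16
assert unpack(packet, 6, 21)[6] == 6 and unpack(packet, 6, 21)[22] == 0
```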
- the embodiment of FIG. 6 g shows that the use of the syntax element scan_idx_start and/or scan_idx_end may be combined with the focus of the embodiment of FIG.
- the formation unit 128 is designed such that the individual sub-data streams 28 and 30 , respectively, are packetized, i.e. they comprise one or more packets.
- the formation unit 128 may be designed to generate a packet for each slice 168 within a picture 140 within each sub-bit stream 28 and 30 , respectively.
- a packet may comprise a slice header 170 on the one hand and residual data 172 on the other hand, except sub-bit stream 28 which optionally comprises merely the slice header within each one of the packets.
- residual data 172 , i.e. residual data # 1 , residual data # 2 , . . . , residual data #N
- compare FIGS. 6 a to 6 g , where, for example, the second to fourth lines in these tables correspond to residual data # 1 , residual data # 2 and residual data # 3 , respectively.
- residual data 172 indicated in FIG. 7 includes the transform coefficient values discussed in FIGS. 6 a to 6 g , the distribution of which among the respective sub-data streams 28 and 30 is not again described here.
- FIG. 7 shows further syntax elements contained in the slice header 170 and the residual data 172 which stem from hybrid coder 42 .
- the hybrid coder 42 switches, on a macroblock basis, between several inter-layer prediction modes so as to either rely on the motion information from the base layer, or generate new motion information for a respective motion block of the higher refinement layer, predictively coding this motion information as a residual to the motion information from the base layer, or coding this motion information anew.
- the residual data 172 may comprise, for each macroblock, syntax elements indicating motion parameters, macroblock modes such as field or frame coded, or an inferring mode indicating the reuse of the motion parameters of the base layer for the respective macroblock. This is especially true for the zero order sub-data stream 28 .
- the formation unit 128 is designed to omit these macroblock-wise syntax elements concerning macroblock modes, motion parameters and inferring mode indication from the residual data of the sub-data streams 30 1 to 30 N , or to set the syntax elements in these sub-data streams 30 1 to 30 N either equal to the macroblock modes and motion parameters for the respective macroblock contained in sub-data stream 28 , or to indicate the inferring mode for the respective macroblock in order to indicate that the same settings are to be used in the respective refinement layer.
- the slice header data 170 may comprise merely one of scan_idx_start and scan_idx_end.
- scan_idx_start and/or scan_idx_end may be provided once per transform block size category, i.e. 4×4 and 8×8, or just once for each slice/picture/sub-data stream commonly for all transform block size categories, with respective measures being taken to transfer scan_idx_start and scan_idx_end to other block sizes as will be described in the following.
- the slice header data may comprise a syntax element indicating the quality level.
- the formation unit 128 may be designed such that the syntax element or quality indicator merely distinguishes between the zero order quality level 28 on the one hand and the refinement layers 30 1 to 30 N on the other hand.
- the quality indicator may distinguish all quality layers among the refinement layers 28 and 30 1 to 30 N .
- the quality indicator would enable the omission of any macroblock-wise defined macroblock modes, motion parameters and/or inferring modes within the packets of the sub-data streams 30 1 to 30 N since, in this case, it is known at the decoder side that these refinement layer sub-data streams 30 1 to 30 N merely refine the transform coefficients using the macroblock modes, motion parameters and inferring modes from the zero order sub-data stream 28 .
- the formation unit 128 may be designed to entropy code the packets within the sub-data streams 28 and 30 1 to 30 N .
- FIGS. 8 and 9 show possible examples for coding the transform coefficients within the residual data pertaining to one transform block according to two embodiments.
- FIG. 8 shows a pseudo code of a first example for a possible coding of the transform coefficients within a transform block in any of the residual data 172 .
- the following example applies:
- total_coeff(coeff_token) is 5 (transform coefficient numbers 0, 1, 2, 4 and 7) and trailing_ones(coeff_token) is 2 (transform coefficient numbers 4 and 7).
- the positions of the significant transform coefficients have been determined to the extent that no more than total_coeff(coeff_token) non-zero transform coefficients exist.
- the non-zero transform coefficients are stepped through in a reverse scan order 244 .
- the reverse scan order is not yet obvious from just viewing the counting parameter incrementation i++ in the for-loop 244 but will become clear from the following evaluation.
- stepping through these non-zero transform coefficients in reverse scan order, for the first of them just the transform coefficient sign is provided 248 . This is done for the first trailing_ones(coeff_token) of the non-zero transform coefficients when stepping through them in reverse scan order, since for these transform coefficients it is already known that their absolute value is one (compare with the above definition of trailing_ones(coeff_token)).
- the coefficient signs thus provided are used to temporarily store, in auxiliary vector coefficients level[i], the transform coefficient levels of the non-zero transform coefficients having absolute value 1, wherein i is a numbering of the non-zero transform coefficients when scanned in reverse scan order ( 250 ).
- the coefficient levels coeff_level for the remaining non-zero transform coefficients are provided ( 252 ) in reverse scan order and temporarily stored in the auxiliary vector coefficients level [i] ( 254 ).
- a parameter run_before is provided ( 264 ) indicating the length of the run of zero-level transform coefficients arranged directly in front of the respective non-zero transform coefficient when seen in scan order.
- the last non-zero transform coefficient with respect to the scan order is the non-zero transform coefficient in question.
- this is the transform coefficient having number 7 and level 1.
- the run of zeros in front of this transform coefficient has a length of 2, i.e. transform coefficients 5 and 6.
- the first run_before-parameter is 2.
- This parameter is temporarily stored in auxiliary vector coefficient run[ 0 ] ( 266 ). This is repeated in reverse scan order for run[i], with i being the count of the non-zero transform coefficients when scanned in reverse scan order.
- By decreasing the auxiliary parameter zerosLeft by the parameter run_before in each round of the for-loop ( 261 ), it is determined for each round how many zero-level transform coefficients are left. If zerosLeft is zero, no run_before parameter is provided anymore ( 270 ) and the remaining coefficients of the vector run are set to zero ( 272 ).
- no run_before parameter is provided for the last non-zero transform coefficient when stepped through in reverse scan order, i.e. no run_before parameter for the first non-zero transform coefficient with respect to the scan order.
- This parameter is deduced from the number of zero-level transform coefficients left, as indicated by the auxiliary parameter zerosLeft ( 274 ).
- the values of the transform coefficient levels as stored in auxiliary vector level are assigned to their positions by copying the values of the coefficients of vector level to the respective position in the one-dimensional array coeffLevel.
- This is repeated for the next auxiliary vector coefficients level[3] to level[0]. Since the remaining positions of the array coeffLevel have been initialised to the value of zero ( 280 ), all transform coefficients have been coded.
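The placement step just described can be sketched as follows. The code reproduces the example of the text (5 non-zero coefficients at scan positions 0, 1, 2, 4 and 7, the last two being trailing ones); the levels other than the trailing ones are invented.

```python
# Sketch of the final placement step of FIG. 8: the levels, stored while
# stepping through the non-zero coefficients in reverse scan order, are put
# back at their scan positions using the run_before values.

def place(levels, runs, block_size=16):
    """levels[i], runs[i]: i counts non-zero coefficients in reverse scan order."""
    coeff_level = [0] * block_size             # initialised to zero, cf. (280)
    pos = -1
    for i in reversed(range(len(levels))):     # forward scan order again
        pos += runs[i] + 1                     # skip the run of zeros in front
        coeff_level[pos] = levels[i]
    return coeff_level

levels = [1, 1, 2, -3, 7]                      # for scan positions 7, 4, 2, 1, 0
runs   = [2, 1, 0, 0, 0]                       # zeros directly in front of each
assert place(levels, runs)[:8] == [7, -3, 2, 0, 1, 0, 0, 1]
```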
- the bold written syntax elements in FIG. 8 may be coded into the respective sub-data stream by means of variable length coding, for example.
- FIG. 9 shows another example for coding a transform block.
- the scanning order manifests itself in “i++” within the while-loop 310 , indicating that the counting parameter i is incremented per while-loop iteration.
- numCoeff is i+1 ( 320 ) and the levels of the subsequent transform coefficients can be deduced to be zero ( 322 ).
- the syntax elements last_significant_coeff_flag and significant_coeff_flag may be seen as a significance map.
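A minimal sketch of such a significance map, assuming flags are only coded up to the last non-zero coefficient; pairing the two flags per position is a simplification of the actual bin-by-bin coding, in which the last flag is only coded after a significant flag equal to 1.

```python
# Minimal sketch of a significance map: for each scan position up to the
# last non-zero coefficient, significant_coeff_flag states whether a level
# is present, and last_significant_coeff_flag marks the final one; all
# levels after it are inferred to be zero.

def significance_map(coeffs):
    """coeffs: transform coefficient levels in scan order, at least one non-zero."""
    last = max(i for i, c in enumerate(coeffs) if c != 0)
    flags = []
    for i, c in enumerate(coeffs[:last + 1]):
        sig = int(c != 0)
        flags.append((sig, int(sig and i == last)))
    return flags                               # (significant, last) per position

coeffs = [7, -3, 2, 0, 1, 0, 0, 1, 0, 0]
fm = significance_map(coeffs)
assert len(fm) == 8                            # nothing coded past position 7
assert fm[3] == (0, 0) and fm[7] == (1, 1)
```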
- the absolute value of the level minus 1, i.e. coeff_abs_level_minus1, and its sign, i.e. coeff_sign_flag, are provided ( 324 ), thereby indicating the transform coefficient level of this last significant transform coefficient ( 326 ).
- the parsing of the syntax element coeff_abs_level_minus1 begins with deriving a binarization for the possible values of the syntax element.
- the binarization scheme may be UEG0, i.e. a concatenated unary/zero-th order Exp-Golomb binarization process.
- the respective syntax element may be binary arithmetically coded bin by bin.
- a context adaptive binary arithmetic coding scheme may be used for a prefix part of the binarization of coeff_abs_level_minus1, while a bypass process having no adaptation is used for a suffix part.
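The UEG0 binarization can be sketched as follows; the cut-off value 14 is the one used for coeff_abs_level_minus1 in H.264/AVC, and only the bin strings are derived, the context adaptive and bypass arithmetic coding of the bins being omitted.

```python
# Sketch of the UEG0 binarization of coeff_abs_level_minus1: a truncated
# unary prefix with cut-off 14, concatenated with a 0th-order Exp-Golomb
# suffix once the prefix saturates.

U_COFF = 14                                    # truncated unary cut-off

def ueg0_bins(value):
    if value < U_COFF:
        return "1" * value + "0"               # plain truncated unary
    k = value - U_COFF                         # remainder coded in the suffix
    m = (k + 1).bit_length() - 1               # number of leading suffix ones
    return "1" * U_COFF + "1" * m + "0" + bin(k + 1)[3:]

assert ueg0_bins(0) == "0"
assert ueg0_bins(3) == "1110"
assert ueg0_bins(14) == "1" * 14 + "0"         # first value using the suffix
assert ueg0_bins(15) == "1" * 14 + "100"
```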
- the number of distinguishable scan positions within the 8×8 transform blocks is 64, whereas the number of distinguishable scan positions within the 4×4 transform blocks is merely 16.
- the abovementioned syntax element scan_idx_start and scan_idx_end may either be defined in an accuracy enabling a distinction between all 64 scan positions, or merely a distinction between 16 scan positions.
- the syntax elements may be applied to each quadruple of consecutive transform coefficients within the 8×8 transform blocks.
- 8×8 transform blocks may be coded by use of residual_block, being either residual_block_cavlc or residual_block_cabac, with LumaLevel4x4 and LumaLevel8x8 indicating an array of luma samples of the respective 4×4 and 8×8 transform block, respectively.
- scan_idx_start and scan_idx_end are defined to discriminate between 16 scan positions so that they indicate the range of positions in 4×4 blocks exactly. However, in 8×8 blocks, the accuracy of these syntax elements is not sufficient, so that in these blocks the range is adjusted quadruple-wise.
- 8×8 blocks of transform coefficients can also be encoded by partitioning the 64 coefficients of an 8×8 block into 4 sets of 16 coefficients, for example by placing every fourth coefficient into the n-th set starting with coefficient n, with n in the range of 0 to 3, inclusive, and coding each set of 16 coefficients using the residual block syntax for 4×4 blocks.
- these 4 sets of 16 coefficients are re-combined to form a set of 64 coefficients representing an 8×8 block.
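The interleaved partitioning just described can be sketched directly:

```python
# Sketch of the partitioning: the 64 coefficients of an 8x8 block (in scan
# order) are split into 4 sets of 16 by placing every fourth coefficient
# into the n-th set starting with coefficient n; merging is lossless.

def split_8x8(coeffs):
    return [coeffs[n::4] for n in range(4)]    # set n: n, n+4, n+8, ...

def merge_8x8(sets):
    out = [0] * 64
    for n, s in enumerate(sets):
        out[n::4] = s                          # re-interleave the 4 sets
    return out

coeffs = list(range(64))
sets = split_8x8(coeffs)
assert all(len(s) == 16 for s in sets)         # 4x4 residual syntax applies
assert sets[1][:3] == [1, 5, 9]
assert merge_8x8(sets) == coeffs               # lossless round trip
```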
- FIG. 10 shows the general construction of a decoder 400 .
- the decoder 400 comprises a demultiplexer 402 having an input 404 for receiving the scalable bit-stream 36 .
- the demultiplexer 402 demultiplexes the input signal 36 into the data streams 26 to 32 .
- the demultiplexer may also perform a decoding and/or parsing function.
- the demultiplexer 402 may decode the transform block codings of FIGS. 8 and 9 . Further, recall FIGS. 6 a to 6 g .
- demultiplexer 402 may use information of preceding sub-data streams in order to, in parsing a current sub-data stream, know how many transform coefficient values or contribution values are to be expected for a specific transform block.
- the data-streams thus retrieved are received by a decoding unit 406 which, based on these data-streams, reconstructs the video 18 and outputs the respective reconstructed video 408 at a respective output 410 .
- the decoding unit 406 comprises a base layer motion data input 412 , a base layer residual data input 414 , a zero order refinement layer motion data input 416 , an optional zero order refinement transform coefficient data input 418 and an input 420 for the sub-data streams 30 .
- inputs 412 and 414 are for receiving data-stream 26
- inputs 416 and 418 cooperate to receive data-stream 28 .
- the decoding unit 406 comprises a lower quality reconstruction video signal output 422 , a higher quality inter-layer coded reconstruction video signal output 424 , and an internally coded reconstruction video signal output 426 , the latter two providing the information for a higher quality video signal.
- the combiner 428 may use the knowledge of the transform coefficient values within the individual transform blocks received so far from lower quality or SNR layers.
- the transform blocks output by combiner 428 are received by a residual predictive decoder 430 and an adder 432 .
- a back- or inverse-transformation unit 432 is connected in order to forward inversely transformed residual data to the residual predictive decoder 430 .
- the latter uses the inversely transformed residual data in order to obtain a predictor to be added to the transform coefficients of the transform blocks output by combiner 428 , possibly after performing an up-sampling or another quality adaptation.
- a motion prediction unit 434 is connected between the input 412 and an input of an adder 436 .
- Another input of the adder 436 is connected to the output of a back-transformation unit 432 .
- the motion prediction unit 434 uses the motion data on input 412 to generate a prediction signal for the inversely transformed residual signal output by the back-transformation unit 432 .
- the result at the output of adder 436 is a reconstructed base layer video signal.
- the output of adder 436 is connected to the output 422 as well as to an input of predictive decoder 432 .
- the predictive decoder 432 uses the reconstructed base layer signal as a prediction for the intra-layer coded portions of the video content output by combiner 428 , possibly by use of an up-sampling.
- the output of adder 436 is also connected to an input of motion prediction unit 434 in order to enable the motion prediction unit 434 to use the motion data at input 412 to generate a prediction signal for the second input of adder 436 based on the reconstructed signals from the base layer data stream.
- the predictively decoded transform coefficient values output by residual predictive decoder 430 are back-transformed by back-transformation unit 438 .
- At the output of back-transformation unit 438 , higher quality residual video signal data results. This higher quality residual video signal is added by an adder 440 to a motion predicted video signal output by a motion prediction unit 442 .
- Thereby, the reconstructed high quality video signal results, which reaches output 424 as well as a further input of motion prediction unit 442 .
- the motion prediction unit 442 performs the motion prediction based on the reconstructed video signal output by adder 440 as well as the motion information output by a motion parameter prediction decoder 444 which is connected between input 416 and a respective input of motion prediction unit 442 .
- the motion parameter predictive decoder 444 uses, on a macroblock-selective basis, motion data from the base layer motion data input 412 as a predictor and, dependent thereon, outputs the motion data to the motion prediction unit 442 , using, for example, the motion vectors at input 416 as offset vectors to the motion vectors at input 412 .
- the above described embodiments enable an increase in the granularity of SNR scalable coding on a picture/slice level in comparison to CGS/MGS coding as described in the introductory portion, but without the significant increase in complexity that is present in FGS coding. Furthermore, since it is believed that the feature of FGS that packets can be truncated will not widely be used, the bit-stream adaptation is possible by simple packet dropping.
- the above described embodiments have the basic idea in common, to partition the transform coefficient levels of a traditional CGS/MGS packet as it is currently specified in the SVC draft into subsets, which are transmitted in different packets and different SNR refinement layers.
- the above described embodiments concerned the CGS/MGS coding with one base and one enhancement layer.
- the enhancement layer includes, for each picture, macroblock modes, intra prediction modes, motion vectors, reference picture indices, other control parameters as well as transform coefficient levels for all macroblocks; in order to increase the granularity of the SNR scalable coding, these data were distributed over different slices, different packets, and different enhancement layers.
- the macroblock modes, motion parameter, other control parameters as well as, optionally, a first subset of transform coefficient levels are transmitted.
- the same macroblock modes and motion vectors are used, but a second subset of transform coefficient levels are encoded. All transform coefficients that have already been transmitted in the first enhancement layer may be set to zero in the second and all following enhancement layers.
- the macroblock modes and motion parameters of the first enhancement layer are again used, but further subsets of transform coefficient levels are encoded.
- a flag is encoded at the slice level, which signals whether all macroblock modes and motion parameters are inferred from the base layer.
- this flag should usually be set to 0 for the first enhancement layer, since for this enhancement layer it should be possible to transmit motion vectors that are different from the base layer in order to improve the coding efficiency. But in all further enhancement layers, this flag is set equal to 1, since these enhancement layers only represent a refinement of transform coefficient levels of scan positions that haven't been encoded in the previous SNR enhancement layers. By setting this flag equal to 1, the coding efficiency can be improved for this case, since no transmission of non-required syntax elements is necessary and thus the associated bit-rate is saved.
- the first scanning position x for the transform coefficient levels in the various transform blocks may be transmitted at a slice level, with no syntax elements being transmitted at a macroblock level for transform coefficients with a scanning position that is smaller than x.
- while the first scanning position may be transmitted only for a specific transform size, with the first scanning position for other transform sizes being inferred based on the transmitted value, it would alternatively be possible to transmit a first scanning position for all supported transform sizes.
- the last scanning position y for the transform coefficient levels in the various transform blocks may be transmitted at a slice level, with no syntax elements being transmitted at a macroblock level for transform coefficients with a scanning position that is greater than y.
- the first scanning position for each transform block in an SNR enhancement layer may alternatively be inferred based on the transform coefficients that have been transmitted in a previous enhancement layer.
- This inference rule may be independently applied to all transform blocks, and in each block a different first transform coefficient can be derived by, for example, combiner 428 .
- the first scanning position may basically be inferred based on already transmitted transform coefficient levels in previous SNR enhancement layers, but additionally the knowledge is used that the first scanning position cannot be smaller than a value x, which is transmitted in the slice header.
- this yields a first scanning index in each transform block, which can be chosen in order to maximize the coding efficiency.
- the signaling of the first scan position, the inference of the first scan position, or the combination of them may be combined with the signaling of the last scanning position.
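A hypothetical sketch of the combined inference rule described above; the function name and the max-based rule are assumptions consistent with the description, not the patent's normative derivation.

```python
# Hypothetical sketch of the combined inference: per transform block, the
# first scan position of the current SNR enhancement layer is the larger of
# the slice-level bound x and the first position not yet covered by the
# coefficients transmitted in previous enhancement layers.

def first_scan_position(transmitted_positions, x):
    """transmitted_positions: scan positions sent in previous layers."""
    upper = max(transmitted_positions) + 1 if transmitted_positions else 0
    return max(x, upper)

assert first_scan_position([0, 1, 2, 3], x=2) == 4   # inference dominates
assert first_scan_position([0, 1], x=6) == 6         # slice-level bound wins
assert first_scan_position([], x=0) == 0
```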
- slice header syntax elements which specify that macroblock modes and motion parameters are inferred for all macroblock types and/or that transform coefficients for several scanning positions are not present at a transform block level.
- a slice level syntax element may be used that signals that the macroblock modes and motion parameters for all macroblocks are inferred from the co-located base layer macroblocks. Specifically, the same macroblock modes and motion parameters may be used, and the corresponding syntax elements may not be transmitted at a slice level.
- the first scanning position x for all transform blocks may be signaled by slice header syntax elements.
- no syntax elements are transmitted for transform coefficient values of scanning positions smaller than x.
- the first scanning position for a transform block may be inferred based on the transmitted transform coefficient levels of the base layer. A combination of the latter alternatives is also possible.
- the last scanning position y for all transform blocks may be signaled by slice header syntax elements, wherein, at the macroblock level, no syntax elements are transmitted for transform coefficient values of scanning positions greater than y.
- FIG. 1-11 may be varied in various ways.
- although the above embodiments were exemplified with respect to a two spatial layer environment, they are readily transferable to an embodiment with only one quality layer, or with more than one quality layer but with the N+1 SNR scalable refinement layers. Imagine, for example, that part 12 in FIG. 1 is missing.
- hybrid coder 42 acts as a coding means for coding the video signal 18 using block-wise transformation to obtain transform blocks 146, 148 of transformation coefficient values for a picture 140 of the video signal, while unit 44 acts as a means for forming, for each of a plurality of quality layers, a video sub-data stream 30 or 28 plus 30 containing scan range information indicating a sub-set of the possible scan positions, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions. No inter-layer prediction would be involved. Moreover, coder 42 may be simplified to perform no motion prediction but merely block-wise transformation.
- demultiplexer 402 would act as a parsing means for parsing the video sub-data streams of the plurality of quality layers, to obtain, for each quality layer, the scan range information and the transform coefficient information.
- the combiner 428 would act as a means for, using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions, with the back-transformation unit 438 reconstructing the picture of the video signal by a back-transformation of the transform blocks.
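The construction step performed by the combiner can be sketched as follows. This is a simplified illustration of our own, not the actual combiner 428 implementation: each quality layer contributes values for its scan-position sub-set, and where sub-sets of several layers overlap, the contributions for a position are summed, as the embodiments above describe.

```python
# Sketch of decoder-side block construction: each quality layer carries
# (scan positions, contribution values); overlapping positions accumulate.

def construct_block(layers, block_size):
    """layers: list of (scan_positions, contribution_values), one per layer."""
    coeffs = [0] * block_size
    for positions, values in layers:
        for pos, val in zip(positions, values):
            coeffs[pos] += val   # sum of contributions yields the final level
    return coeffs

base = ([0, 1, 2], [8, -3, 2])   # base quality layer: scan positions 0..2
enh1 = ([2, 3, 4], [1,  2, 1])   # refinement layer, overlapping at position 2
print(construct_block([base, enh1], 8))   # -> [8, -3, 3, 2, 1, 0, 0, 0]
```

The resulting coefficient array is then handed to the back-transformation, so only a single inverse transform per block is needed regardless of how many layers were received.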
- the embodiment in FIG. 1 may be varied in a way that the base layer coder 12 operates with the same spatial resolution and the same bit depth as the enhancement layer coder 14 .
- the embodiment represents SNR scalable coding with a standard base layer and various enhancement layers 28 , 30 that contain partitions of the transform coefficients.
- the inventive scheme can be implemented in hardware or in software. Therefore, the present invention also relates to a computer program, which can be stored on a computer-readable medium such as a CD, a disk or any other data carrier.
- The present invention is, therefore, also a computer program having a program code which, when executed on a computer, performs the inventive method described in connection with the above figures.
Abstract
An apparatus for generating a quality-scalable video data stream includes a coder for coding a video signal using block-wise transformation to obtain transform blocks of transformation coefficient values for a picture of the video signal, a predetermined scan order with possible scan positions being defined among the transformation coefficient values within the transform blocks so that in each transform block, for each possible scan position, at least one of the transformation coefficient values within the respective transform block belongs to the respective possible scan position; and a generator for forming, for each of a plurality of quality layers, a video sub-data stream containing scan range information indicating a sub-set of the possible scan positions, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions such that the sub-set of each quality layer includes at least one possible scan position not included by the sub-set of any other of the plurality of quality layers.
Description
- This application is a U.S. national entry of PCT Patent Application Serial No. PCT/EP2007/003411 filed 18 Apr. 2007, and claims priority to U.S. Patent Application No. 60/885,534 filed 18 Jan. 2007, both of which are incorporated herein by reference in their entirety.
- The present invention relates to quality-scalable video data streams, their generation and decoding such as the generation and decoding of video data streams obtained by use of block-wise transformation.
- The Joint Video Team "JVT" of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group (MPEG) is currently specifying a scalable extension of the H.264/MPEG4-AVC video coding standard. The key feature of scalable video coding (SVC) in comparison to conventional single-layer coding is that various representations of a video source with different resolutions, frame rates and/or bit-rates are provided inside a single bit-stream. A video representation with a specific spatio-temporal resolution and bit-rate can be extracted from a global SVC bit-stream by simple stream manipulations such as packet dropping. As an important feature of the SVC design, most components of H.264/MPEG4-AVC are used as specified in the standard. This includes the motion-compensated and intra prediction, the transform and entropy coding, the deblocking as well as the NAL unit packetization (NAL=Network Abstraction Layer). The base layer of an SVC bit-stream is generally coded in compliance with H.264/MPEG4-AVC, and thus each standard-conforming H.264/MPEG4-AVC decoder is capable of decoding the base layer representation when it is provided with an SVC bit-stream. New tools are only added for supporting spatial and SNR scalability.
- For SNR scalability, coarse-grain/medium-grain scalability (CGS/MGS) and fine-grain scalability (FGS) are distinguished in the current Working Draft. Coarse-grain or medium-grain SNR scalable coding is achieved by using concepts similar to those for spatial scalability. The pictures of different SNR layers are independently coded with layer-specific motion parameters. However, in order to improve the coding efficiency of the enhancement layers in comparison to simulcast, additional inter-layer prediction mechanisms have been introduced. These prediction mechanisms have been made switchable so that an encoder may freely choose which base layer information should be exploited for an efficient enhancement layer coding. Since the incorporated inter-layer prediction concepts include techniques for motion parameter and residual prediction, the temporal prediction structures of the SNR layers should be temporally aligned for an efficient use of the inter-layer prediction. It should be noted that all NAL units for a time instant form an access unit and thus have to follow each other inside an SVC bit-stream. The following three inter-layer prediction techniques are included in the SVC design.
- The first one is called inter-layer motion prediction. In order to employ base-layer motion data for the enhancement layer coding, an additional macroblock mode has been introduced into SNR enhancement layers. The macroblock partitioning is obtained by copying the partitioning of the co-located macroblock in the base layer. The reference picture indices as well as the associated motion vectors are copied from the co-located base layer blocks. Additionally, a motion vector of the base layer can be used as a motion vector predictor for the conventional macroblock modes.
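The three options the enhancement-layer encoder has for a macroblock's motion parameters can be sketched as follows. The function and dictionary keys below are our own illustrative names, not SVC syntax; the sketch only shows the adopt/predict/new decision described above.

```python
# Illustrative sketch: three ways enhancement-layer motion data can be coded
# relative to the co-located base layer macroblock (names are hypothetical).

def motion_for_enhancement(base_mb, mode, new_mv=None):
    if mode == "copy":            # inter-layer motion prediction: adopt base
        return dict(base_mb)      # partitioning, ref indices, MVs all copied
    if mode == "predict":         # base MV serves as predictor; delta coded
        dx = new_mv[0] - base_mb["mv"][0]
        dy = new_mv[1] - base_mb["mv"][1]
        return {"mv_delta": (dx, dy), "ref_idx": base_mb["ref_idx"]}
    return {"mv": new_mv}         # coded anew, independent of the base layer

base = {"mv": (4, -2), "ref_idx": 0}
print(motion_for_enhancement(base, "predict", new_mv=(5, -2)))
# -> {'mv_delta': (1, 0), 'ref_idx': 0}
```

In the "copy" case no motion syntax elements need to be transmitted for the macroblock at all, which is what the additional macroblock mode achieves.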
- The second technique of redundancy reduction among the various quality layers is called inter-layer residual prediction. The usage of inter-layer residual prediction is signaled by a flag (residual_prediction_flag) that is transmitted for all inter-coded macroblocks. When this flag is true, the base layer signal of the co-located block is used as prediction for the residual signal of the current macroblock, so that only the corresponding difference signal is coded.
- Finally, inter-layer intra prediction is used in order to exploit redundancy among the layers. In this intra-macroblock mode, the prediction signal is built by the co-located reconstruction signal of the base layer. For inter-layer intra prediction it is generally necessary that base layers are completely decoded, including the computationally complex operations of motion-compensated prediction and deblocking. However, it has been shown that this problem can be circumvented when the inter-layer intra prediction is restricted to those parts of the lower layer picture that are intra-coded. With this restriction, each supported target layer can be decoded with a single motion compensation loop. This single-loop decoding mode is mandatory in the scalable H.264/MPEG4-AVC extension.
- Since inter-layer intra prediction can only be applied when the co-located macroblock is intra-coded, and inter-layer motion prediction with inferring the macroblock type can only be applied when the base layer macroblock is inter-coded, both modes are signaled via a single syntax element base_mode_flag on a macroblock level. When this flag is equal to 1, inter-layer intra prediction is chosen when the base layer macroblock is intra-coded. Otherwise, the macroblock mode as well as the reference indices and motion vectors are copied from the base layer macroblock.
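The inference rule for base_mode_flag just described is small enough to state directly. The sketch below is a simplification of our own (the real SVC semantics carry further conditions); it only encodes the decision stated in the preceding paragraph.

```python
# Simplified sketch of the base_mode_flag inference rule described above.

def infer_mb_mode(base_mode_flag, base_mb_is_intra):
    if base_mode_flag == 1:
        if base_mb_is_intra:
            return "inter_layer_intra_prediction"
        return "copy_base_motion"   # mode, ref indices and MVs from base MB
    return "coded_explicitly"       # conventional macroblock coding

print(infer_mb_mode(1, True))    # -> inter_layer_intra_prediction
print(infer_mb_mode(1, False))   # -> copy_base_motion
```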
- In order to support a finer granularity than CGS/MGS coding, so-called progressive refinement slices have been introduced which enable finer granular SNR scalable coding (FGS). Each progressive refinement slice represents a refinement of the residual signal that corresponds to a bisection of the quantization step size (QP increase of 6). These signals are represented in a way that only a single inverse transform has to be performed for each transform block at the decoder side. The ordering of transform coefficient levels in progressive refinement slices allows the corresponding NAL units to be truncated at any arbitrary byte-aligned point, so that the quality of the SNR base layer can be refined in a fine-granular way. In addition to a refinement of the residual signal, it is also possible to transmit a refinement of motion parameters as part of the progressive refinement slices.
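The correspondence between a step-size bisection and a QP difference of 6 can be checked numerically. The formula below is the widely used approximation of the H.264/MPEG4-AVC QP-to-step-size mapping, not text from the draft itself.

```python
# Under the common approximation Qstep(QP) ~ 2**((QP - 4) / 6) for
# H.264/MPEG4-AVC, a QP change of 6 scales the quantization step size by 2,
# so a bisection of the step size corresponds to a QP difference of 6.

def qstep(qp):
    return 2 ** ((qp - 4) / 6)

print(qstep(28) / qstep(22))   # -> 2.0 (6 more QP steps double Qstep)
print(qstep(22) / qstep(28))   # -> 0.5 (one refinement bisects Qstep)
```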
- One drawback of the FGS coding in the current SVC draft is that it significantly increases the decoder complexity in comparison to CGS/MGS coding. On the one hand, the transform coefficients in a progressive refinement slice are coded using several scans over the transform blocks, and in each scan only a few transform coefficient levels are transmitted. For the decoder this increases the complexity since a higher memory bandwidth is needed, because all transform coefficient levels from different scans need to be collected before the inverse transform can be carried out. On the other hand, the parsing process for progressive refinement slices is dependent on the syntax elements of the corresponding base layer slices. The order of syntax elements as well as the codeword tables for VLC coding or the probability model selection for arithmetic coding depend on the syntax elements in the base layer. This further increases the memory bandwidth for decoding, since the syntax elements of the base layer need to be accessed during the parsing of the enhancement layer.
- Furthermore, the special property of progressive refinement slices that they can be truncated is difficult to use in today's packet-switched networks. Usually, a media-aware network device will either deliver or drop a packet of a scalable bit-stream, and the only error that will be visible at the application layer is a packet loss.
- Therefore, not only in view of the above H.264/MPEG4-AVC but also with other video compression techniques, it would be desirable to have a coding scheme that is better adapted to today's needs, where packet loss rather than byte-wise truncation is the relevant problem.
- According to an embodiment, an apparatus for generating a quality-scalable video data stream, may have: a coder for coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and a generator for forming, for each of a plurality of quality layers, a video sub-data stream having scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers has at least one possible scan position not included by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is included by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, having a contribution value per possible scan position of the sub-set of possible scan positions of the respective quality layer, such that the transform coefficient value of the one possible scan position is derivable based on a sum of the contribution values for the one possible scan position of the more than one of the sub-sets of the quality layers.
- According to another embodiment, an apparatus for reconstructing a video signal from a quality-scalable video data stream having, for each of a plurality of quality layers, a video sub-data stream, may have: a parser for parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions; a constructor for, using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and a reconstructor for reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein the parser is configured such that the transform coefficient information of more than one of the quality layers may have a contribution value relating to one transformation coefficient value, and wherein the constructor is configured to derive the value of the one transform coefficient value based on a sum of the contribution values relating to the one transformation coefficient value.
- According to another embodiment, a method for generating a quality-scalable video data stream, may have the steps of: coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and forming, for each of a plurality of quality layers, a video sub-data stream having scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers has at least one possible scan position not included by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is included by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, having a contribution value per possible scan position of the sub-set of possible scan positions of the respective quality layer such that the transform coefficient value of the one possible scan position is derivable based on a sum of the contribution values for the one possible scan position of the more than one of the sub-sets of the quality layers.
- According to another embodiment, a method for reconstructing a video signal from a quality-scalable video data stream having, for each of a plurality of quality layers, a video sub-data stream, may have the steps of: parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions; using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein the parsing the video sub-data streams is performed such that the transform coefficient information of more than one of the quality layers may have a contribution value relating to one transformation coefficient value, and the constructing the transform blocks has the step of deriving the value for the one transform coefficient value based on a sum of the contribution values relating to the one transformation coefficient value.
- According to another embodiment, a quality-scalable video data stream enabling a reconstruction of a video signal may have, for each of a plurality of quality layers, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions, wherein the transform coefficient information concerns transformation coefficient values belonging to the sub-set of possible scan positions, wherein the transform coefficient information of more than one of the quality layers has a contribution value relating to one transformation coefficient value, and the transform coefficient value of the one possible scan position is derivable based on a sum of the contribution values for the one possible scan position of the more than one of the sub-sets of the quality layers.
- According to another embodiment, a computer-program may have a program code for performing, when running on a computer, a method for generating a quality-scalable video data stream, the method having the steps of: coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and forming, for each of a plurality of quality layers, a video sub-data stream having scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers has at least one possible scan position not included by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is included by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, having a contribution value per possible scan position of the sub-set of possible scan positions of the respective quality layer such that the transform coefficient value of the one possible scan position is derivable based on a sum of the contribution values for the one possible scan position of the more than one of the sub-sets of the quality layers.
- According to another embodiment, a computer-program may have a program code for performing, when running on a computer, a method for reconstructing a video signal from a quality-scalable video data stream having, for each of a plurality of quality layers, a video sub-data stream, the method having the steps of: parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions; using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein the parsing the video sub-data streams is performed such that the transform coefficient information of more than one of the quality layers may have a contribution value relating to one transformation coefficient value, and the constructing the transform blocks has the step of deriving the value for the one transform coefficient value based on a sum of the contribution values relating to the one transformation coefficient value.
- In accordance with an embodiment of the present invention, an apparatus for generating a quality-scalable video data stream, comprises means for coding a video signal using block-wise transformation to obtain transform blocks of transformation coefficient values for a picture of the video signal, a predetermined scan order with possible scan positions being defined among the transformation coefficient values within the transform blocks so that in each transform block, for each possible scan position, at least one of the transformation coefficient values within the respective transform block belongs to the respective possible scan position; and means for forming, for each of a plurality of quality layers, a video sub-data stream containing scan range information indicating a sub-set of the possible scan positions, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions such that the sub-set of each quality layer comprises at least one possible scan position not comprised by the sub-set of any other of the plurality of quality layers.
- Further, in accordance with an embodiment of the present invention, an apparatus for reconstructing a video signal from a quality-scalable video data stream comprising, for each of a plurality of quality layers, a video sub-data stream, comprises means for parsing the video sub-data streams of the plurality of quality layers, to obtain, for each quality layer, a scan range information and transform coefficient information on transformation coefficient values of different transform blocks, a predetermined scan order with possible scan positions being defined among the transformation coefficient values within the transform blocks so that in each transform block, for each possible scan position, at least one of the transformation coefficient values within the respective transform block belongs to the respective possible scan position, and the scan range information indicating a sub-set of the possible scan positions; means for, using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and means for reconstructing a picture of the video signal by a back-transformation of the transform blocks.
- Embodiments of the present invention will be detailed subsequently referring to the appended drawings, in which:
- FIG. 1 shows a block diagram of an encoder generating a quality-scalable video data stream according to an embodiment;
- FIG. 2 shows a block diagram of a higher-layer hybrid coder of FIG. 1 according to an embodiment;
- FIG. 3 shows a block diagram of a base-layer hybrid coder of FIG. 1 according to an embodiment;
- FIG. 4 shows a block diagram of a layer coding unit of the higher quality layer of FIG. 1 according to an embodiment;
- FIG. 5 shows a schematic diagram illustrating the structure of a picture as well as its block-wise transformation according to an embodiment;
- FIGS. 6a-6g show schematic diagrams of a scanned portion of a transform block and its partitioning into sub-layers according to several embodiments;
- FIG. 7 shows a schematic diagram illustrating the construction of sub-data streams according to an embodiment;
- FIG. 8 shows a pseudo-code illustrating the coding of the transform coefficient levels belonging to a specific sub-data stream according to an embodiment;
- FIG. 9 shows a pseudo-code illustrating another example for coding the transform coefficient levels belonging to a specific sub-data stream;
- FIG. 10 shows a block diagram of a decoder according to another embodiment; and
- FIG. 11 shows a block diagram of an embodiment for the decoding unit of FIG. 10.
- FIG. 1 shows an encoder for generating a quality-scalable bit-stream. Exemplarily, the encoder 10 of FIG. 1 is dedicated to generating a scalable bit-stream supporting two different spatial layers and N+1 SNR layers. To this end, the encoder 10 is structured into a base layer part 12 and a spatial enhancement layer part 14. A quality reduction unit 16 of encoder 10 receives the original or higher quality video 18 representing a sequence of pictures and reduces its quality—in the sense of spatial resolution in the example of FIG. 1—to obtain a lower quality version 22 of the original video 18 consisting of a sequence of pictures 24, the lower quality version 22 being input into the base layer part 12.
- The quality reduction unit 16 performs, for example, a sub-sampling of the pictures by a sub-sampling factor of 2. However, it is to be understood that although FIG. 1 shows an example supporting two spatial layers, the embodiment of FIG. 1 may readily be applied to applications where the quality reduction performed between the original video 18 and the lower quality video 22 does not comprise a sub-sampling but, for example, a reduction in the bit-depth of the representation of the pixel values, or where the quality reduction unit simply copies the input signal to the output signal.
- While the base layer part 12 receives the lower quality video 22, the original video 18 is input into the higher quality part 14, with both parts 12 and 14 operating in parallel. The base layer part 12 receives the lower quality video 22 and generates a base layer bit-stream 26. On the other hand, the higher quality layer part 14 receives at its input the original video 18 and generates, besides a spatial enhancement layer bit-stream 28, N SNR refinement layer bit-streams 30. The generation of and the interrelationship between these bit-streams will be described in more detail below. The base layer part 12 could also accompany the base layer bit-stream 26 by several SNR refinement layers 32. However, in order to ease the illustration of the principles of the present embodiment, it is assumed that SNR scalability is restricted to the enhancement layer part 14. However, the following discussion will reveal that the functionality described below with respect to the higher quality layer part 14 with regard to the SNR refinement layers is readily transferable to the base layer part 12. This is indicated in FIG. 1 by a dotted line 32.
- All bit-streams 26 to 32 are input into a multiplexer 34, which generates a scalable bit-stream 36 from the data streams at its input, possibly arranged in packets, as will be described in more detail below.
- Internally, base layer part 12 comprises a hybrid coder 38 and a layer coding unit 40 connected in series, in the order mentioned, between the input to which the low quality video 24 is applied, on the one hand, and the multiplexer 34, on the other hand. Similarly, the higher quality layer part 14 comprises a hybrid coder 42 and a layer coding unit 44 connected between the input to which the original video 18 is applied and the multiplexer 34. Each hybrid coder 38 and 42 converts its incoming video signal into motion information data 46 and 48, respectively, and residual data 50 and 52, respectively, which are forwarded to the respective layer coding unit 40 and 44.
- Naturally, redundancy exists between the motion data 46, on the one hand, and 48, on the other hand, as well as between the residual data 50 and 52. This redundancy is exploited by the hybrid coder 42. In particular, on a macroblock basis, the hybrid coder 42 can choose between several inter-layer prediction options. For example, the hybrid coder 42 can decide to use or adopt the base layer motion data 46 as the motion data 48 for the higher quality layer. Alternatively, the hybrid coder 42 may decide to use the base layer motion data 46 as a predictor for the motion data 48. As a further alternative, the hybrid coder 42 may code the motion data 48 completely anew, i.e. independently of the base layer motion data.
- Similarly, the hybrid coder 42 may code the residual data 52 for the higher quality layer predictively as a prediction residual relative to the base layer residual data 50 serving as a predictor.
- However, the hybrid coder 42 may also use a reconstruction of the picture content of the base layer as a predictor for the picture content of the original video data, so that, in this case, the residual data relate to this prediction signal. As shown in FIG. 2, the reconstructed base layer picture information may be received from the base layer hybrid coder 38 or a dedicated reconstruction unit 54 coupled between the base-layer coding unit 40 and the higher quality layer hybrid coder 42.
- In the following, the internal structure and functionality of the hybrid coders 38 and 42 as well as of the layer coding unit 44 will be described in more detail. With regard to layer coding unit 40, it is assumed in the following that same merely generates the base layer data-stream 26. However, as indicated above, an alternative embodiment according to which the layer coding unit 40 also generates SNR refinement layer data-streams 32 is readily derivable from the following description with respect to the layer coding unit 44.
- Firstly, the internal structure and functionality of the base layer hybrid coder 38 is described. As shown in FIG. 3, the base layer hybrid coder 38 comprises an input 56 for receiving the lower quality video signal 24, an output 58 for the motion data 46, an output 60 for the residual data 50, an output 62 for coupling the motion data 46 to hybrid coder 42, an output 64 for coupling reconstructed base layer picture data to hybrid coder 42, and an output 66 for coupling residual data 50 to hybrid coder 42.
- Internally, hybrid coder 38 comprises a transformation unit 68, a back-transformation unit 70, a subtractor 72, an adder 74, and a motion prediction unit 76. The subtractor 72 and the transformation unit 68 are coupled, in the order mentioned, between the input 56 and the output 60. The subtractor 72 subtracts from the input video signal the motion-predicted video content received from the motion prediction unit 76 and forwards the difference signal to the transformation unit 68. The transformation unit 68 performs a block-wise transformation on the difference/residual signal along with, optionally, a quantization of the transform coefficients. The transformation result is output by the transformation unit 68 to output 60 as well as to an input of the back-transformation unit 70. The back-transformation unit 70 performs an inverse transformation on the transform blocks of transformation coefficients with, where applicable, a preceding dequantization. The result is a reconstructed residual signal which is, by adder 74, additively combined with the motion-predicted video content output by the motion prediction unit 76. The result of the addition performed by adder 74 is a reconstructed video in base quality. The output of adder 74 is coupled to an input of the motion prediction unit 76 as well as to output 64. The motion prediction unit 76 performs a motion-compensated prediction based on the reconstructed pictures in order to predict other pictures of the video input at input 56. The motion prediction unit 76 produces, while performing motion prediction, motion data including, for example, motion vectors and motion picture reference indices, and outputs this motion data to output 62 as well as output 58. The output of the transformation unit 68 is also coupled to the output 66 in order to forward the transform residual data to the hybrid coder 42 of the higher quality layer. As already mentioned above, the functionality of both hybrid coders 38 and 42 of FIG. 1 is similar to each other.
However, the hybrid coder 42 of the higher quality layer also uses inter-layer prediction. Thus, the structure of the hybrid coder 42 shown in FIG. 2 is similar to the structure of hybrid coder 38 shown in FIG. 3. In particular, the hybrid coder 42 comprises an input 86 for the original video signal 18, an output 88 for the motion data 48, an output 90 for the residual data 52, and three inputs 92, 94 and 96 coupled to the respective outputs 62, 64 and 66 of the base layer hybrid coder 38. Internally, hybrid coder 42 comprises two switches or selectors 98 and 100 for connecting one of two paths 102 and 104 between input 86 and output 90. In particular, path 104 comprises a subtractor 106, a transformation unit 108 and a residual predictive coder 110 connected, in the order mentioned, between input 86 and output 90 via switches 98 and 100. Subtractor 106 and transformation unit 108 form, along with a back-transformation unit 112, an adder 114 and a motion prediction unit 116, a prediction loop such as that formed by elements 68 to 76 in hybrid coder 38 of FIG. 3. Accordingly, at the output of the transformation unit 108, a transformed version of the motion-predicted residual results, which is input into residual predictive coder 110. The residual predictive coder 110 is also connected to the input 96 in order to receive the base layer residual data. By use of this base layer residual data as a predictor, the residual predictive coder 110 codes a part of the residual data output by transformation unit 108 as a prediction residual relative to the residual data at input 96. For example, the residual predictive coder 110 up-samples the base layer residual data and subtracts the up-sampled residual data from the residual data output by transformation unit 108. Of course, the residual predictive coder 110 may perform the prediction only for a part of the residual data output by transformation unit 108. Other parts pass residual predictive coder 110 unchanged. The granularity of these parts may be macroblocks.
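The up-sample-and-subtract step of residual predictive coder 110 can be sketched as follows; the nearest-neighbour filter and the list-of-lists block layout are assumptions made for illustration, not the coder's actual interpolation:

```python
def upsample2x(block):
    """Nearest-neighbour 2x up-sampling of a base layer residual block
    (an assumed, deliberately simplistic filter)."""
    out = []
    for row in block:
        wide = [v for v in row for _ in (0, 1)]  # repeat each sample horizontally
        out.append(wide)
        out.append(list(wide))                   # and each line vertically
    return out

def predict_residual(enh_residual, base_residual, use_prediction):
    """Sketch of inter-layer residual prediction: when enabled for a part
    (cf. the syntax element residual_prediction_flag), up-sample the base
    layer residual and keep only the difference; otherwise pass the
    enhancement layer residual through unchanged."""
    if not use_prediction:
        return enh_residual
    up = upsample2x(base_residual)
    return [[e - u for e, u in zip(e_row, u_row)]
            for e_row, u_row in zip(enh_residual, up)]
```

When the base layer predicts the enhancement residual well, the differences approach zero and become cheap to code, which is the point of the macroblock-wise switch.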
In other words, the decision as to whether the residual data at input 96 is used as a predictor or not may be conducted on a macroblock basis, and the result of the decision may be indicated by a respective syntax element residual_prediction_flag. - Similarly, the
hybrid coder 42 comprises a motion parameter predictive coder 118 in order to receive the motion data at input 92 from the base layer as well as the motion information obtained from motion prediction unit 116, and switches, on a macroblock basis, between passing the motion data from motion prediction unit 116 unchanged to output 88, or predictively coding the motion data by use of the motion information from the base layer at input 92 as a predictor. For example, motion parameter predictive coder 118 may code motion vectors from motion prediction unit 116 as offset vectors relative to motion vectors contained in the base layer motion data at input 92. Alternatively, motion parameter predictive coder 118 passes the base layer motion information from input 92 to motion prediction unit 116 to be used for the motion prediction in the higher quality layer. In this case, no motion data has to be transmitted for the respective portion of the higher quality layer video signal. As a further alternative, the motion parameter predictive coder 118 ignores the existence of the motion data at input 92 and codes the motion data from the motion prediction unit 116 directly to output 88. The decision among these possibilities is coded into the resulting quality scalability bit-stream. - Finally, the
predictive coder 120 is provided in path 102 and coupled with input 94. Predictive coder 120 predicts portions of the higher quality layer based on respective portions of the reconstructed base layer video signal, so that at the output of predictive coder 120 merely the respective residual or difference is forwarded. Predictive coder 120 also operates on a macroblock-wise basis in cooperation with switches 98 and 100. - As may be seen from
FIG. 4, the layer coding unit 44 of the higher quality layer comprises an input 122 for receiving the transform coefficients of the residual data from output 90 and an input 124 for receiving the motion data from output 88. A distributing unit 126 receives the transformation coefficients and distributes them to several enhancement layers. The transformation coefficients thus distributed are output to a formation unit 128. Along with the distributed transformation coefficients, the formation unit 128 receives the motion data from input 124. The formation unit 128 combines both data and forms, based on these data inputs, the zero-order enhancement layer data stream 28 as well as the refinement-layer data streams 30. - In order to enable a more detailed description of the functionality of the distributing
unit 126 and the formation unit 128, the block basis underlying the transformation performed by the transformation unit 108 and its interrelationship to the distribution performed by the distributing unit 126 will be described in the following in more detail with respect to FIG. 5. FIG. 5 represents a picture 140. Picture 140 is, for example, part of the high quality video data 18 (FIG. 1). Within picture 140, the pixels are, for example, arranged in lines and columns. The picture 140 is, for example, partitioned into macroblocks 142, which may also be arranged regularly in lines and columns. Each macroblock 142 may, for example, spatially cover a rectangular picture area in order to comprise, for example, 16×16 samples of, for example, the luma component of the picture. To be even more precise, the macroblocks 142 may be organized in pairs of macroblocks. In particular, two vertically adjacent macroblocks 142 may form such a pair of macroblocks and may assume, spatially, a macroblock pair region 144 of picture 140. On a macroblock pair basis, hybrid coder 42 (FIG. 1) may handle the macroblocks 142 within the respective region 144 in field mode or frame mode. In case of field mode, the video 18 is assumed to contain two interleaved fields, a top and a bottom field, where the top field contains the even numbered rows of pixels, and the bottom field contains the odd numbered rows starting with the second line of the picture 140. In this case, the top macroblock of region 144 relates to the pixel values of the top field lines within region 144, whereas the bottom macroblock of region 144 relates to the content of the remaining lines. Thus, in this case, both macroblocks spatially assume substantially the whole area of region 144 with a reduced vertical resolution. In case of frame mode, the top macroblock is defined to spatially encompass the upper half of the rows within region 144, whereas the bottom macroblock comprises the remaining picture samples in region 144.
- As already noted above, the
transformation unit 108 performs a block-wise transformation of the residual signal output by subtractor 106. In this regard, the block basis for the transformation within transformation unit 108 may differ from the macroblock size of the macroblocks 142. In particular, each of the macroblocks 142 may be partitioned into four, i.e. 2×2, transform blocks 146 of 8×8 samples each, or into sixteen, i.e. 4×4, transform blocks 148 of 4×4 samples each. The transformation unit 108 would then transform the macroblocks 142 of picture 140 block-wise in blocks of 4×4 pixel samples or 8×8 pixel samples. Thus, the transformation unit 108 outputs, for a certain macroblock 142, several transform blocks 146 and 148, respectively, of transformation coefficients. - At 150 in
FIG. 5, an instance of an 8×8 transform coefficient block of a frame-coded macroblock is illustrated. In particular, at 150, each transform coefficient is assigned to and represented by a scan position number, these numbers ranging from 0 to 63. As illustrated by the axes 152, the respective transformation coefficients are associated with different spatial frequency components. In particular, the frequency associated with a respective one of the transform coefficients increases in magnitude from the upper left corner to the bottom right-hand corner of the transform block 150. The scan order defined by the scan positions among the transform coefficients of transform block 150 scans the transform coefficients from the upper left-hand corner in a zig-zag manner to the lower right-hand corner, this zig-zag scan being illustrated by arrows 154. - For sake of completeness only, it is noted that the scan may be defined differently among the transform coefficients of a transform block of a field-coded macroblock. For example, as is shown at 156 in
FIG. 5, in case of a field-coded macroblock, the transform coefficient scan 158 scans the transform coefficients from the upper left-hand corner to the lower right-hand corner in a zig-zag manner with a reciprocating or zig-zag direction which is steeper than the 45° zig-zag direction used in case of the frame-coded macroblock at 150. In particular, the coefficient scan 158 scans the transform coefficients in column direction twice as fast as in line direction in order to take into account the fact that field-coded macroblocks encompass picture samples having a column pitch twice the horizontal or line pitch. Thus, as is the case with the coefficient scan 154, coefficient scan 158 scans the transform coefficients in a way such that the associated frequency increases as the scan position number increases. - At 150 and 156, examples of coefficient scans for 8×8 transform coefficient blocks are shown. However, as already noted above, transform blocks of smaller size, i.e. of 4×4 transform coefficients, may also exist. For these cases, respective position scans are shown in
FIG. 5 at 160 and 162, respectively, with the scan 164 in case of 160 being dedicated to frame-coded macroblocks, whereas the scan 166 illustrated at 162 is dedicated to field-coded macroblocks. - It is to be emphasized that the specific examples shown in
FIG. 5 with respect to the sizes and arrangements of the macroblocks and transform blocks are of an illustrative nature only, and that different variations are readily applicable. Before starting with the description of the subsequent figures, it is noted that the picture 140 may be subdivided, on a macroblock basis, into several slices 168. One such slice 168 is exemplarily shown in FIG. 5. A slice 168 is a sequence of macroblocks 142. The picture 140 may be split into one or several slices 168. - After having described the subdivision of a picture into macroblock pair regions, macroblocks, transform blocks and slices, respectively, the functionality of the distributing
unit 126 and the formation unit 128 is described in the following in more detail. As may be seen from FIG. 5, the scan order defined among the transform coefficients enables the two-dimensionally arranged transform coefficients to be ordered into a linear sequence of transform coefficients with monotonically increasing frequency content. The distributing unit 126 operates to distribute the transform coefficients of the several macroblocks 142 to different quality layers, i.e. to the zero order layer associated with data stream 28 and to the refinement layers 30. In particular, the distributing unit 126 tries to distribute the transform coefficients to the data streams 28 and 30 in such a way that, with an increasing number of contributing layers from the zero order layer 28 to the highest quality refinement layer 30, the SNR quality of the video reconstructable from the respective data streams increases. In general, this will lead to a distribution where the lower frequency transform coefficients corresponding to lower scan positions are distributed to lower quality layers, whereas higher frequency transform coefficients are distributed to higher quality layers. On the other hand, distributing unit 126 will tend to distribute transform coefficients with higher transform coefficient values to lower quality layers and transform coefficients with lower transform coefficient values or energies to higher quality layers. The distribution performed by distributing unit 126 may be such that each of the transform coefficients is distributed to one single layer. However, it is also possible that the amount of a transform coefficient is distributed to different quality layers in parts, such that the distributed parts sum up to the transform coefficient value.
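The frame-mode zig-zag scan 154, which linearizes an n×n coefficient block so that spatial frequency grows with the scan position, can be sketched generically as follows (a construction for illustration, not a reproduction of the standard's scan tables):

```python
def zigzag_scan(n=8):
    """Return the (row, col) positions of an n x n transform block in
    zig-zag scan order: anti-diagonals of increasing index r + c, with
    the traversal direction alternating from diagonal to diagonal."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    return sorted(
        positions,
        key=lambda rc: (rc[0] + rc[1],                # anti-diagonal = frequency band
                        rc[0] if (rc[0] + rc[1]) % 2  # odd diagonals: top to bottom
                        else -rc[0]))                 # even diagonals: bottom to top
```

Scan position 0 is then the DC coefficient at (0, 0). The steeper field scan 158 would need a different sort key weighting the column direction and is not reproduced here.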
Details of the different possibilities for the distribution performed by distributing unit 126 will be described in the following with respect to FIGS. 6 a-g. The formation unit 128 uses the distribution resulting from distributing unit 126 in order to form the respective sub-data streams 28 and 30, with the zero-order sub-data stream 28 comprising the motion data from input 124. This zero-order sub-data stream 28 may also be provided with a first distributed portion of the transform coefficient values. Thus, sub-data stream 28 allows for a refinement of the base-quality layer data stream 26 to a higher quality—in the instance of FIG. 1 to a higher spatial quality—but a further SNR quality enhancement may be obtained by accompanying the sub-data stream 28 with any of the further higher quality refinement sub-data streams 30. The number of these refinement quality sub-data streams 30 is N, where N may be one or more than one. The transform coefficients are thereby—for example, in the order of increasing importance for the SNR quality—“distributed” to these sub-data streams 28 and 30.
-
FIG. 6 a shows an example for a distribution of the first 26 transform coefficient values of an 8×8 transform block. In particular, FIG. 6 a shows a table where the first line lists the respective scan positions according to the scan order 154 (FIG. 5). It can be seen that the scan positions shown extend, exemplarily, from 0 to 25. The following three lines show the corresponding contribution values incorporated into the respective sub-data streams 28 and 30: the second line belongs to sub-data stream 28, whereas the penultimate line belongs to the next higher refinement layer 30 and the last line refers to the quality layer refinement data stream after that. According to the example of FIG. 6 a, a “122” is coded into the sub-data stream 28 for the DC component, i.e. the transform coefficient value belonging to scan position 0. The contribution values for this transform coefficient having scan position 0 within the following two sub-data streams 30 are set to zero, as indicated by the hashing of the respective table entries. In this way, according to the example of FIG. 6 a, the zero order enhancement layer sub-data stream 28 comprises a distribution value for each of the transform coefficient values. However, within the transform block of FIG. 6 a, merely the transform coefficient values of scan positions 0 to 6, 8 and 9 belong to the zero order quality layer; the further contribution values are set to zero. It is to be emphasized that, in other transform blocks, the transform coefficient values belonging to the zero order quality layer may belong to other scan positions. Similarly, the transform coefficient values of further scan positions belong to the subsequent quality layers. A transform coefficient value distributed to a quality layer may itself be zero; in the example of FIG. 6 a, this is the case for scan position 23: the corresponding contribution values within the preceding quality layers are set to zero, and the transform coefficient value coded for scan position 23 in the last quality layer (last line) is zero itself.
- Thus, for each of the scan positions, the contribution values included in the various quality layer sub-bit streams 28 and 30 sum up to the actual transform coefficient value, so that, at the decoder side, the actual transform block may be reconstructed by summing up the contribution values for the individual scan positions over the different quality layers.
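A toy round trip of this layer splitting and decoder-side summation might look as follows; the per-position layer assignment is an invented example, not the quality-driven choice the distributing unit 126 would actually make:

```python
def distribute(coeffs_in_scan_order, layer_of_pos, num_layers):
    """Assign each transform coefficient value to exactly one quality
    layer; every other layer carries a zero contribution at that scan
    position, so the per-position contributions still sum correctly."""
    layers = [[0] * len(coeffs_in_scan_order) for _ in range(num_layers)]
    for pos, (value, layer) in enumerate(zip(coeffs_in_scan_order, layer_of_pos)):
        layers[layer][pos] = value
    return layers

def reconstruct(layers):
    """Decoder side: sum the contribution values per scan position
    over all received quality layers."""
    return [sum(values) for values in zip(*layers)]
```

For coefficients [122, 71, 0, 6] assigned to layers [0, 0, 1, 2], reconstructing the distributed layers returns the original coefficient values again; the variant in which a coefficient's amount is split over several layers would simply put non-zero parts in more than one row.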
- According to the embodiment of
FIG. 6 a, each of the sub-data streams 28 and 30 thus carries contribution values. Alternatively, however, it is also possible that not even the zero order sub-data stream 28 contains any transform coefficient or contribution value. In the latter case, the last three lines of the table of FIG. 6 a could be seen as belonging to the first refinement layer sub-data streams 30, with the zero order sub-data stream 28 merely comprising the motion information from input 124. - Moreover, it is noted that the
contribution values of FIG. 6 a having been set to zero and actual transform coefficient values naturally being zero have been distinguished by use of hashed table entries merely for the sake of an easier understanding of the functionality of formation unit 128. However, the sub-data streams 28 and 30 need not distinguish therebetween: the sum over the lines of FIG. 6 a reveals the transform coefficient value independently of whether individual contribution values in the sum were set to zero or are naturally zero. - In the embodiment of
FIG. 6 a, the formation unit 128 codes, into a respective one of the sub-data streams 28 and 30, a contribution value for each scan position, including the contribution values set to zero. According to the embodiment of FIG. 6 b, for example, the consecutive quality layer sub-data streams instead comprise merely those transform coefficient values belonging to the respective quality layer. - The order in which the contribution values and transform coefficient values are coded into the
sub-data streams 28 and 30 may be as illustrated in FIG. 6 a and FIG. 6 b, respectively. For example, the sub-data streams 28 and 30 may be divided up into packets, one per slice 168. Within one slice 168, the transform coefficient values may be coded into the respective packets macroblock-wise. That is, a scan order may be defined among the macroblocks 142 within a slice 168, with the transform coefficient values for a predetermined macroblock 142 being completely coded into the respective packet before the first transform coefficient value of the next macroblock in the macroblock scan order. Within each macroblock, a scan order may in turn be defined among the respective transform blocks 146 and 148, respectively, within the respective macroblock. Again, the transform coefficient values may be coded into a respective one of the sub-data streams 28 and 30 by the formation unit 128 such that the transform coefficient values of a respective one of the transform blocks are all coded into the respective sub-data stream before the first transform coefficient value of the next transform block is coded into the same. Within each transform block, the coding of the transform coefficient values and contribution values, respectively, may be conducted in the way explained in the following with respect to FIG. 8 or 9. - According to the embodiments of
FIGS. 6 a and 6 b, the transform coefficient values of the different transform blocks of the slice 168 belonging to a respective one of the quality layers extend over different portions of the scan order. To be more precise, although in the specific transform block exemplarily shown in FIGS. 6 a and 6 b scan positions 0 to 6, 8 and 9 belong to the zero order quality layer, in another transform block the set of scan positions belonging to this layer may be different. According to the embodiment of FIG. 6 c, however, the distributing unit 126 distributes the transform coefficient values of the different transform blocks within a slice 168 such that, for all transform blocks, the transform coefficient values of the same set of scan positions belong to the same quality layer. For example, in FIG. 6 c the transform coefficient values of the scan positions from 0 to 11 belong to the zero order sub-data stream 28, with this being true for all transform blocks within slice 168. - According to the embodiment of
FIG. 6 c, in addition, the transform coefficient values belonging to a specific one of the quality layers extend over a continuous sequence of consecutive scan positions. This, however, need not be the case. In particular, transform coefficient values belonging to a scan position between the first and the last scan position of a specific quality layer may belong to one of the other quality layers, such as shown in FIG. 6 b. However, in case of the embodiment of FIG. 6 c, it is possible to indicate, by respective syntax elements, the scan positions incorporated into any one of the quality layer sub-data streams 28 and 30. - The reservation of a specific set of scan positions for a respective one of the quality layers on the one hand and the quality-importance dependent distribution of the transform coefficients to the individual quality layers on the other hand may be mixed, as shown in the following embodiments. For example,
FIG. 6 d shows an embodiment where the distributing unit 126 has distributed the transform coefficients over the quality layers as was shown with respect to FIG. 6 a. This distribution differs from transform block to transform block. On the other hand, however, each of the quality layers is assigned a specific portion of the scan positions in common for all transform blocks. For example, the lowest quality layer is assigned the full set of scan positions from scan position 0 to scan position 63. Thus, for each transform block, the lowest quality layer comprises 64 contribution values. The next higher quality layer sub-data stream comprises contribution or transform coefficient values for all transform blocks in a specific scan position range which extends from scan position 6 to 63. The scan position range of the next quality layer extends from scan position 13 to 63. Again, the decoder does not need to know whether a specific one of the contribution values is a contribution value that has been set to 0 (hashed entry) or actually indicates a 0 or insignificant transform coefficient value. However, the decoder needs to know the syntax element scan_idx_start that indicates, for the respective slice 168, from which scan position on the transform coefficient or contribution values contained in the respective sub-data stream apply. To be more precise, in the embodiment of FIG. 6 d, for example, the sub-data stream corresponding to the penultimate line comprises, for an individual transform block, 58 transform coefficient or contribution values. The first one, in case of the transform block of FIG. 6 d, is 0, while the second one is 22. By use of the syntax element scan_idx_start, it is known at the decoder side that the first transform coefficient value of the respective quality layer corresponds to scan position 6, while the remaining transform coefficient values of this quality layer refer to the following scan positions.
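Assuming the contribution values of a transform block are already arranged in scan order, the effect of scan_idx_start (and of a last-position bound in the scan_idx_end variants) can be sketched as:

```python
def layer_values(contributions_in_scan_order, scan_idx_start=0, scan_idx_end=None):
    """Contribution values a quality layer sub-data stream carries for one
    transform block: the scan positions from scan_idx_start up to and
    including scan_idx_end (defaulting to the last scan position)."""
    if scan_idx_end is None:
        scan_idx_end = len(contributions_in_scan_order) - 1
    return contributions_in_scan_order[scan_idx_start:scan_idx_end + 1]
```

For an 8×8 block and scan_idx_start = 6 this yields exactly the 58 values mentioned above; bounding the range at scan position 21 as well yields 16 values.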
Similar to the embodiment of FIG. 6 d, FIG. 6 e shows an embodiment where a syntax element scan_idx_end indicates, for the individual sub-data streams, the last scan position up to which the respective quality layer sub-data stream comprises transform coefficient or contribution values. - A combination of the embodiments of
FIGS. 6 d and 6 e is shown in FIG. 6 f. According to this embodiment, the respective set of scan positions belonging to a specific one of the quality layers extends from a first scan position indicated by the syntax element scan_idx_start to a last scan position indicated by the syntax element scan_idx_end. For example, in the quality layer corresponding to the penultimate line, the respective set of scan positions extends from scan position 6 to scan position 21. Finally, the embodiment of FIG. 6 g shows that the use of the syntax elements scan_idx_start and/or scan_idx_end may be combined with the focus of the embodiment of FIG. 6 c, according to which the distribution of the individual transformation coefficient values of the different transform blocks within a slice 168 is common for all transform blocks. Accordingly, in the embodiment of FIG. 6 g, within a specific one of the quality layers, all transform coefficient values within scan_idx_start to scan_idx_end are distributed to the respective quality layer. Therefore, differing from the embodiment of FIG. 6 f, in the embodiment of FIG. 6 g all the transform coefficient values within scan position 6 to scan position 21 are assigned to the quality layer corresponding to the penultimate line in FIG. 6 g. Differing therefrom, in the embodiment of FIG. 6 f, several ones of the contribution values within this scan position range from 6 to 21 may be set to 0, wherein the distribution between transform coefficient values having been set to 0 and transform coefficient values not having been set to 0 within this scan position range from 6 to 21 may be different from that of any one of the other transform blocks within the current slice. - In the following, the cooperation between
hybrid coder 42, layer coding unit 44, distributing unit 126 and formation unit 128 is described illustratively with respect to FIG. 7, which shows an example for the structure of the sub-data streams 28 and 30. As shown in FIG. 7, the formation unit 128 is designed such that the individual sub-data streams 28 and 30 are composed of packets. In particular, the formation unit 128 may be designed to generate a packet for each slice 168 within a picture 140 within each sub-bit stream 28 and 30. As shown in FIG. 7, a packet may comprise a slice header 170 on the one hand and residual data 172 on the other hand, except for sub-bit stream 28, which optionally comprises merely the slice header within each one of the packets. - With respect to the description of the
residual data 172, i.e. residual data #1, residual data #2, . . . , residual data #N, reference is made to the above description with respect to FIGS. 6 a to 6 g, where, for example, the second to fourth lines in these tables correspond to residual data #1, residual data #2 and residual data #3, respectively. In even other words, residual data 172 indicated in FIG. 7 includes the transform coefficient values discussed in FIGS. 6 a to 6 g, the distribution of which among the respective sub-data streams 28 and 30 has been described above. However, FIG. 7 shows further syntax elements contained in the slice header 170 and the residual data 172 which stem from hybrid coder 42. As described above, the hybrid coder 42 switches, on a macroblock basis, between several inter-layer prediction modes so as to rely on the motion information from the base layer, or to generate new motion information for a respective macroblock of the higher refinement layer while predictively coding the motion information as a residual to the motion information from the base layer, or while coding this motion information anew. Thus, as indicated in FIG. 7, the residual data 172 may comprise, for each macroblock, syntax elements indicating motion parameters, macroblock modes such as field or frame coded, or an inferring mode indicating the reuse of the motion parameters of the base layer for the respective macroblock. This is especially true for the zero order sub-data stream 28.
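One way to picture such a packet in code; all field names here are invented for illustration — FIG. 7 itself only fixes the split into a slice header 170 and residual data 172:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SubStreamPacket:
    """Sketch of one packet of a quality layer sub-data stream."""
    quality_layer: int                     # 0 for stream 28, 1..N for streams 30
    scan_idx_start: Optional[int] = None   # slice-header scan range, if signalled
    scan_idx_end: Optional[int] = None
    macroblock_info: List[dict] = field(default_factory=list)  # modes, motion data
    residual_data: List[int] = field(default_factory=list)     # contribution values
```

A zero order packet might then carry only macroblock_info, while refinement packets carry residual_data for their signalled scan position range.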
However, this motion information is not refined again in the following refinement layers, i.e. in the higher quality sub-data streams 30 1 to 30 N. Therefore, the formation unit 128 is designed either to omit these macroblock-wise syntax elements concerning macroblock modes, motion parameters and inferring mode indications from the residual data of these sub-data streams 30 1 to 30 N, or to set the syntax elements in these sub-data streams 30 1 to 30 N to be equal to the macroblock modes and motion parameters for the respective macroblock contained in sub-data stream 28, or to indicate the inferring mode for the respective macroblock in order to indicate that the same settings are to be used in the respective refinement layer. According to the embodiment, all the residual data 172 within the various sub-data streams 30 1 to 30 N thus rely on one and the same set of macroblock modes and motion parameters. - As also derivable from
FIG. 7, the formation unit 128 may be designed to provide the slice header 170 with the syntax element scan_idx_start and/or scan_idx_end. Alternatively, the slice header data 170 may comprise other syntax elements defining, for each individual slice or packet, the set of scan positions to which the residual data corresponding to the respective slice header data relate. As already indicated above, the slice header data of packets of the sub-data stream 28 may not comprise such syntax elements concerning the definition of layer specific scan positions in case the sub-data stream 28 does not comprise any residual data, but merely macroblock modes and/or motion parameters and inferring mode indications, respectively. Further, as already noted above, the slice header data 170 may comprise merely one of scan_idx_start and scan_idx_end. Finally, scan_idx_start and/or scan_idx_end may be provided once per transform block size category, i.e. 4×4 and 8×8, or just once for each slice/picture/sub-data stream commonly for all transform block size categories, with respective measures being taken to transfer scan_idx_start and scan_idx_end to other block sizes, as will be described in the following. - Further, the slice header data may comprise a syntax element indicating the quality level. To this end, the
formation unit 128 may be designed such that the syntax element or quality indicator merely distinguishes between the zero order quality level 28 on the one hand and the refinement layers 30 1 to 30 N on the other hand. Alternatively, the quality indicator may distinguish all quality layers among the layers 28 and 30 1 to 30 N. In both cases, the quality indicator enables the omission of any macroblock-wise defined macroblock modes, motion parameters and/or inferring modes within the packets of the sub-data streams 30 1 to 30 N, since in this case it is known at the decoder side that these refinement layer sub-data streams 30 1 to 30 N merely refine the transform coefficients while using the macroblock modes, motion parameters and inferring modes from the zero order sub-data stream 28. - Although not described in further detail above, the
formation unit 128 may be designed to entropy code the packets within the sub-data streams 28 and 30. FIGS. 8 and 9 show possible examples for coding the transform coefficients within the residual data pertaining to one transform block according to two embodiments. FIG. 8 shows a pseudo code of a first example for a possible coding of the transform coefficients of a transform block in any of the residual data 172. Imagine that the following example applies:
Scan position:                10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25
Coefficient number:            0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Transform coefficient level:   7   6  −2   0  −1   0   0   1   0   0   0   0   0   0   0   0
- Based on this example, in the following, the pseudo code of
FIG. 8 is explained, showing the way in which the formation unit 128 may code the residual data of one of the transform blocks into any of the sub-data streams.
FIG. 8 , firstly a parameter coeff_token is provided 240. The parameter coeff_token is a code word indicating the number of non-zero coefficients, i.e. total_coeff(coeff_token), and the number of transform coefficients in the run of transform coefficients having an absolute value equal to one at the end of the sequence of non-zero transform coefficients, i.e. trailing_ones(coeff_token). In our example, total_coeff(coeff_token) is 5 (transformcoefficient numbers coefficient number 4 and 7). Thus, by providing theparameter coeff_token 240, the positions of the significant transform coefficients have been determined to the extent that no more than total_coeff(coeff_token) non-zero transform coefficients exist. - Then, the values of the levels of these non-zero transform coefficients are provided. This is done in reverse scan order. To be more specific, firstly it is checked as to whether the total number of non-zero transform coefficients is greater than zero 242. This is the case in the above example, since total_coeff(coeff_token) is 5.
- Then, the non-transform coefficients are stepped through in a
reverse scan order 244. The reverse scan order is not yet obvious from just viewing the counting parameter incrementation i++ in the for-loop 244 but will become clear from the following evaluation. While stepping through these non-transform coefficients in reverse scan order, for the first of these non-zero transform coefficients, just their transform coefficient sign is provided 248. This is done for the first number of trailing_ones(coeff_token) of the non-zero transform coefficients when stepping through them in a reverse scan order, since for these transform coefficients it is already known that the absolute value of these transform coefficients is one (compare with the above definition of trailing_ones(coeff_token)). The coefficient signs thus provided are used to temporarily store in auxiliary vector coefficients level[i] for the transform coefficient level of the non-zero transform coefficient levels having absolute value of 1 wherein i is a numbering of the non-zero transform coefficients when scanned in reverse scan order (250). In our example, after the first two rounds of the for-loop 244, level[0]=1 and level[1]32 −1 is obtained. - Next, the coefficient levels coeff_level for the remaining non-zero transform coefficients are provided (252) in reverse scan order and temporarily stored in the auxiliary vector coefficients level [i] (254). The remaining for-loop rounds result in level[2]=−2, level[3]=6 and level[4]=7.
- Now, in order to render the determination of the positions of the significant transform coefficients unique, two further parameters called total_zeros and run_before are provided unless total_coeff(coeff_token) is already equal to the maximum number of transform coefficients in a transform block, i.e. is equal to maxNumCoeff. To be more specific, it is checked as to whether total_coeff(coeff_token) is equal to maxNumCoeff (256). If this is not the case, the parameter total_zeros is provided (258) and an auxiliary parameter zerosLeft is initialised to the value of total_zeros (260). The parameter total_zeros specifies the number of zeros between the last non-zero coefficient in scan order and the start of the scan. In the above example, total_zeros is 3 (coefficient numbers 3, 5 and 6). - For each of the non-zero transform coefficients except the last one with respect to the reverse scan order (coefficient number 0), beginning with the last non-zero transform coefficient (coefficient number 7) with respect to the scan order (62), a parameter run_before is provided (64) indicating the length of the run of zero-level transform coefficients arranged directly in front of the respective non-zero transform coefficient when seen in scan order. For example, for i being equal to zero, the last non-zero transform coefficient with respect to the scan order is the non-zero transform coefficient in question. In our example, this is the transform coefficient having the number 7 and having the level 1. The run of zeros in front of this transform coefficient has a length of 2, i.e. transform coefficients 5 and 6. - At the end, in the for-loop indicated at 276, the values of the transform coefficient levels as stored in the auxiliary vector level are assigned to their positions by copying the values of the coefficients of vector level to the respective positions in the one-dimensional array coeffLevel. To be more specific, in the first round of the for-loop 276, i=4 and coeffNum, which has been initialised to −1 (278), is incremented by run[4]+1=0+1=1, resulting in coeffNum=0 and coeffLevel[0] being assigned the value of level[4]=7. This is repeated for the next auxiliary vector coefficients level[3] to level[0]. Since the remaining positions of the array coeffLevel have been initialised to the value of zero (280), all transform coefficients have been coded. - The syntax elements written in bold in
FIG. 8 may be coded into the respective sub-data stream by means of variable length coding, for example. -
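The scatter step of the for-loop 276 described above can be mirrored as follows (an illustrative sketch; run[i] holds the decoded run_before values, with the last entry taking the remaining zerosLeft):

```python
def place_levels(level, run, max_num_coeff):
    """Scatter level[i] (reverse scan order) into the one-dimensional
    coeffLevel array using the runs of zeros preceding each non-zero
    coefficient, as in the for-loop 276."""
    coeff_level = [0] * max_num_coeff   # initialisation (280)
    coeff_num = -1                      # initialisation (278)
    for i in range(len(level) - 1, -1, -1):
        coeff_num += run[i] + 1
        coeff_level[coeff_num] = level[i]
    return coeff_level

level = [1, -1, -2, 6, 7]   # auxiliary vector of the example
run = [2, 1, 0, 0, 0]       # run_before values; run[4] = zerosLeft = 0
print(place_levels(level, run, 16))
# [7, 6, -2, 0, -1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
```

The output places the five significant levels at scan positions 0, 1, 2, 4 and 7 with zeros at positions 3, 5 and 6, matching the example.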
FIG. 9 shows another example for coding a transform block. In this example, the scanning order manifests itself in “i++” within the while-loop 310, indicating that the counting parameter i is incremented per while-loop iteration. - For each coefficient in scanning order, a one-bit symbol significant_coeff_flag is provided (312). If the significant_coeff_flag symbol is 1 (314), i.e., if a non-zero coefficient exists at this scanning position i, a further one-bit symbol last_significant_coeff_flag is provided (316). This symbol indicates if a current significant coefficient is the last one inside the block or if further significant coefficients follow in scan order. Thus, if the last_significant_coeff_flag symbol is one (318), this indicates that the number of coefficients, i.e. numCoeff, is i+1 (320) and the levels of the subsequent transform coefficients can be deduced to be zero (322). In so far, the syntax elements last_significant_coeff_flag and significant_coeff_flag may be seen as a significance map. Then, for the last transform coefficient in scanning order, the absolute value of the level minus 1, i.e. coeff_abs_level_minus1, and its sign, i.e. coeff_sign_flag, is provided (324), thereby indicating the transform coefficient level of this last significant transform coefficient (326). These steps are then repeated for the remaining significant transform coefficients in reverse scan order. - For sake of completeness, it is noted that it became clear from
FIG. 5 that the number of distinguishable scan positions within the 8×8 transform blocks is 64, whereas the number of distinguishable scan positions within the 4×4 transform blocks is merely 16. Accordingly, the above-mentioned syntax elements scan_idx_start and scan_idx_end may either be defined with an accuracy enabling a distinction between all 64 scan positions, or merely a distinction between 16 scan positions. In the latter case, for example, the syntax elements may be applied to each quadruple of consecutive transform coefficients within the 8×8 transform blocks. To be more precise, 8×8 transform blocks may be coded by use of -
residual_block(LumaLevel8×8, 4*scan_idx_start, 4*scan_idx_end+3, 64) - and in case of 4×4 transform blocks by use of
-
residual_block(LumaLevel4×4, scan_idx_start, scan_idx_end, 16), - with residual_block being either residual_block_cavlc or residual_block_cabac, and LumaLevel4×4 and LumaLevel8×8 indicating the array of luma samples of the respective 4×4 or 8×8 transform block. As can be seen, scan_idx_start and scan_idx_end are defined to discriminate between 16 scan positions, so that they indicate the range of positions in 4×4 blocks exactly. However, in 8×8 blocks, the accuracy of these syntax elements is not sufficient, so that in these blocks the range is adjusted quadruple-wise.
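The two residual_block() calls above amount to the following index mapping (a sketch with illustrative names):

```python
def coefficient_range(block_size, scan_idx_start, scan_idx_end):
    """Map the slice-level scan indices, which distinguish 16 positions,
    to the coefficient range actually coded for a transform block."""
    if block_size == 8:
        # 8x8 blocks: each index addresses a quadruple of coefficients
        return 4 * scan_idx_start, 4 * scan_idx_end + 3, 64
    # 4x4 blocks: the indices address the coefficients exactly
    return scan_idx_start, scan_idx_end, 16

print(coefficient_range(8, 2, 5))   # (8, 23, 64)
print(coefficient_range(4, 2, 5))   # (2, 5, 16)
```

For the full range, scan_idx_start=0 and scan_idx_end=15 cover positions 0 to 63 of an 8×8 block.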
- Furthermore, 8×8 blocks of transform coefficients can also be encoded by partitioning the 64 coefficients of an 8×8 block into 4 sets of 16 coefficients, for example by placing every fourth coefficient into the n-th set starting with coefficient n with n in the range of 0 to 3, inclusive, and coding each set of 16 coefficients using the residual block syntax for 4×4 blocks. At the decoder side, these 4 sets of 16 coefficients are re-combined to form a set of 64 coefficients representing an 8×8 block.
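The interleaved partitioning described in this paragraph can be sketched as follows (illustrative helper names):

```python
def split_8x8(coeffs64):
    """Place every fourth coefficient into the n-th set, starting with
    coefficient n (n = 0..3); each set then fits the 4x4 residual
    block syntax."""
    return [coeffs64[n::4] for n in range(4)]

def merge_8x8(sets):
    """Decoder-side recombination of the 4 sets of 16 coefficients
    into the 64-coefficient representation of an 8x8 block."""
    coeffs64 = [0] * 64
    for n, subset in enumerate(sets):
        coeffs64[n::4] = subset
    return coeffs64

block = list(range(64))
sets = split_8x8(block)
assert all(len(s) == 16 for s in sets)
assert merge_8x8(sets) == block       # lossless round trip
```

Set n thus contains coefficients n, n+4, n+8, …, n+60 in scan order.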
- After having described embodiments for an encoder, a decoder for decoding the respective quality scalable data stream is explained with respect to
FIGS. 10 and 11. FIG. 10 shows the general construction of a decoder 400. The decoder 400 comprises a demultiplexer 402 having an input 404 for receiving the scalable bit-stream 36. The demultiplexer 402 demultiplexes the input signal 36 into the data streams 26 to 32. To this end, the demultiplexer may perform a decoding and/or parsing function. For example, the demultiplexer 402 may decode the transform block codings of FIGS. 8 and 9. Further, recall FIGS. 6a-6g. Accordingly, demultiplexer 402 may use information of preceding sub-data streams in order to, in parsing a current sub-data stream, know how many transform coefficient values or contribution values are to be expected for a specific transform block. The data streams thus retrieved are received by a decoding unit 406 which, based on these data streams, reconstructs the video 18 and outputs the respective reconstructed video 408 at a respective output 410. - The internal structure of the
decoding unit 406 is shown in more detail in FIG. 11. As shown therein, the decoding unit 406 comprises a base layer motion data input 412, a base layer residual data input 414, a zero order refinement layer motion data input 416, an optional zero order refinement transform coefficient data input 418 and an input 420 for the sub-data streams 30. As shown, inputs 412 and 414 receive their information from data stream 26, whereas inputs 416 and 418 receive theirs from data stream 28. Besides this, the decoding unit 406 comprises a lower quality reconstruction video signal output 422, a higher quality interlayer coded reconstruction video signal output 424, and an internally coded reconstruction video signal output 426, the latter ones providing the information for a higher quality video signal. - A
combiner 428 has inputs connected to inputs 418 and 420 in order to collect the information concerning the transform coefficient values conveyed in the respective sub-data streams. The combiner 428 presets all transform coefficient values to zero and replaces any of these zeros merely in case of a contribution value being unequal to zero for the respective scan position. By this measure, the combiner collects information on the transform coefficients of the various transform blocks. The association of the contribution or transform coefficient values within the individual layers may involve the combiner using the scan position information of the current layer, such as scan_idx_start and/or scan_idx_end. - Alternatively, the combiner may use the knowledge of the transform coefficient values within the individual transform blocks received so far from lower quality or SNR layers.
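The combiner's collection step can be sketched as follows (an illustrative model assuming, as in the claims below, that contributions of different quality layers to the same scan position are summed; all names are hypothetical):

```python
def combine_layers(layer_contributions, max_num_coeff):
    """Preset all coefficients of a transform block to zero and
    accumulate the contribution values delivered by the quality
    layers for their scan-position sub-sets."""
    coeffs = [0] * max_num_coeff
    for contributions in layer_contributions:      # in layer order
        for pos, value in contributions.items():   # scan position -> value
            coeffs[pos] += value
    return coeffs

# Two refinement layers sharing scan position 2, whose value -2 is
# split into two contributions of -1:
layer0 = {0: 7, 1: 6, 2: -1}
layer1 = {2: -1, 4: -1, 7: 1}
print(combine_layers([layer0, layer1], 16))
# [7, 6, -2, 0, -1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
```

Scan positions addressed by no layer simply keep their preset value of zero.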
- The transform blocks output by
combiner 428 are received by a residual predictive decoder 430 and an adder 432. - Between the residual
predictive decoder 430 and the input 414, a back or inverse transformation unit 432 is connected in order to forward inversely transformed residual data to the residual predictive decoder 430. The latter uses the inversely transformed residual data in order to obtain a predictor to be added to the transform coefficients of the transform blocks output by combiner 428, possibly after performing an up-sampling or another quality adaptation. On the other hand, a motion prediction unit 434 is connected between the input 412 and an input of an adder 436. Another input of the adder 436 is connected to the output of the back-transformation unit 432. By this measure, the motion prediction unit 434 uses the motion data on input 412 to generate a prediction signal for the inversely transformed residual signal output by the back-transformation unit 432. The result at the output of adder 436 is a reconstructed base layer video signal. The output of adder 436 is connected to the output 422 as well as to an input of the residual predictive decoder 430. The residual predictive decoder 430 uses the reconstructed base layer signal as a prediction for the intra layer coded portions of the video content output by combiner 428, possibly by use of an up-sampling. On the other hand, the output of adder 436 is also connected to an input of the motion prediction unit 434 in order to enable the motion prediction unit 434 to use the motion data at input 412 to generate a prediction signal for the second input of adder 436 based on the reconstructed signals from the base layer data stream. The predictively decoded transform coefficient values output by the residual predictive decoder 430 are back-transformed by a back-transformation unit 438. At the output of the back-transformation unit 438, higher quality residual video signal data results. This higher quality residual video signal is added by an adder 440 to a motion predicted video signal output by a motion prediction unit 442.
At the output of adder 440, the reconstructed high quality video signal results, which reaches output 424 as well as a further input of motion prediction unit 442. The motion prediction unit 442 performs the motion prediction based on the reconstructed video signal output by adder 440 as well as the motion information output by a motion parameter prediction decoder 444 which is connected between input 416 and a respective input of motion prediction unit 442. The motion parameter predictive decoder 444 uses, on a macroblock-selective basis, motion data from the base layer motion data input 412 as a predictor and, dependent on this data, outputs the motion data to the motion prediction unit 442, using, for example, the motion vectors at input 416 as offset vectors to the motion vectors at input 412. - The above described embodiments enable an increase in the granularity of SNR scalable coding on a picture/slice level in comparison to CGS/MGS coding as described in the introductory portion, but without the significant increase in complexity that is present in FGS coding. Furthermore, since it is believed that the FGS feature of truncatable packets will not be widely used, bit-stream adaptation is possible by simple packet dropping.
- The above described embodiments share the basic idea of partitioning the transform coefficient levels of a traditional CGS/MGS packet, as it is currently specified in the SVC draft, into subsets, which are transmitted in different packets and different SNR refinement layers. As an example, the above described embodiments concerned CGS/MGS coding with one base and one enhancement layer. Instead of the enhancement layer including, for each picture, macroblock modes, intra prediction modes, motion vectors, reference picture indices, other control parameters as well as transform coefficient levels for all macroblocks, these data were distributed over different slices, different packets, and different enhancement layers in order to increase the granularity of the SNR scalable coding. In the first enhancement layer, the macroblock modes, motion parameters, other control parameters as well as, optionally, a first subset of transform coefficient levels are transmitted. In the next enhancement layer, the same macroblock modes and motion vectors are used, but a second subset of transform coefficient levels is encoded. All transform coefficients that have already been transmitted in the first enhancement layer may be set to zero in the second and all following enhancement layers. In all following enhancement layers (third, etc.), the macroblock modes and motion parameters of the first enhancement layer are again used, but further subsets of transform coefficient levels are encoded.
- It should be noted that this partitioning does not increase, or only very slightly increases, the complexity in comparison to the traditional CGS/MGS coding as specified in the current SVC draft. All SNR enhancements can be parsed in parallel, and the transform coefficients do not need to be collected from different scans over the picture/slice. That means, for example, that a decoder can parse all transform coefficients for a block from all SNR enhancements, and then apply the inverse transform for this block without storing the transform coefficient levels in a temporary buffer. When all blocks of a macroblock have been completely parsed, the motion compensated prediction can be applied and the final reconstruction signal for this macroblock can be obtained. It should be noted that all syntax elements in a slice are transmitted macroblock by macroblock, and inside a macroblock, the transform coefficient values are transmitted transform block by transform block.
- It is possible that a flag is encoded at the slice level, which signals whether all macroblock modes and motion parameters are inferred from the base layer. Given the current syntax of CGS/MGS packets, that means especially that all syntax elements mb_skip_run and mb_skip_flag are not transmitted but inferred to be equal to 0, that all syntax elements mb_field_decoding_flag are not transmitted but inferred to be equal to their values in the co-located base layer macroblocks, and that all syntax elements base_mode_flag and residual_prediction_flag are not transmitted but inferred to be equal to 1. In the first SNR enhancement layer this flag should usually be set to 0, since for this enhancement layer it should be possible to transmit motion vectors that are different from the base layer in order to improve the coding efficiency. But in all further enhancement layers, this flag is set equal to 1, since these enhancement layers only represent a refinement of transform coefficient levels of scan positions that have not been encoded in the previous SNR enhancement layers. By setting this flag equal to 1, the coding efficiency can be improved for this case, since no transmission of non-required syntax elements is necessary and the associated bit-rate is thus saved.
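The slice-level inference rule just described can be sketched as follows (a hypothetical helper; the element names follow the syntax elements listed above):

```python
def infer_mb_syntax(all_inferred_flag, base_mb):
    """When the slice-level flag is set, the listed macroblock syntax
    elements are not read from the bit-stream but inferred."""
    if not all_inferred_flag:
        return None   # read the syntax elements from the bit-stream instead
    return {
        "mb_skip_run": 0,
        "mb_skip_flag": 0,
        # taken over from the co-located base layer macroblock:
        "mb_field_decoding_flag": base_mb["mb_field_decoding_flag"],
        "base_mode_flag": 1,
        "residual_prediction_flag": 1,
    }

inferred = infer_mb_syntax(True, {"mb_field_decoding_flag": 1})
assert inferred["base_mode_flag"] == 1
assert inferred["mb_field_decoding_flag"] == 1
```

With the flag set to 0, as in the first SNR enhancement layer, the elements are parsed from the bit-stream as usual.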
- As further described above, the first scanning position x for the transform coefficient levels in the various transform blocks may be transmitted at a slice level, with no syntax elements being transmitted at a macroblock level for transform coefficients with a scanning position that is smaller than x. In addition to the above description where the first scanning position is transmitted only for a specific transform size and the first scanning position for other transform sizes is inferred based on the transmitted value, it would be possible to transmit a first scanning position for all supported transform sizes.
- Similarly, the last scanning position y for the transform coefficient levels in the various transform blocks may be transmitted at a slice level, with no syntax elements being transmitted at a macroblock level for transform coefficients with a scanning position that is greater than y. Again, it is possible to either transmit a last scanning position for all supported transform sizes, or to transmit the last scanning position only for a specific transform size and to infer the last scanning position for other transform sizes based on the transmitted value.
- The first scanning position for each transform block in an SNR enhancement layer may alternatively be inferred based on the transform coefficients that have been transmitted in a previous enhancement layer. This inference rule may be applied independently to all transform blocks, and in each block a different first transform coefficient can be derived by, for example,
combiner 428. - Further, a combination of signaling and inferring the first scanning position may be used. That means that the first scanning position may basically be inferred based on already transmitted transform coefficient levels in previous SNR enhancement layers, but with the additional knowledge that the first scanning position cannot be smaller than a value x, which is transmitted in the slice header. With this concept it is again possible to have a different first scanning index in each transform block, which can be chosen in order to maximize the coding efficiency.
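The combined inference/signalling rule of the last two paragraphs can be sketched per transform block as follows (illustrative names; covered_positions is assumed to be the set of scan positions already coded for this block in previous SNR layers):

```python
def first_scan_index(covered_positions, x_from_slice_header):
    """Infer the first scan position of the current layer as the
    position following the last one already transmitted, but never
    smaller than the slice-header value x."""
    inferred = (max(covered_positions) + 1) if covered_positions else 0
    return max(inferred, x_from_slice_header)

print(first_scan_index({0, 1, 2}, 1))   # 3: the inference dominates
print(first_scan_index(set(), 2))       # 2: the slice-header bound dominates
```

Because the inference part depends on the block's own history, each transform block may end up with a different first scanning index, as stated above.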
- As an even further alternative, the signaling of the first scan position, the inference of the first scan position, or the combination of them may be combined with the signaling of the last scanning position.
- In so far, the above description enables a possible scheme allowing for SNR scalability in which only subsets of transform coefficient levels are transmitted in different SNR enhancement layers, and this mode is signaled by one or more slice header syntax elements, which specify that macroblock modes and motion parameters are inferred for all macroblock types and/or that transform coefficients for several scanning positions are not present at a transform block level. A slice level syntax element may be used that signals that the macroblock modes and motion parameters for all macroblocks are inferred from the co-located base layer macroblocks. Specifically, the same macroblock modes and motion parameters may be used, and the corresponding syntax elements may not be transmitted at a slice level. The first scanning position x for all transform blocks may be signaled by slice header syntax elements. At the macroblock level, no syntax elements are transmitted for transform coefficient values of scanning positions smaller than x. Alternatively, the first scanning position for a transform block may be inferred based on the transmitted transform coefficient levels of the base layer. A combination of the latter alternatives is also possible. Similarly, the last scanning position y for all transform blocks may be signaled by slice header syntax elements, wherein, at the macroblock level, no syntax elements are transmitted for transform coefficient values of scanning positions greater than y.
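The joint effect of the two signalled bounds can be sketched as follows (an illustrative helper; x and y are the slice-header values described above):

```python
def coded_scan_positions(x, y, num_positions):
    """Scan positions for which transform coefficient syntax elements
    are present in a quality layer: none smaller than x, none
    greater than y."""
    return [p for p in range(num_positions) if x <= p <= y]

print(coded_scan_positions(3, 7, 16))   # [3, 4, 5, 6, 7]
print(coded_scan_positions(0, 2, 16))   # [0, 1, 2]
```

Layers with disjoint ranges [x, y] thus partition the scan positions of a transform block among themselves.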
- As noted above, the embodiments described in detail with respect to
FIGS. 1-11 may be varied in various ways. For example, although the above embodiments were exemplified with respect to a two spatial layer environment, they are readily transferable to an embodiment with only one quality layer, or with more than one quality layer but with N+1 SNR scalable refinement layers. Imagine, for example, that part 12 in FIG. 1 is missing. In this case, hybrid coder 42 acts as a coding means for coding the video signal 18 using block-wise transformation to obtain transform blocks for a picture 140 of the video signal, while unit 44 acts as a means for forming, for each of a plurality of quality layers, a video sub-data stream. Hybrid coder 42 may be simplified to perform no motion prediction but merely block-wise transformation. Similarly, in the one quality layer case, demultiplexer 402 would act as a parsing means for parsing the video sub-data streams of the plurality of quality layers, to obtain, for each quality layer, the scan range information and the transform coefficient information, and the combiner 428 would act as a means for, using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions, with the back-transformation unit 438 reconstructing the picture of the video signal by a back-transformation of the transform blocks. - Furthermore, the embodiment in
FIG. 1 may be varied in a way that the base layer coder 12 operates with the same spatial resolution and the same bit depth as the enhancement layer coder 14. In that case the embodiment represents SNR scalable coding with a standard base layer and various enhancement layers 28, 30 that contain partitions of the transform coefficients. - Depending on an actual implementation, the inventive scheme can be implemented in hardware or in software. Therefore, the present invention also relates to a computer program, which can be stored on a computer-readable medium such as a CD, a disk or any other data carrier. The present invention is, therefore, also a computer program having a program code which, when executed on a computer, performs the inventive method in connection with the above figures.
- Furthermore, it is noted that all steps or functions indicated in the flow diagrams could be implemented by respective means in the encoder and that the implementations may comprise subroutines running on a CPU, circuit parts of an ASIC or the like.
- While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.
Claims (23)
1-42. (canceled)
43. An apparatus for generating a quality-scalable video data stream, comprising:
a coder for coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and
a generator for forming, for each of a plurality of quality layers, a video sub-data stream comprising scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers comprises at least one possible scan position not comprised by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is comprised by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, comprising a contribution value per possible scan position of the sub-set of possible scan positions of the respective quality layer, such that the transform coefficient value of the one possible scan position is derivable based on a sum of the contribution values for the one possible scan position of the more than one of the sub-sets of the quality layers.
44. An apparatus for reconstructing a video signal from a quality-scalable video data stream comprising, for each of a plurality of quality layers, a video sub-data stream, comprising:
a parser for parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions;
a constructor for, using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and
a reconstructor for reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein the parser is configured such that the transform coefficient information of more than one of the quality layers may comprise a contribution value relating to one transformation coefficient value, and wherein the constructor is configured to derive the one transformation coefficient value based on a sum of the contribution values relating to the one transformation coefficient value.
45. The apparatus according to claim 44 , wherein the constructor is configured to use the scan range information as if same would indicate a first scan position among the possible scan positions within the sub-set of possible scan positions in the predetermined scan order.
46. The apparatus according to claim 44 , wherein the constructor is configured to use the scan range information as if same would indicate a last scan position among the possible scan positions within the sub-set of possible scan positions in the predetermined scan order.
47. The apparatus according to claim 44 , wherein among the transform blocks, at least one first transform block is comprised of 4×4 transformation coefficient values and at least one second transform block is comprised of 8×8 transformation coefficient values, and wherein the parser is configured to expect that the scan range information for each of the plurality of quality layers indicates the sub-set of possible scan positions for the at least one first transform block in an accuracy enabling a distinction between all 16 possible scan positions of the at least one first transform block, and for the at least one second transform block in an accuracy enabling a distinction between all 16 quadruples of consecutive transform coefficients of the at least one second transform block.
48. The apparatus according to claim 44 , wherein the parser is configured to expect the transformation coefficient values belonging to the sub-set of possible scan positions to be block-wise coded into the transform coefficient information so that the transformation coefficient values of a predetermined transform block form a continuous portion of the transform coefficient information.
49. The apparatus according to claim 48 , wherein the parser is configured to decode the consecutive portion by
decoding a significance map specifying positions of the transformation coefficient values being unequal to zero and belonging to the sub-set of possible scan positions in the predetermined transform block into the video sub-data stream, and subsequently
in a reverse scan order reversed relative to the predetermined scan order—starting with the last transformation coefficient value being unequal to zero and belonging to the sub-set of possible scan positions within the predetermined transform block—decoding the transform coefficient values being unequal to zero and belonging to the sub-set of possible scan positions within the predetermined transform block.
50. The apparatus according to claim 49 , wherein the parser is configured to decode the significance map by
in the predetermined scan order, decoding a significance flag per transformation coefficient value belonging to the sub-set of possible scan positions from the first transformation coefficient value belonging to the sub-set of possible scan positions to the last transform coefficient value belonging to the sub-set of possible scan positions and being unequal to zero, with the significance flags depending on the respective transformation coefficient value being zero or unequal to zero, and
following each significance flag of a respective transformation coefficient value being unequal to zero, decoding a last-flag depending on the respective transformation coefficient value being the last transformation coefficient value belonging to the sub-set of possible scan positions within the predetermined transform block being non-zero or not.
51. The apparatus according to claim 48 , wherein the parser is configured to decode the consecutive portion by
decoding a significance information specifying the number of transformation coefficient values being unequal to zero and belonging to the sub-set of possible scan positions within the predetermined transform block as well as the number of consecutive trailing transformation coefficient values comprising an absolute value of one within the number of transformation coefficient values being unequal to zero and belonging to the sub-set of possible scan positions within the predetermined transform block;
decoding the signs of the consecutive trailing transformation coefficient values and the remaining transformation coefficient values being unequal to zero and belonging to the sub-set of possible scan positions within the predetermined transform block;
decoding the total number of transformation coefficient values being equal to zero and belonging to the sub-set of possible scan positions up to the last transformation coefficient value being unequal to zero and belonging to the sub-set of possible scan positions within the predetermined transform block;
decoding the number of consecutive transformation coefficient values being equal to zero and belonging to the sub-set of possible scan positions immediately preceding any of the number of transformation coefficient values being unequal to zero and belonging to the sub-set of possible scan positions within the predetermined transform block in a reversed scan order.
52. The apparatus according to claim 44 , wherein the predetermined scan order scans the transformation coefficient values of the transform blocks such that transformation coefficient values belonging to a higher scan position in the predetermined scan order relate to higher spatial frequencies.
53. The apparatus according to claim 44 , wherein the reconstructor is configured to reconstruct the video signal using motion-prediction based on motion information and by combining a motion-prediction result with a motion-prediction residual with acquiring the motion-prediction residual by a block-wise inverse transformation of the transform blocks of transformation coefficient values.
54. The apparatus according to claim 53 , wherein the parser is configured to expect each sub-data stream to comprise an indication indicating motion information existence or motion information non-existence for the respective quality layer, and that the sub-data stream of a first of the quality layers comprises the motion information and comprises the indication indicating motion information existence, or the indication within the sub-data stream of the first quality layer indicates the motion information non-existence, with a part of the quality-scalable video data stream other than the sub-data streams comprising the motion information, and to expect the sub-data stream(s) of the other quality layer(s) to comprise the indication indicating motion information non-existence.
55. The apparatus according to claim 54 , wherein the parser is configured to expect the sub-data stream of the first quality layer to comprise the indication indicating motion information existence, with the motion information being equal to the higher-quality motion information or equal to a refinement information allowing a reconstruction of the higher-quality motion information based on the lower-quality motion information, and that the part of the quality-scalable video data stream also comprises the lower-quality motion information.
56. The apparatus according to claim 54 , wherein the parser is configured such that the motion information and the indication relate to a macroblock of the picture.
57. The apparatus according to claim 44, wherein the parser is configured to parse each sub-data stream individually, independently of the other sub-data stream(s) with regard to the parsing result.
58. The apparatus according to claim 57 , wherein the constructor is configured to associate the respective transform coefficient information with the transformation coefficient values, with the association result being independent of the other sub-data stream(s).
59. The apparatus according to claim 58 , wherein a layer order is defined among the quality layers, and the sub-data stream of a first quality layer in the layer order enables an association of the respective transform coefficient information with the transformation coefficient values independent of the sub-data stream(s) of the following quality layer(s), whereas the sub-data stream(s) of the following quality layers in layer order enable an association of the respective transform coefficient information with the transformation coefficient values merely in combination with the sub-data stream(s) of (a) quality layer(s) preceding the respective quality layer, wherein the constructor is configured to associate the transform coefficient information of a respective quality layer with the transformation coefficient values by use of the sub-data streams of the respective quality layer and quality layer(s) preceding the respective quality layer.
60. A method for generating a quality-scalable video data stream, comprising:
coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and
forming, for each of a plurality of quality layers, a video sub-data stream comprising scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers comprises at least one possible scan position not comprised by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is comprised by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, comprising a contribution value per possible scan position of the sub-set of possible scan positions of the respective quality layer such that the transform coefficient value of the one possible scan position is derivable based on a sum of the contribution values for the one possible scan position of the more than one of the sub-sets of the quality layers.
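The splitting rule of claim 60 (each quality layer covers a sub-set of scan positions, sub-sets may share a position, and the shared coefficient is split into contribution values that sum back to the original) can be sketched as follows. This is an illustrative model only: the function name, the inclusive `(first, last)` range representation, and the halving split at shared positions are assumptions, not the patent's method:

```python
def form_sub_streams(linear_coeffs, layer_ranges):
    """Split coefficients (already in scan order) into per-layer sub-streams.

    layer_ranges holds one inclusive (first, last) scan-position range per
    quality layer.  At a scan position covered by more than one layer, each
    layer but the last carries a coarse contribution (here: half of what is
    left) and the last covering layer carries the remainder, so that the
    sum of all contributions reproduces the original coefficient.
    """
    n_cover = [0] * len(linear_coeffs)
    for first, last in layer_ranges:
        for pos in range(first, last + 1):
            n_cover[pos] += 1

    remaining = list(linear_coeffs)
    streams = []
    for first, last in layer_ranges:
        contributions = {}
        for pos in range(first, last + 1):
            if n_cover[pos] == 1:       # sole or last covering layer
                contribution = remaining[pos]
            else:                       # shared position: send a coarse part
                contribution = remaining[pos] // 2
                n_cover[pos] -= 1
            remaining[pos] -= contribution
            contributions[pos] = contribution
        streams.append({"scan_range": (first, last),
                        "contributions": contributions})
    return streams
```

For example, with ranges `(0, 3)` and `(3, 7)` the coefficient at scan position 3 is carried by both layers, once as a coarse part and once as a refinement, while every other position appears in exactly one sub-stream.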
61. A method for reconstructing a video signal from a quality-scalable video data stream comprising, for each of a plurality of quality layers, a video sub-data stream, comprising:
parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions;
using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and
reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein
the parsing the video sub-data streams is performed such that the transform coefficient information of more than one of the quality layers may comprise a contribution value relating to one transformation coefficient value, and the constructing the transform blocks comprises deriving the one transformation coefficient value based on a sum of the contribution values relating to the one transformation coefficient value.
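On the decoder side, claim 61 reduces to summing, per scan position, the contribution values found in all quality layers that cover that position. A minimal sketch, assuming each parsed sub-data stream is represented as a mapping from scan position to contribution value (a representation invented for illustration, not taken from the claims):

```python
def reconstruct_linear(streams, n_positions):
    """Sum the contribution values of all quality layers per scan position.

    A position carried by more than one layer is reconstructed as the sum
    of its contributions; positions no layer covers default to zero (the
    usual treatment of untransmitted high-frequency coefficients).
    """
    coeffs = [0] * n_positions
    for stream in streams:
        for pos, contribution in stream["contributions"].items():
            coeffs[pos] += contribution
    return coeffs

# Two hypothetical parsed sub-data streams whose scan ranges overlap at
# position 3: the base layer carries a coarse part (2), the enhancement
# layer the refinement (3); the decoder recovers 2 + 3 = 5.
parsed = [
    {"contributions": {0: 10, 1: 7, 2: -3, 3: 2}},  # base quality layer
    {"contributions": {3: 3, 4: 0, 5: 2}},          # enhancement layer
]
linear = reconstruct_linear(parsed, 8)
```

The resulting linear sequence would then be mapped back into the two-dimensional transform block via the inverse of the predetermined scan order before the back-transformation.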
62. A quality-scalable video data stream enabling a reconstruction of a video signal comprising, for each of a plurality of quality layers, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions, wherein the transform coefficient information concerns transformation coefficient values belonging to the sub-set of possible scan positions, wherein the transform coefficient information of more than one of the quality layers comprises a contribution value relating to one transformation coefficient value, and the transform coefficient value of the one possible scan position is derivable based on a sum of the contribution values for the one possible scan position of the more than one of the sub-sets of the quality layers.
63. A computer program comprising program code for performing, when running on a computer, a method for generating a quality-scalable video data stream, the method comprising:
coding a video signal using block-wise transformation to acquire transform blocks of two-dimensionally arranged transformation coefficient values for a picture of the video signal, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values; and
forming, for each of a plurality of quality layers, a video sub-data stream comprising scan range information indicating a sub-set of the possible scan positions, such that the sub-set of each of the plurality of quality layers comprises at least one possible scan position not comprised by the sub-set of any other of the plurality of quality layers and one of the possible scan positions is comprised by more than one of the sub-sets of the quality layers, and transform coefficient information on transformation coefficient values belonging to the sub-set of possible scan positions of the respective quality layer, comprising a contribution value per possible scan position of the sub-set of possible scan positions of the respective quality layer such that the transform coefficient value of the one possible scan position is derivable based on a sum of the contribution values for the one possible scan position of the more than one of the sub-sets of the quality layers.
64. A computer program comprising program code for performing, when running on a computer, a method for reconstructing a video signal from a quality-scalable video data stream comprising, for each of a plurality of quality layers, a video sub-data stream, the method comprising:
parsing the video sub-data streams of the plurality of quality layers, to acquire, for each quality layer, a scan range information and transform coefficient information on two-dimensionally arranged transformation coefficient values of different transform blocks, wherein a predetermined scan order with possible scan positions orders the transformation coefficient values into a linear sequence of transformation coefficient values, and the scan range information indicates a sub-set of the possible scan positions;
using the scan range information, for each quality layer, constructing the transform blocks by associating the transformation coefficient values of the respective transform blocks from the transform coefficient information to the sub-set of the possible scan positions; and
reconstructing a picture of the video signal by a back-transformation of the transform blocks, wherein
the parsing the video sub-data streams is performed such that the transform coefficient information of more than one of the quality layers may comprise a contribution value relating to one transformation coefficient value, and the constructing the transform blocks comprises deriving the one transformation coefficient value based on a sum of the contribution values relating to the one transformation coefficient value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/523,308 US20130051472A1 (en) | 2007-01-18 | 2007-04-18 | Quality Scalable Video Data Stream |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US88553407P | 2007-01-18 | 2007-01-18 | |
US12/523,308 US20130051472A1 (en) | 2007-01-18 | 2007-04-18 | Quality Scalable Video Data Stream |
EPPCT/EP2007/003411 | 2007-04-18 | ||
PCT/EP2007/003411 WO2008086828A1 (en) | 2007-01-18 | 2007-04-18 | Quality scalable video data stream |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2007/003411 A-371-Of-International WO2008086828A1 (en) | 2007-01-18 | 2007-04-18 | Quality scalable video data stream |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/511,875 Division US9113167B2 (en) | 2007-01-18 | 2009-07-29 | Coding a video signal based on a transform coefficient for each scan position determined by summing contribution values across quality layers |
Publications (1)
Publication Number | Publication Date |
---|---|
US20130051472A1 true US20130051472A1 (en) | 2013-02-28 |
Family
ID=38698325
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/523,308 Abandoned US20130051472A1 (en) | 2007-01-18 | 2007-04-18 | Quality Scalable Video Data Stream |
US12/511,875 Active 2030-01-11 US9113167B2 (en) | 2007-01-18 | 2009-07-29 | Coding a video signal based on a transform coefficient for each scan position determined by summing contribution values across quality layers |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/511,875 Active 2030-01-11 US9113167B2 (en) | 2007-01-18 | 2009-07-29 | Coding a video signal based on a transform coefficient for each scan position determined by summing contribution values across quality layers |
Country Status (18)
Country | Link |
---|---|
US (2) | US20130051472A1 (en) |
EP (1) | EP2123052B1 (en) |
JP (1) | JP5014438B2 (en) |
KR (4) | KR101175593B1 (en) |
CN (2) | CN102547277B (en) |
AT (1) | ATE489808T1 (en) |
BR (1) | BRPI0720806B1 (en) |
CA (1) | CA2675891C (en) |
CY (1) | CY1111418T1 (en) |
DE (1) | DE602007010835D1 (en) |
DK (1) | DK2123052T3 (en) |
ES (1) | ES2355850T3 (en) |
HK (1) | HK1135827A1 (en) |
PL (1) | PL2123052T3 (en) |
PT (1) | PT2123052E (en) |
SI (1) | SI2123052T1 (en) |
TW (1) | TWI445412B (en) |
WO (1) | WO2008086828A1 (en) |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110202637A1 (en) * | 2008-10-28 | 2011-08-18 | Nxp B.V. | Method for buffering streaming data and a terminal device |
US20110293003A1 (en) * | 2009-02-11 | 2011-12-01 | Thomson Licensing A Corporation | Methods and apparatus for bit depth scalable video encoding and decoding utilizing tone mapping and inverse tone mapping |
US20120099656A1 (en) * | 2010-10-26 | 2012-04-26 | Ohya Yasuo | Transmitting system, receiving device, and a video transmission method |
US20120163469A1 (en) * | 2009-06-07 | 2012-06-28 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US20120163472A1 (en) * | 2010-12-22 | 2012-06-28 | Qualcomm Incorporated | Efficiently coding scanning order information for a video block in video coding |
US20130003833A1 (en) * | 2011-06-30 | 2013-01-03 | Vidyo Inc. | Scalable Video Coding Techniques |
US20130195169A1 (en) * | 2012-02-01 | 2013-08-01 | Vidyo, Inc. | Techniques for multiview video coding |
US20130329807A1 (en) * | 2011-03-03 | 2013-12-12 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US20130343455A1 (en) * | 2011-03-10 | 2013-12-26 | Sharp Kabushiki Kaisha | Image decoding device, image encoding device, and data structure of encoded data |
US8976861B2 (en) | 2010-12-03 | 2015-03-10 | Qualcomm Incorporated | Separately coding the position of a last significant coefficient of a video block in video coding |
US20150139299A1 (en) * | 2011-06-28 | 2015-05-21 | Samsung Electronics Co., Ltd. | Method and apparatus for coding video and method and apparatus for decoding video accompanied with arithmetic coding |
US9042440B2 (en) | 2010-12-03 | 2015-05-26 | Qualcomm Incorporated | Coding the position of a last significant coefficient within a video block based on a scanning order for the block in video coding |
US9106913B2 (en) | 2011-03-08 | 2015-08-11 | Qualcomm Incorporated | Coding of transform coefficients for video coding |
US9167253B2 (en) | 2011-06-28 | 2015-10-20 | Qualcomm Incorporated | Derivation of the position in scan order of the last significant transform coefficient in video coding |
US9197890B2 (en) | 2011-03-08 | 2015-11-24 | Qualcomm Incorporated | Harmonized scan order for coding transform coefficients in video coding |
JP2016524877A (en) * | 2013-06-11 | 2016-08-18 | クゥアルコム・インコーポレイテッドQualcomm Incorporated | Handling bitstream constraints on inter-layer prediction types in multi-layer video coding |
US9635368B2 (en) | 2009-06-07 | 2017-04-25 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US9756613B2 (en) | 2012-12-06 | 2017-09-05 | Qualcomm Incorporated | Transmission and reception timing for device-to-device communication system embedded in a cellular system |
US10075708B2 (en) | 2012-04-09 | 2018-09-11 | Sun Patent Trust | Image encoding method and image decoding method |
TWI635740B (en) * | 2017-06-12 | 2018-09-11 | 元智大學 | Parallel and hierarchical lossless recompression method and architecture thereof |
US10194158B2 (en) | 2012-09-04 | 2019-01-29 | Qualcomm Incorporated | Transform basis adjustment in scalable video coding |
US10595036B2 (en) | 2012-06-29 | 2020-03-17 | Velos Media, Llc | Decoding device and decoding method |
US11330272B2 (en) | 2010-12-22 | 2022-05-10 | Qualcomm Incorporated | Using a most probable scanning order to efficiently code scanning order information for a video block in video coding |
US11812038B2 (en) * | 2011-03-03 | 2023-11-07 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US12149713B2 (en) * | 2011-03-03 | 2024-11-19 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
Families Citing this family (60)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9055338B2 (en) * | 2007-03-13 | 2015-06-09 | Nokia Corporation | System and method for video encoding and decoding |
US8599921B2 (en) * | 2009-03-27 | 2013-12-03 | Vixs Systems, Inc | Adaptive partition subset selection module and method for use therewith |
CA2904730C (en) * | 2009-05-29 | 2023-11-21 | Mitsubishi Electric Corporation | Image encoding device, image decoding device, image encoding method, and image decoding method |
CN108471537B (en) * | 2010-04-13 | 2022-05-17 | Ge视频压缩有限责任公司 | Device and method for decoding transformation coefficient block and device for coding transformation coefficient block |
US8798131B1 (en) | 2010-05-18 | 2014-08-05 | Google Inc. | Apparatus and method for encoding video using assumed values with intra-prediction |
US9172968B2 (en) | 2010-07-09 | 2015-10-27 | Qualcomm Incorporated | Video coding using directional transforms |
CN105120279B (en) * | 2010-07-20 | 2018-05-29 | 株式会社Ntt都科摩 | Image prediction encoding device and method, image prediction/decoding device and method |
KR101373814B1 (en) * | 2010-07-31 | 2014-03-18 | 엠앤케이홀딩스 주식회사 | Apparatus of generating prediction block |
KR101483179B1 (en) * | 2010-10-06 | 2015-01-19 | 에스케이 텔레콤주식회사 | Frequency Transform Block Coding Method and Apparatus and Image Encoding/Decoding Method and Apparatus Using Same |
US9288496B2 (en) | 2010-12-03 | 2016-03-15 | Qualcomm Incorporated | Video coding using function-based scan order for transform coefficients |
US10992958B2 (en) | 2010-12-29 | 2021-04-27 | Qualcomm Incorporated | Video coding using mapped transforms and scanning modes |
US9490839B2 (en) | 2011-01-03 | 2016-11-08 | Qualcomm Incorporated | Variable length coding of video block coefficients |
WO2012093969A1 (en) * | 2011-01-07 | 2012-07-12 | Agency For Science, Technology And Research | Method and an apparatus for coding an image |
US9210442B2 (en) | 2011-01-12 | 2015-12-08 | Google Technology Holdings LLC | Efficient transform unit representation |
US9380319B2 (en) | 2011-02-04 | 2016-06-28 | Google Technology Holdings LLC | Implicit transform unit representation |
US10142637B2 (en) * | 2011-03-08 | 2018-11-27 | Texas Instruments Incorporated | Method and apparatus for parallelizing context selection in video processing |
CN102685503B (en) | 2011-03-10 | 2014-06-25 | 华为技术有限公司 | Encoding method of conversion coefficients, decoding method of conversion coefficients and device |
CN105791875B (en) * | 2011-06-10 | 2018-12-11 | 联发科技股份有限公司 | Scalable video coding method and its device |
US9516316B2 (en) | 2011-06-29 | 2016-12-06 | Qualcomm Incorporated | VLC coefficient coding for large chroma block |
US9338456B2 (en) | 2011-07-11 | 2016-05-10 | Qualcomm Incorporated | Coding syntax elements using VLC codewords |
WO2013063800A1 (en) * | 2011-11-04 | 2013-05-10 | Mediatek Inc. | Methods and apparatuses of solving mdcs parsing issue |
FR2982447A1 (en) | 2011-11-07 | 2013-05-10 | France Telecom | METHOD FOR ENCODING AND DECODING IMAGES, CORRESPONDING ENCODING AND DECODING DEVICE AND COMPUTER PROGRAMS |
FR2982446A1 (en) | 2011-11-07 | 2013-05-10 | France Telecom | METHOD FOR ENCODING AND DECODING IMAGES, CORRESPONDING ENCODING AND DECODING DEVICE AND COMPUTER PROGRAMS |
JP2014533058A (en) * | 2011-11-08 | 2014-12-08 | サムスン エレクトロニクス カンパニー リミテッド | Video arithmetic encoding method and apparatus, and video arithmetic decoding method and apparatus |
KR20130107861A (en) * | 2012-03-23 | 2013-10-02 | 한국전자통신연구원 | Method and apparatus for inter layer intra prediction |
PL3515073T3 (en) * | 2012-03-26 | 2021-03-08 | Jvckenwood Corporation | Image coding device, image coding method, and image coding program |
US10129540B2 (en) * | 2012-04-10 | 2018-11-13 | Texas Instruments Incorporated | Reduced complexity coefficient transmission for adaptive loop filtering (ALF) in video coding |
US9219913B2 (en) | 2012-06-13 | 2015-12-22 | Qualcomm Incorporated | Inferred base layer block for TEXTURE—BL mode in HEVC based single loop scalable video coding |
JP6141417B2 (en) * | 2012-06-29 | 2017-06-07 | インテル コーポレイション | System, method and computer program product for scalable video coding based on coefficient sampling |
US20140003504A1 (en) * | 2012-07-02 | 2014-01-02 | Nokia Corporation | Apparatus, a Method and a Computer Program for Video Coding and Decoding |
CA2878807C (en) * | 2012-07-09 | 2018-06-12 | Vid Scale, Inc. | Codec architecture for multiple layer video coding |
GB2496015B (en) * | 2012-09-05 | 2013-09-11 | Imagination Tech Ltd | Pixel buffering |
US20140086328A1 (en) * | 2012-09-25 | 2014-03-27 | Qualcomm Incorporated | Scalable video coding in hevc |
RU2646340C2 (en) * | 2012-09-28 | 2018-03-02 | Сони Корпорейшн | Coding device and method, decoding device and method |
KR101835360B1 (en) | 2012-10-01 | 2018-03-08 | 지이 비디오 컴프레션, 엘엘씨 | Scalable video coding using subblock-based coding of transform coefficient blocks in the enhancement layer |
PL2923488T3 (en) * | 2012-11-21 | 2021-11-02 | Dolby International Ab | Signaling scalability information in a parameter set |
US10097825B2 (en) * | 2012-11-21 | 2018-10-09 | Qualcomm Incorporated | Restricting inter-layer prediction based on a maximum number of motion-compensated layers in high efficiency video coding (HEVC) extensions |
US10178400B2 (en) | 2012-11-21 | 2019-01-08 | Dolby International Ab | Signaling scalability information in a parameter set |
CN103916670B (en) * | 2013-01-07 | 2017-08-04 | 华为技术有限公司 | A kind of coding of image, coding/decoding method and device |
US9219915B1 (en) | 2013-01-17 | 2015-12-22 | Google Inc. | Selection of transform size in video coding |
US9967559B1 (en) | 2013-02-11 | 2018-05-08 | Google Llc | Motion vector dependent spatial transformation in video coding |
US9544597B1 (en) | 2013-02-11 | 2017-01-10 | Google Inc. | Hybrid transform in video encoding and decoding |
US9681155B2 (en) * | 2013-03-15 | 2017-06-13 | Sony Interactive Entertainment America Llc | Recovery from packet loss during transmission of compressed video streams |
US9674530B1 (en) | 2013-04-30 | 2017-06-06 | Google Inc. | Hybrid transforms in video coding |
US10516898B2 (en) | 2013-10-10 | 2019-12-24 | Intel Corporation | Systems, methods, and computer program products for scalable video coding based on coefficient sampling |
EP3202142B1 (en) * | 2014-09-30 | 2020-11-11 | Microsoft Technology Licensing, LLC | Hash-based encoder decisions for video coding |
US9565451B1 (en) | 2014-10-31 | 2017-02-07 | Google Inc. | Prediction dependent transform coding |
US10306229B2 (en) | 2015-01-26 | 2019-05-28 | Qualcomm Incorporated | Enhanced multiple transforms for prediction residual |
US9769499B2 (en) | 2015-08-11 | 2017-09-19 | Google Inc. | Super-transform video coding |
US10277905B2 (en) | 2015-09-14 | 2019-04-30 | Google Llc | Transform selection for non-baseband signal coding |
US9807423B1 (en) | 2015-11-24 | 2017-10-31 | Google Inc. | Hybrid transform scheme for video coding |
KR102546142B1 (en) * | 2016-03-18 | 2023-06-21 | 로즈데일 다이나믹스 엘엘씨 | Method and apparatus for deriving block structure in video coding system |
US10623774B2 (en) | 2016-03-22 | 2020-04-14 | Qualcomm Incorporated | Constrained block-level optimization and signaling for video coding tools |
CN109417638B (en) * | 2016-05-28 | 2022-02-08 | 世宗大学校产学协力团 | Method and apparatus for encoding or decoding video signal |
US10791340B2 (en) * | 2016-11-15 | 2020-09-29 | Sony Corporation | Method and system to refine coding of P-phase data |
US11606569B2 (en) * | 2018-09-25 | 2023-03-14 | Apple Inc. | Extending supported components for encoding image data |
US11323748B2 (en) | 2018-12-19 | 2022-05-03 | Qualcomm Incorporated | Tree-based transform unit (TU) partition for video coding |
US11122297B2 (en) | 2019-05-03 | 2021-09-14 | Google Llc | Using border-aligned block functions for image compression |
CN113473139A (en) | 2020-03-31 | 2021-10-01 | 华为技术有限公司 | Image processing method and image processing device |
CN116800968A (en) * | 2022-03-17 | 2023-09-22 | 中兴通讯股份有限公司 | Encoding method and apparatus, decoding method and apparatus, storage medium, and electronic apparatus |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030118243A1 (en) * | 2001-09-18 | 2003-06-26 | Ugur Sezer | Largest magnitude indices selection for (run, level) encoding of a block coded picture |
US20050030205A1 (en) * | 2002-03-19 | 2005-02-10 | Fujitsu Limited | Hierarchical encoding and decoding devices |
US20060008002A1 (en) * | 2002-09-27 | 2006-01-12 | Koninklijke Philips Electronics N.V. | Scalable video encoding |
US20060230162A1 (en) * | 2005-03-10 | 2006-10-12 | Peisong Chen | Scalable video coding with two layer encoding and single layer decoding |
US20060233255A1 (en) * | 2005-04-13 | 2006-10-19 | Nokia Corporation | Fine granularity scalability (FGS) coding efficiency enhancements |
US20070160133A1 (en) * | 2006-01-11 | 2007-07-12 | Yiliang Bao | Video coding with fine granularity spatial scalability |
US20070223580A1 (en) * | 2006-03-27 | 2007-09-27 | Yan Ye | Methods and systems for refinement coefficient coding in video compression |
US20070230564A1 (en) * | 2006-03-29 | 2007-10-04 | Qualcomm Incorporated | Video processing with scalability |
US20070237239A1 (en) * | 2006-03-24 | 2007-10-11 | Byeong-Moon Jeon | Methods and apparatuses for encoding and decoding a video data stream |
US20090219988A1 (en) * | 2006-01-06 | 2009-09-03 | France Telecom | Methods of encoding and decoding an image or a sequence of images, corresponding devices, computer program and signal |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5253055A (en) * | 1992-07-02 | 1993-10-12 | At&T Bell Laboratories | Efficient frequency scalable video encoding with coefficient selection |
US5953506A (en) * | 1996-12-17 | 1999-09-14 | Adaptive Media Technologies | Method and apparatus that provides a scalable media delivery system |
JPH1141600A (en) * | 1997-07-24 | 1999-02-12 | Nippon Telegr & Teleph Corp <Ntt> | Method and device for hierarchical coding and decoding of image and storage medium |
JP4593720B2 (en) * | 2000-03-10 | 2010-12-08 | パナソニック株式会社 | Method and apparatus for dynamically displaying residue number coefficient |
US20030118097A1 (en) * | 2001-12-21 | 2003-06-26 | Koninklijke Philips Electronics N.V. | System for realization of complexity scalability in a layered video coding framework |
WO2003075579A2 (en) * | 2002-03-05 | 2003-09-12 | Koninklijke Philips Electronics N.V. | Method and system for layered video encoding |
KR100729270B1 (en) * | 2002-05-02 | 2007-06-15 | 프라운호퍼-게젤샤프트 츄어 푀르더룽 데어 안게반텐 포르슝에.파우. | Method and Arrangement for Encoding Transformation Coefficients in Image and/or Video Encoders and Decoders, Corresponding Computer Program, and Corresponding Computer-readable Storage Medium |
US6795584B2 (en) * | 2002-10-03 | 2004-09-21 | Nokia Corporation | Context-based adaptive variable length coding for adaptive block transforms |
2007
- 2007-04-18 PL PL07724348T patent/PL2123052T3/en unknown
- 2007-04-18 DE DE602007010835T patent/DE602007010835D1/en active Active
- 2007-04-18 EP EP07724348A patent/EP2123052B1/en active Active
- 2007-04-18 KR KR1020097015099A patent/KR101175593B1/en active IP Right Grant
- 2007-04-18 CN CN201110412117.7A patent/CN102547277B/en active Active
- 2007-04-18 JP JP2009545822A patent/JP5014438B2/en active Active
- 2007-04-18 US US12/523,308 patent/US20130051472A1/en not_active Abandoned
- 2007-04-18 PT PT07724348T patent/PT2123052E/en unknown
- 2007-04-18 AT AT07724348T patent/ATE489808T1/en active
- 2007-04-18 DK DK07724348.3T patent/DK2123052T3/en active
- 2007-04-18 KR KR1020117006484A patent/KR101341111B1/en active IP Right Grant
- 2007-04-18 WO PCT/EP2007/003411 patent/WO2008086828A1/en active Application Filing
- 2007-04-18 ES ES07724348T patent/ES2355850T3/en active Active
- 2007-04-18 BR BRPI0720806-5A patent/BRPI0720806B1/en active IP Right Grant
- 2007-04-18 CN CN2007800501094A patent/CN101606391B/en active Active
- 2007-04-18 CA CA2675891A patent/CA2675891C/en active Active
- 2007-04-18 SI SI200730518T patent/SI2123052T1/en unknown
- 2007-04-18 KR KR1020117006485A patent/KR101190227B1/en active IP Right Grant
- 2007-04-18 KR KR1020117006483A patent/KR101168294B1/en active IP Right Grant
2008
- 2008-01-16 TW TW097101622A patent/TWI445412B/en active
2009
- 2009-07-29 US US12/511,875 patent/US9113167B2/en active Active
2010
- 2010-04-21 HK HK10103889.6A patent/HK1135827A1/en unknown
2011
- 2011-02-23 CY CY20111100219T patent/CY1111418T1/en unknown
Cited By (51)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8612552B2 (en) * | 2008-10-28 | 2013-12-17 | Nxp B.V. | Method for buffering streaming data and a terminal device |
US20110202637A1 (en) * | 2008-10-28 | 2011-08-18 | Nxp B.V. | Method for buffering streaming data and a terminal device |
US20110293003A1 (en) * | 2009-02-11 | 2011-12-01 | Thomson Licensing A Corporation | Methods and apparatus for bit depth scalable video encoding and decoding utilizing tone mapping and inverse tone mapping |
US8867616B2 (en) * | 2009-02-11 | 2014-10-21 | Thomson Licensing | Methods and apparatus for bit depth scalable video encoding and decoding utilizing tone mapping and inverse tone mapping |
US9100648B2 (en) * | 2009-06-07 | 2015-08-04 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US12120352B2 (en) | 2009-06-07 | 2024-10-15 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US10986372B2 (en) | 2009-06-07 | 2021-04-20 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US10405001B2 (en) | 2009-06-07 | 2019-09-03 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US10015519B2 (en) | 2009-06-07 | 2018-07-03 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US20120163469A1 (en) * | 2009-06-07 | 2012-06-28 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US9635368B2 (en) | 2009-06-07 | 2017-04-25 | Lg Electronics Inc. | Method and apparatus for decoding a video signal |
US20120099656A1 (en) * | 2010-10-26 | 2012-04-26 | Ohya Yasuo | Transmitting system, receiving device, and a video transmission method |
US9154812B2 (en) * | 2010-10-26 | 2015-10-06 | Kabushiki Kaisha Toshiba | Transmitting system, receiving device, and a video transmission method |
US9042440B2 (en) | 2010-12-03 | 2015-05-26 | Qualcomm Incorporated | Coding the position of a last significant coefficient within a video block based on a scanning order for the block in video coding |
US8976861B2 (en) | 2010-12-03 | 2015-03-10 | Qualcomm Incorporated | Separately coding the position of a last significant coefficient of a video block in video coding |
US9055290B2 (en) | 2010-12-03 | 2015-06-09 | Qualcomm Incorporated | Coding the position of a last significant coefficient within a video block based on a scanning order for the block in video coding |
US20120163472A1 (en) * | 2010-12-22 | 2012-06-28 | Qualcomm Incorporated | Efficiently coding scanning order information for a video block in video coding |
US11330272B2 (en) | 2010-12-22 | 2022-05-10 | Qualcomm Incorporated | Using a most probable scanning order to efficiently code scanning order information for a video block in video coding |
US20240031584A1 (en) * | 2011-03-03 | 2024-01-25 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US11812038B2 (en) * | 2011-03-03 | 2023-11-07 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US20130329807A1 (en) * | 2011-03-03 | 2013-12-12 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US11102494B2 (en) * | 2011-03-03 | 2021-08-24 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US11606567B2 (en) * | 2011-03-03 | 2023-03-14 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US12149713B2 (en) * | 2011-03-03 | 2024-11-19 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US20210344935A1 (en) * | 2011-03-03 | 2021-11-04 | Electronics And Telecommunications Research Institute | Method for scanning transform coefficient and device therefor |
US9106913B2 (en) | 2011-03-08 | 2015-08-11 | Qualcomm Incorporated | Coding of transform coefficients for video coding |
US9338449B2 (en) | 2011-03-08 | 2016-05-10 | Qualcomm Incorporated | Harmonized scan order for coding transform coefficients in video coding |
US11405616B2 (en) | 2011-03-08 | 2022-08-02 | Qualcomm Incorporated | Coding of transform coefficients for video coding |
US11006114B2 (en) | 2011-03-08 | 2021-05-11 | Velos Media, Llc | Coding of transform coefficients for video coding |
US9197890B2 (en) | 2011-03-08 | 2015-11-24 | Qualcomm Incorporated | Harmonized scan order for coding transform coefficients in video coding |
US10499059B2 (en) | 2011-03-08 | 2019-12-03 | Velos Media, Llc | Coding of transform coefficients for video coding |
US10397577B2 (en) | 2011-03-08 | 2019-08-27 | Velos Media, Llc | Inverse scan order for significance map coding of transform coefficients in video coding |
US20130343455A1 (en) * | 2011-03-10 | 2013-12-26 | Sharp Kabushiki Kaisha | Image decoding device, image encoding device, and data structure of encoded data |
US10148974B2 (en) * | 2011-03-10 | 2018-12-04 | Sharp Kabushiki Kaisha | Image decoding device, image encoding device, and data structure of encoded data |
US9491469B2 (en) | 2011-06-28 | 2016-11-08 | Qualcomm Incorporated | Coding of last significant transform coefficient |
US20150139332A1 (en) * | 2011-06-28 | 2015-05-21 | Samsung Electronics Co., Ltd. | Method and apparatus for coding video and method and apparatus for decoding video accompanied with arithmetic coding |
US20150139299A1 (en) * | 2011-06-28 | 2015-05-21 | Samsung Electronics Co., Ltd. | Method and apparatus for coding video and method and apparatus for decoding video accompanied with arithmetic coding |
US9167253B2 (en) | 2011-06-28 | 2015-10-20 | Qualcomm Incorporated | Derivation of the position in scan order of the last significant transform coefficient in video coding |
US9247270B2 (en) * | 2011-06-28 | 2016-01-26 | Samsung Electronics Co., Ltd. | Method and apparatus for coding video and method and apparatus for decoding video accompanied with arithmetic coding |
US9258571B2 (en) * | 2011-06-28 | 2016-02-09 | Samsung Electronics Co., Ltd. | Method and apparatus for coding video and method and apparatus for decoding video accompanied with arithmetic coding |
US20130003833A1 (en) * | 2011-06-30 | 2013-01-03 | Vidyo Inc. | Scalable Video Coding Techniques |
US20130195169A1 (en) * | 2012-02-01 | 2013-08-01 | Vidyo, Inc. | Techniques for multiview video coding |
US10075708B2 (en) | 2012-04-09 | 2018-09-11 | Sun Patent Trust | Image encoding method and image decoding method |
US11399197B2 (en) | 2012-06-29 | 2022-07-26 | Velos Media, Llc | Encoding device and encoding method with setting and encoding of reference information |
US10958930B2 (en) | 2012-06-29 | 2021-03-23 | Velos Media, Llc | Encoding device and encoding method with setting and encoding of reference information |
US10623765B2 (en) | 2012-06-29 | 2020-04-14 | Velos Media, Llc | Encoding device and encoding method with setting and encoding of reference information |
US10595036B2 (en) | 2012-06-29 | 2020-03-17 | Velos Media, Llc | Decoding device and decoding method |
US10194158B2 (en) | 2012-09-04 | 2019-01-29 | Qualcomm Incorporated | Transform basis adjustment in scalable video coding |
US9756613B2 (en) | 2012-12-06 | 2017-09-05 | Qualcomm Incorporated | Transmission and reception timing for device-to-device communication system embedded in a cellular system |
JP2016524877A (en) * | 2013-06-11 | 2016-08-18 | クゥアルコム・インコーポレイテッドQualcomm Incorporated | Handling bitstream constraints on inter-layer prediction types in multi-layer video coding |
TWI635740B (en) * | 2017-06-12 | 2018-09-11 | 元智大學 | Parallel and hierarchical lossless recompression method and architecture thereof |
Also Published As
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9113167B2 (en) | Coding a video signal based on a transform coefficient for each scan position determined by summing contribution values across quality layers | |
US8428143B2 (en) | Coding scheme enabling precision-scalability | |
US10659776B2 (en) | Quality scalable coding with mapping different ranges of bit depths | |
JP5151984B2 (en) | Video encoding device | |
ES2527932T3 (en) | Bit Depth Scalability | |
RU2411689C2 (en) | Method and device for interlayer forecasting of internal texture adaptive to macroblock | |
US7099515B2 (en) | Bitplane coding and decoding for AC prediction status information | |
US8532176B2 (en) | Methods and systems for combining layers in a multi-layer bitstream | |
WO2007079782A1 (en) | Quality scalable picture coding with particular transform coefficient scan path | |
US20120219060A1 (en) | System and method for scalable encoding and decoding of multimedia data using multiple layers | |
US20080008247A1 (en) | Methods and Systems for Residual Layer Scaling | |
MXPA06002079A (en) | Advanced bi-directional predictive coding of interlaced video. | |
US20140044180A1 (en) | Device and method for coding video information using base layer motion vector candidate | |
US20070177664A1 (en) | Entropy encoding/decoding method and apparatus | |
KR20060122684A (en) | Method for encoding and decoding video signal | |
US20090219988A1 (en) | Methods of encoding and decoding an image or a sequence of images, corresponding devices, computer program and signal | |
US20070230811A1 (en) | Method of enhancing entropy-coding efficiency, video encoder and video decoder thereof | |
Limnell et al. | Quality scalability in H.264/AVC video coding |
Feldmann | Low complexity scalable HEVC using single loop decoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WIEGAND, THOMAS;KIRCHHOFFER, HEINER;SCHWARZ, HEIKO;REEL/FRAME:024265/0930 Effective date: 20090727 |
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |