WO2006017230A1 - Unbiased rounding for video compression - Google Patents

Info

Publication number
WO2006017230A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
rounding
processing
unbiased rounding
decoding
Prior art date
Application number
PCT/US2005/024552
Other languages
French (fr)
Inventor
Walter Christian Gish
Hyung-Suk Kim
Original Assignee
Dolby Laboratories Licensing Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corporation
Priority to KR1020067025385A (KR20070033343A)
Priority to JP2007521538A (JP2008507206A)
Priority to CA002566349A (CA2566349A1)
Priority to EP05770121A (EP1766995A1)
Priority to US11/632,365 (US20080075166A1)
Publication of WO2006017230A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44: Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/184: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being bits, e.g. of the compressed video stream

Definitions

  • This invention relates to digital methods for compressing moving images, and, in particular, to more accurate methods of rounding for compression techniques that utilize inter- or intra-prediction to increase compression efficiency.
  • the invention includes not only methods but also corresponding computer program implementations and apparatus implementations.
  • a digital representation of video images consists of spatial samples of image intensity and/or color quantized to some particular bit depth.
  • the dominant value for this bit depth is 8 bits, which provides reasonable image quality and each sample fits perfectly into a single byte of digital memory.
  • bit depths such as 10 and 12 bits per sample, as evidenced by the MPEG-4 Studio and N-bit profiles and the Fidelity Range Extensions to H.264 (see citations below).
  • MSE mean-squared error criterion
  • NX and NY are the number of samples in the x- and y-directions.
  • the MSE is called the distortion.
  • the spatial samples of both these images are digital values.
  • the fidelity of a compressed image is measured by this distortion or MSE, normalized to the maximum possible (peak) amplitude and measured in logarithmic units.
  • the distortion PSNR Peak Signal-to-Noise Ratio
  • $\mathrm{PSNR} = 10 \log_{10}\!\left((2^N - 1)^2 / (1/12)\right)$ (3)
  • FIG. 1 and FIG. 2 show block diagrams for an H.264 encoder and decoder, respectively.
  • H.264 also known as MPEG-4/AVC
  • MPEG-4/AVC is considered the state-of-the-art in modern video coding.
  • extensions currently being developed for H.264 known collectively as the "Fidelity Range Extensions.”
  • H.264 FRExt coding environments. Details of H.264 coding are set forth in "Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264
  • JVT Joint Video Team
  • H.264 FRExt Details of the "Fidelity Range Extensions" to the basic H.264 specifications (hence “H.264 FRExt”) are set forth in "Draft Text of H.264/AVC Fidelity Range Extensions Amendment," Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC
  • H.264 standard and its implementation may be found in various published literature, including, for example, "The emerging H.264/AVC standard," by Ralf Schafer et al, EBU Technical Review, January 2003 (12 pages) and "H.264/MPEG-4 Part 10 White Paper: Overview of H.264," by Iain E G Richardson, 07/10/02, published at www.vcodex.com. Said Schafer et al and Richardson publications are also incorporated by reference herein in their entirety. Aspects of the present invention may also be used with advantage in connection with modified MPEG-2 coding environments, as is explained further below.
  • H.264 or H.264 FRExt encoder (they are the same at a block diagram level) shown in FIG. 1 has elements now common in video coders: transform and quantization processes, entropy (lossless) coding, motion estimation (ME) and motion compensation (MC), and a buffer to store reconstructed frames.
  • H.264 and H.264 FRExt differ from previous codecs in a number of ways: an in-loop deblocking filter, several modes for intra-prediction, a new integer transform, two modes of entropy coding (variable length coding and arithmetic coding), motion block sizes down to 4x4 pixels, and so on.
  • the H.264 or H.264 FRExt decoder shown in FIG. 2 can be readily seen as a subset of the encoder.
  • the Fidelity Range Extensions (FRExt) to H.264 provide tools for encoding and decoding at sample bit depths up to 12 bits per sample. This is the first video codec to incorporate tools for encoding and decoding at bit depths greater than 8 bits per sample in a unified way.
  • the quantization method adopted in the Fidelity Range Extensions to H.264 produces a compressed bit stream that is potentially compatible among different sample bit depths as described in copending United States provisional patent application S.N. 60/573,017 of Walter C. Gish and Christopher J.
  • a goal of the present invention is to be able to decode a bitstream encoded at a high bit depth from a high bit depth input not only at that same high bit depth, but, alternatively, at a lower bit depth that provides decoded images bearing a reasonable approximation to the original high bit depth images.
  • This would, for example, enable an 8-bit or 10-bit H.264 FRExt decoder to reasonably decode bitstreams that would conventionally require, respectively, a 10-bit or 12-bit H.264 FRExt decoder.
  • this would enable a conventional 8-bit MPEG-2 decoder (as in FIG. 9 described below) to reasonably decode bitstreams produced by a modified MPEG-2 encoder such as described below in connection with FIG.
  • FIG. 3 shows that when a single bitstream encoded from a high bit depth source is decoded at the original high bit depth and at a lower bit depth, the lower bit depth decoding has some error, measured as MSE, with respect to the high bit depth reference.
  • MSE some error
  • the lower bit depth approximation is decoded as if the encoder bit depth were low, that is, it is a conventional decoder (see FIG. 6 below) or a conventional decoder employing the unbiased rounding aspects of the present invention (see FIG. 7 below).
  • FIG. 4 shows a simplified diagram of the prediction loop that exists in both the encoder and decoder identifying the places where rounding occurs: calculating the prediction (intra and inter), the deblocking filter, and the residual decoding.
  • calculating the prediction intra and inter
  • the deblocking filter the residual decoding.
  • the dominant sources of error are inter- and intra-prediction.
  • the loop deblocking filter is optional and, along with the rounding in decoding the residual, will introduce smaller errors. The problem then is to minimize these errors so that the MSE between the high bit depth output and the lower bit depth approximation is minimized.
  • the high bit depth decoding output is error-free with respect to the encoder since they both have the same high bit depth prediction loop. Therefore, a reduction in the MSE between it and the lower bit depth approximation indicates that the lower bit depth decoding more closely approximates the high bit depth decoding.
  • United States Patent Application Publication US 2002/0154693 Al disclosed a method for improving coding accuracy and efficiency by performing all intermediate calculations with greater precision. Said published application is hereby incorporated by reference in its entirety. In general, reasonable and common approximations at a lower bit depth can become unacceptable when compared to calculations at a higher bit depth.
  • An aspect of the present invention is directed to a method for improving the rounding in such intermediate calculations in order to minimize the error when decoding a bitstream at a lower bit depth than the input to the encoder.
  • the present invention is directed to the reduction or minimization of the errors resulting from decoding at a lower bit depth a video bitstream that was encoded at a higher bit depth compared to decoding such a bitstream at the higher bit depth.
  • a major, if not the dominant, contribution to such errors is the simple, but biased, rounding that is used in prior art compression schemes.
  • unbiased rounding methods in the decoder or, as may be appropriate, in both the decoder and the encoder, are employed to improve the overall accuracy resulting from decoding at lower bit depths than the bit depth of the encoder.
  • Such results may be demonstrated by the reduction or minimization of the error between the decoded results at a bit depth that is the same as the bit depth of the encoder and at a lower bit depth.
  • Other aspects of the invention may be appreciated as this document is read and understood.
  • FIG. 1 is a schematic functional block diagram of an H.264 or H.264 FRExt video encoder.
  • FIG. 2 is a schematic functional block diagram of an H.264 or H.264 FRExt video decoder.
  • FIG. 3 is a schematic functional block diagram of an arrangement for comparing the quality of the outputs of two decoders.
  • FIG. 4 is a schematic functional block diagram of the prediction loop in an encoder and a decoder, identifying the places where rounding occurs.
  • FIG. 5 is a schematic functional block diagram of a motion compensation feedback loop (the deblocking filter and adder for the coded residual shown in FIG. 4 have been removed for simplicity).
  • FIG. 6 is a graphical representation showing the number of cumulative errors (vertical scale) versus video frame number (horizontal scale) for the case of a conventional decoder operating at a lower bit depth than the bit depth of the encoder with respect to a reference decoder (a decoder operating at the bit depth of the encoder).
  • FIG. 7 is a graphical representation showing the number of cumulative errors (vertical scale) versus video frame number (horizontal scale) for the case of a conventional decoder employing unbiased rounding operating at a lower bit depth than the bit depth of the encoder with respect to a reference decoder (a decoder operating at the bit depth of the encoder).
  • FIG. 8 is a representation of pixels in consecutive video lines, showing the pixels (unshaded) that may be used to predict another pixel (shaded).
  • FIG. 9 is a schematic functional block diagram showing a prior art MPEG-2 encoder (FIG. 9a) and decoder (FIG. 9b).
  • FIG. 10 is a schematic functional block diagram of a modified MPEG-2 encoder (FIG. 10a) and decoder (FIG. 10b).
  • FIG. 11 is a comparison of 8-bit and 10-bit versions of the input, residual, transformed residual, and quantized transformed residual in MPEG-2 type devices.
  • aspects of the present invention propose the use of unbiased rounding in the decoder, or, as may be appropriate, in both the encoder and decoder, for video compression, particularly for inter- and intra-prediction, where the error tends to accumulate in the prediction loop.
  • the error variance, 3/32, is close to the variance for the continuous case, 1/12. Because the error mean is non-zero, this is called "biased rounding." There is little that can be done to reduce the error variance, as a non-zero error variance is unavoidable with rounding. However, there are known solutions for reducing the mean error to zero. When the fraction is exactly 1/2, all of these solutions round up half the time and round down half the time. The decision to round up or down can be made in a number of ways, both deterministically and randomly. For example:
  • Patent 5,930,159 by Wong entitled "Right-Shifting an Integer Operand and Rounding a Fractional Intermediate Result to Obtain a Rounded Integer Result" describes what it characterizes as "unbiased" methods for "rounding" towards zero or towards infinity as described in the MPEG-1 and MPEG-2 standards.
  • the methods Wong describes are more appropriately viewed as truncation methods rather than rounding.
  • they are unbiased only for an equal mix of positive and negative values; they are highly biased (as all truncation methods are) for non-negative values.
  • Unbiased rounding, as used herein, is unbiased for positive and negative values separately and not just in combination.
  • the magnitude of the error introduced by biased rounding depends on the number of fractional bits, M.
  • M is 2 and so the case that causes the bias occurs 25% of the time. If M is 1, this case occurs 50% of the time and so the mean error is twice as large. Analogously, if M is 3, this case occurs 12.5% of the time and so the mean error is half as much.
  • the mean error for biased rounding is
  • FIG. 5 shows the essential components of such a motion compensation feedback loop (the deblocking filter and adder for the coded residual shown in FIG. 4 have been removed for simplicity).
  • the frame store in FIG. 5 is initialized by some initial image. In common practice, this initial image corresponds to an intra-macroblock or intra-frame picture.
  • the motion compensation filter interpolates a portion of the frame store displaced by the integer portion of a motion vector. This filter has the overall linear form shown in equations (4) and (5).
  • the filter coefficients themselves are generally a windowed sinc function with a phase determined by the fractional portion of the motion vector, and (x',y') is determined by the integer portion of the motion vector. Round-off error is unavoidable given the fractional coefficients c(i,j) or their integer version C(i,j). Only in the case that c(i,j) were an integer would there be no round-off error.
  • the error variance adds incoherently from iteration to iteration, but the mean error adds coherently so that the mean error eventually dominates the total mean-squared error (MSE) in the frame store.
  • MSE mean-squared error
  • Table 4 tabulates the relative contributions of the mean error and variance error to the overall MSE from iteration to iteration. Each iteration corresponds to the next P-frame or P-macroblock, i.e., one that is predicted from a previous frame or macroblock. When B-frames are used as reference frames, they also constitute an iteration. At the Kth iteration the cumulative mean error is
  • FIG. 6 and FIG. 7 show the growth of the MSE or drift error with biased rounding as in the prior art and unbiased rounding in accordance with the present invention, respectively, for decoding at 8 bits a bitstream encoded from a 10-bit source using the modified version of MPEG-2 shown in FIG. 10(a).
  • FIG. 8 shows the blocks (in white) that can influence the intra-predicted values for a given block (in black) in the H.264 and H.264 FRExt systems. Because these predictions can take place on blocks as small as 4x4 pixels, the error propagation for intra-prediction can occur over and over many times. For example, at the HDTV resolution of 1080x1920, there can be hundreds of iterations in both the horizontal and vertical directions. By comparison, the error propagation for inter-prediction shown in FIG. 6 and FIG. 7 was only for 16 iterations, and Table 4 only went up to 32 iterations.
  • FIGS. 9a and 9b show prior art implementations of an MPEG-2 encoder (a) and decoder (b).
  • profiles video data having an input precision, or bit depth, of 8 bits is applied. This input precision subsequently determines the minimum precision of various internal variables used in compression.
  • input video with a precision, or bit depth, of 8 bits is applied to a subtractor("-").
  • the integer output of the subtractor also has 8 bits of precision, but since it can be negative, it requires a sign bit for a total of 9 bits which is shown as "s8" (signed 8).
  • the difference output of the subtractor is called the "residual.”
  • This integer output is then applied to a 2-D DCT whose output requires three additional bits or 12 bits in a signed 11 bit ("s11") format.
  • These 12 bits are quantized and then entropy (variable length coding) ("VLC") coded with other parameters to produce an encoded bitstream.
  • VLC variable length coding
  • the quantized, transformed coefficients are also inverse quantized (“IQ”), inverse transformed (“IDCT”), and added (with saturation) to the same prediction used in the original subtraction. Note that this portion of the encoder mimics the decoder shown in FIG. 9b.
  • VLC entropy coding
  • VLD decoding
  • Quantized data 12 bits (signed)
  • Quantized data 12 bits (signed) Those portions of the encoder and decoder that are altered are enclosed by a dotted line in each of FIGS. 10a and 10b.
  • the quantization and inverse quantization are altered so that the scale of the quantized values does not change. Since the internal variables in the 10-bit encoder have two extra bits of precision, this change is an additional right shift of 2, or a division by 4, for quantization and an additional left shift of 2, or a multiplication by 4, for dequantization. Since 8-bit quantization is simply a division by the quantization scale, QS, the equivalent 10-bit quantization is simply a division by four times the quantization scale, or 4*QS. Similarly, since inverse quantization at 8 bits is basically a multiplication by the quantization scale QS, at 10 bits we simply multiply by four times the quantization scale.
  • Unbiased rounding has a significant effect on the error between high and low bit depth decoding of the same bitstream. Biased rounding creates both a mean and variance error.
  • the mean error is coherent, grows rapidly (MSE growth is quadratic in K as shown by equations (12) and (13)) from prediction to prediction, and is quite visible.
  • the variance error grows more slowly (MSE growth is linear) and is much less visible because it is random and has lower amplitude.
  • Unbiased rounding is more accurate when rounding is required.
  • unbiased rounding may be applied to calculations in the prediction loop, particularly inter- and intra-prediction.
  • the invention may be implemented in hardware or software, or a combination of both (e.g., programmable logic arrays). Unless otherwise specified, the algorithms included as part of the invention are not inherently related to any particular computer or other apparatus. In particular, various general-purpose machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct more specialized apparatus (e.g., integrated circuits) to perform the required method steps. Thus, the invention may be implemented in one or more computer programs executing on one or more programmable computer systems each comprising at least one processor, at least one data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device or port, and at least one output device or port. Program code is applied to input data to perform the functions described herein and generate output information. The output information is applied to one or more output devices, in known fashion.
  • Each such program may be implemented in any desired computer language (including machine, assembly, or high level procedural, logical, or object oriented programming languages) to communicate with a computer system.
  • the language may be a compiled or interpreted language.
  • Each such computer program is preferably stored on or downloaded to a storage media or device (e.g., solid state memory or media, or magnetic or optical media) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage media or device is read by the computer system to perform the procedures described herein.
  • the inventive system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer system to operate in a specific and predefined manner to perform the functions described herein.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Unbiased rounding of unsigned data is employed in the decoding or the encoding and decoding of digital bitstreams representing data-compressed video when the video is encoded at a first bit depth and is decoded at a second bit depth, lower than the first bit depth. The unbiased rounding may be employed in processing that employs a prediction loop. When the data-compressed video is represented in frames, the unbiased rounding may be of inter-frame and/or intra-frame data.

Description

Unbiased Rounding for Video Compression
Technical Field
This invention relates to digital methods for compressing moving images, and, in particular, to more accurate methods of rounding for compression techniques that utilize inter- or intra-prediction to increase compression efficiency. The invention includes not only methods but also corresponding computer program implementations and apparatus implementations.
Background Art
A digital representation of video images consists of spatial samples of image intensity and/or color quantized to some particular bit depth. The dominant value for this bit depth is 8 bits, which provides reasonable image quality and allows each sample to fit exactly into a single byte of digital memory. However, there is an increasing demand for systems that operate at higher bit depths, such as 10 and 12 bits per sample, as evidenced by the MPEG-4 Studio and N-bit profiles and the Fidelity Range Extensions to H.264 (see citations below).
Greater bit depths allow higher fidelity, or lower error, in the overall compression. The most common measure of error is the mean-squared error criterion, or MSE. The MSE between a test image whose spatial samples are $test_{x,y}$ and a reference image whose spatial samples are $ref_{x,y}$ is
$$\mathrm{MSE} = \frac{1}{(NX)(NY)} \sum_{x=1}^{NX} \sum_{y=1}^{NY} \left(test_{x,y} - ref_{x,y}\right)^2 \qquad (1)$$
where NX and NY are the number of samples in the x- and y-directions. When the reference image is the input image and the test image is the compressed image, the MSE is called the distortion. In this case, the spatial samples of both these images are digital values. The fidelity of a compressed image is measured by this distortion or MSE, normalized to the maximum possible (peak) amplitude and measured in logarithmic units. In short, the distortion PSNR (Peak Signal-to-Noise Ratio) in dB is
$$\mathrm{PSNR} = 10 \log_{10}\!\left(peak^2 / \mathrm{MSE}\right) \qquad (2)$$
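Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function names `mse` and `psnr_db`, the list-of-rows image layout, and the default 8-bit depth are assumptions made for the example, and the peak is taken as 2^N - 1 for N-bit samples.

```python
import math

def mse(test, ref):
    """Mean-squared error between two equally sized images (lists of rows), eq. (1)."""
    ny = len(ref)
    nx = len(ref[0])
    total = sum((t - r) ** 2
                for test_row, ref_row in zip(test, ref)
                for t, r in zip(test_row, ref_row))
    return total / (nx * ny)

def psnr_db(test, ref, bit_depth=8):
    """PSNR in dB, eq. (2), with peak = 2^bit_depth - 1 (assumed unsigned samples)."""
    peak = (1 << bit_depth) - 1
    err = mse(test, ref)
    if err == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(peak * peak / err)

# Tiny hypothetical 2x2 example.
reference = [[0, 255], [128, 64]]
decoded = [[1, 254], [128, 66]]
print(mse(decoded, reference), psnr_db(decoded, reference))
```

Returning infinity for identical images is a design choice of this sketch; the text does not address the zero-distortion case.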
Greater bit depths permit higher values for PSNR. One can use the generality of the MSE criterion to show this. Consider the quantization of an analog input to N bits. Here the MSE is computed between an analog input and its digital approximation. The quantization error for N-bit sampling is commonly modeled as independent, uniformly distributed random noise over the interval [-1/2, 1/2] so that the MSE is 1/12 with respect to the least significant bit. Since the input samples are integers in the range [0, 2^N - 1], the peak value is 2^N - 1. Thus the PSNR corresponding to this MSE is
$$\mathrm{PSNR} = 10 \log_{10}\!\left((2^N - 1)^2 / (1/12)\right) \qquad (3)$$
Since this represents the error between the analog samples of the original image and its quantized representation, it is an upper bound for the fidelity of the compressed result compared to the original analog image. Table 1 shows this upper bound for some representative bit depths:
Table 1: Maximum PSNR as a function of bit depth (table image not reproduced)
FIG. 1 and FIG. 2 show block diagrams for an H.264 encoder and decoder, respectively. H.264, also known as MPEG-4/AVC, is considered the state-of-the-art in modern video coding. Of particular relevance here is a set of extensions currently being developed for H.264 known collectively as the "Fidelity Range Extensions."
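The upper bound of equation (3), which Table 1 tabulates, can be evaluated with a short sketch. The function name `max_psnr_db` is hypothetical, and the bit depths 8, 10 and 12 are chosen here only because the text mentions them; they are not necessarily the rows of Table 1.

```python
import math

def max_psnr_db(bit_depth):
    """Upper bound on PSNR from eq. (3): peak = 2^N - 1, quantization MSE = 1/12."""
    peak = (1 << bit_depth) - 1
    return 10.0 * math.log10(peak ** 2 / (1.0 / 12.0))

# Assumed representative bit depths.
for n in (8, 10, 12):
    print(f"{n:2d} bits: {max_psnr_db(n):.2f} dB")
```

Because the peak roughly quadruples for every two additional bits, the bound rises by about 12 dB per two bits of depth.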
Aspects of the present invention may be used with particular advantage in "H.264 FRExt" coding environments. Details of H.264 coding are set forth in "Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC)," Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6), 8th Meeting: Geneva, Switzerland, 23-27 May, 2003. Details of the "Fidelity Range Extensions" to the basic H.264 specifications (hence "H.264 FRExt") are set forth in "Draft Text of H.264/AVC Fidelity Range Extensions Amendment," Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6), 11th Meeting: Munich, DE, 15-19 March, 2004. Both of the just-identified documents are hereby incorporated by reference in their entireties. The "Fidelity Range Extensions" will support higher-fidelity video coding by supporting increased sample accuracy, including 10-bit and 12-bit coding. Aspects of the present invention are particularly useful in connection with the implementation of such increased sample accuracy. Further details regarding the H.264 standard and its implementation may be found in various published literature, including, for example, "The emerging H.264/AVC standard," by Ralf Schafer et al, EBU Technical Review, January 2003 (12 pages) and "H.264/MPEG-4 Part 10 White Paper: Overview of H.264," by Iain E G Richardson, 07/10/02, published at www.vcodex.com. Said Schafer et al and Richardson publications are also incorporated by reference herein in their entirety. Aspects of the present invention may also be used with advantage in connection with modified MPEG-2 coding environments, as is explained further below.
An H.264 or H.264 FRExt encoder (they are the same at a block diagram level) shown in FIG. 1 has elements now common in video coders: transform and quantization processes, entropy (lossless) coding, motion estimation (ME) and motion compensation (MC), and a buffer to store reconstructed frames. H.264 and H.264 FRExt differ from previous codecs in a number of ways: an in-loop deblocking filter, several modes for intra-prediction, a new integer transform, two modes of entropy coding (variable length coding and arithmetic coding), motion block sizes down to 4x4 pixels, and so on.
Except for the entropy decode step, the H.264 or H.264 FRExt decoder shown in FIG. 2 can be readily seen as a subset of the encoder. The Fidelity Range Extensions (FRExt) to H.264 provide tools for encoding and decoding at sample bit depths up to 12 bits per sample. This is the first video codec to incorporate tools for encoding and decoding at bit depths greater than 8 bits per sample in a unified way. In particular, the quantization method adopted in the Fidelity Range Extensions to H.264 produces a compressed bit stream that is potentially compatible among different sample bit depths as described in copending United States provisional patent application S.N. 60/573,017 of Walter C. Gish and Christopher J. Vogt, filed May 19, 2004, entitled "Quantization Control for Variable Bit Depth" and in the United States non-provisional patent application S.N. 11/128,125, filed May 11, 2005, of the same inventors and bearing the same title, which non-provisional application claims priority of said S.N. 60/573,017 provisional application. Both said provisional and non-provisional applications of Gish and Vogt are hereby incorporated by reference in their entirety. The techniques of said provisional and non-provisional patent applications facilitate the interoperability of encoders and decoders operating at different bit depths, particularly the case of a decoder operating at a lower bit depth than the bit depth of an encoder.
Some details of the techniques disclosed in said non-provisional and provisional applications of Gish and Vogt are published in a document that describes the quantization method adopted in the Fidelity Range Extensions to H.264: "Extended Sample Depth: Implementation and Characterization," Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6), Document JVT-H016, 8th Meeting: Geneva, Switzerland, 23-27 May, 2003, published on the world wide web at http://ftp3.itu.ch/av-arch/jvt-site/2003_05_Geneva/JVT-H016.doc. Said JVT-H016 document is also hereby incorporated by reference in its entirety.
A goal of the present invention is to be able to decode a bitstream encoded at a high bit depth from a high bit depth input not only at that same high bit depth, but, alternatively, at a lower bit depth that provides decoded images bearing a reasonable approximation to the original high bit depth images. This would, for example, enable an 8-bit or 10-bit H.264 FRExt decoder to reasonably decode bitstreams that would conventionally require, respectively, a 10-bit or 12-bit H.264 FRExt decoder. Alternatively, this would enable a conventional 8-bit MPEG-2 decoder (as in FIG. 9 described below) to reasonably decode bitstreams produced by a modified MPEG-2 encoder such as described below in connection with FIG. 10a, which decoding would otherwise require the modified MPEG-2 decoder such as described below in connection with FIG. 10b. FIG. 3 shows that when a single bitstream encoded from a high bit depth source is decoded at the original high bit depth and at a lower bit depth, the lower bit depth decoding has some error, measured as MSE, with respect to the high bit depth reference. In the example of FIG. 3, the lower bit depth approximation is decoded as if the encoder bit depth were low, that is, it is a conventional decoder (see FIG. 6 below) or a conventional decoder employing the unbiased rounding aspects of the present invention (see FIG. 7 below).
While one would expect the decoded results at different bit depths to differ somewhat due to rounding error, the actual differences observed with prior art encoders and decoders tend to be much larger. Such large differences occur because the rounding errors will accumulate from prediction to prediction in a manner that is exacerbated by the way rounding is currently done. FIG. 4 shows a simplified diagram of the prediction loop that exists in both the encoder and decoder identifying the places where rounding occurs: calculating the prediction (intra and inter), the deblocking filter, and the residual decoding. One can see how errors will accumulate from prediction to prediction in the feedback loop formed by the Frame Store, Prediction, the adder, and the Deblocking Filter. As explained further below, the dominant sources of error are inter- and intra-prediction. The loop deblocking filter is optional and, along with the rounding in decoding the residual, will introduce smaller errors. The problem then is to minimize these errors so that the MSE between the high bit depth output and the lower bit depth approximation is minimized. The high bit depth decoding output is error free with respect to the encoder since they both have the same high bit depth prediction loop. Therefore, a reduction in the MSE between it and the lower bit depth approximation indicates that the lower bit depth decoding more closely approximates the high bit depth decoding.
For the case of inter-prediction, rounded results from one frame are used to predict the image in another frame. Consequently, the error grows over successive frames because the feedback loop comprised of the frame store (buffer) and the prediction from the motion compensation filter accumulates errors. The result is that the MSE between the decoded frames of different bit depths shown in FIG. 3 increases at each predicted frame or macroblock. In the prior art such error that accumulates from frame to frame was first encountered in dealing with the allowable mismatch between IDCTs in MPEG-2. Because the error would grow from frame to frame it was called "drift." The intra-prediction modes in H.264 behave similarly; only in this case the rounded results for pixels are used to predict other neighboring pixels in the same frame. Both intra- and inter-prediction are identical in that the error accumulates from prediction to prediction and the form of the prediction calculations is the same. In both cases, the prediction is the rounded sum of integer values from the frame store weighted by fractional coefficients whose sum is 1. That is, the predicted value pred(x,y) is

$$\mathrm{pred}(x,y) = \sum_{i,j} c(i,j)\,FS(x',y') + \tfrac{1}{2}, \qquad \sum_{i,j} c(i,j) = 1 \qquad (4)$$

where FS(x',y') are Frame Store values and c(i,j) are the weighting coefficients. The relationship between (x,y), (x',y'), and (i,j) and the values for c(i,j) depend on the type of predictor: inter or a particular intra mode. Because the coefficients c(i,j) are fractional values, this calculation is typically performed using integer coefficients C(i,j) that sum to a power of two with a final right-shift to truncate the result to the final bit depth.

$$\mathrm{pred}(x,y) = \Bigl(\sum_{i,j} C(i,j)\,FS(x',y') + 2^{M-1}\Bigr) \gg M, \qquad \sum_{i,j} C(i,j) = 2^{M} \qquad (5)$$

In this form, the number of fractional bits rounded away is M, so that the added 1/2 for rounding is scaled to $2^{M-1}$. This form is important not just because it is the most common form actually used, but because the value of M determines the severity of the rounding error (see equation (9)).
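The integer form of the prediction described above can be sketched as follows. This is an illustrative sketch only; the function name `predict_biased` and its argument layout (flat lists of frame store values and integer coefficients) are assumptions made for the example and are not taken from any codec.

```python
def predict_biased(frame_store_values, int_coeffs, m):
    """Weighted sum with integer coefficients, add 2**(m-1), right-shift by m.

    The coefficients are required to sum to 2**m, so the shift removes
    exactly the m fractional bits introduced by the integer scaling.
    """
    if sum(int_coeffs) != 1 << m:
        raise ValueError("integer coefficients must sum to 2**m")
    acc = sum(c * v for c, v in zip(int_coeffs, frame_store_values))
    return (acc + (1 << (m - 1))) >> m
```

With two equal weights and M = 1, `predict_biased([10, 11], [1, 1], 1)` averages 10 and 11; the exact value 10.5 falls on the 1/2 case and is rounded up to 11.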
It is desirable that systems using different sample bit depths are as interoperable as possible. That is, one would like to be able to decode reasonably a bitstream regardless of the bit depth of the encoder or decoder. When the decoder has a bit depth equal to or larger than the bit depth of the input, it is trivial to mimic a decoder with the same bit depth as the encoder. When the decoder has a bit depth less than the encoder, there must be some loss, but the decoded results should have a PSNR appropriate for that lower bit depth, and, desirably, not less. Achieving interoperability between different bit depths requires careful attention to arithmetic details. United States Patent Application Publication US 2002/0154693 A1 disclosed a method for improving coding accuracy and efficiency by performing all intermediate calculations with greater precision. Said published application is hereby incorporated by reference in its entirety. In general, reasonable and common approximations at a lower bit depth can become unacceptable when compared to calculations at a higher bit depth. An aspect of the present invention is directed to a method for improving the rounding in such intermediate calculations in order to minimize the error when decoding a bitstream at a lower bit depth than the input to the encoder.
Disclosure of the Invention
In one aspect, the present invention is directed to the reduction or minimization of the errors resulting from decoding at a lower bit depth a video bitstream that was encoded at a higher bit depth compared to decoding such a bitstream at the higher bit depth. In particular, it is shown that a major, if not the dominant, contribution to such errors is the simple, but biased, rounding that is used in prior art compression schemes. In accordance with an aspect of the present invention, unbiased rounding methods in the decoder, or, as may be appropriate, in both the decoder and the encoder, are employed to improve the overall accuracy resulting from decoding at lower bit depths than the bit depth of the encoder. Such results may be demonstrated by the reduction or minimization of the error between the decoded results at a bit depth that is the same as the bit depth of the encoder and at a lower bit depth. Other aspects of the invention may be appreciated as this document is read and understood.
Description of the Drawings
FIG. 1 is a schematic functional block diagram of an H.264 or H.264 FRExt video encoder.
FIG. 2 is a schematic functional block diagram of an H.264 or H.264 FRExt video decoder.
FIG. 3 is a schematic functional block diagram of an arrangement for comparing the quality of the outputs of two decoders.
FIG. 4 is a schematic functional block diagram of the prediction loop in an encoder and a decoder, identifying the places where rounding occurs.
FIG. 5 is a schematic functional block diagram of a motion compensation feedback loop (the deblocking filter and adder for the coded residual shown in FIG. 4 have been removed for simplicity).
FIG. 6 is a graphical representation showing the number of cumulative errors (vertical scale) versus video frame number (horizontal scale) for the case of a conventional decoder operating at a lower bit depth than the bit depth of the encoder with respect to a reference decoder (a decoder operating at the bit depth of the encoder).
FIG. 7 is a graphical representation showing the number of cumulative errors (vertical scale) versus video frame number (horizontal scale) for the case of a conventional decoder employing unbiased rounding operating at a lower bit depth than the bit depth of the encoder with respect to a reference decoder (a decoder operating at the bit depth of the encoder).
FIG. 8 is a representation of pixels in consecutive video lines, showing the pixels (unshaded) that may be used to predict another pixel (shaded).
FIG. 9 is a schematic functional block diagram showing a prior art MPEG-2 encoder (FIG. 9a) and decoder (FIG. 9b).
FIG. 10 is a schematic functional block diagram of a modified MPEG-2 encoder (FIG. 10a) and decoder (FIG. 10b).
FIG. 11 is a comparison of 8-bit and 10-bit versions of the input, residual, transformed residual, and quantized transformed residual in MPEG-2 type devices.
Best Mode For Carrying Out the Invention
Fundamentals of Biased and Unbiased Rounding
Aspects of the present invention propose the use of unbiased rounding in the decoder, or, as may be appropriate, in both the encoder and decoder, for video compression, particularly for inter- and intra-prediction, where the error tends to accumulate in the prediction loop. Thus, one may begin with an analysis of rounding methods and the errors they introduce. In particular, the mean and variance of the error caused by rounding are of interest.
Because the calculations in video compression are typically performed with integers of different precision, the rounding of integers is of particular interest.
The most commonly employed rounding method adds 1/2 and then truncates the result. That is, given a (N+M)-bit value s where the binary point is between the N and M-bit portions, a rounded N-bit value r is given by

$$r = s + \tfrac{1}{2} \qquad (6)$$

where the equal sign implies truncation. Let's suppose that M is 2. In this case there are four possibilities for the M fractional bits in s:
[Table image not reproduced in this text]
Table 2 Biased rounding
That is, for .00 and .01, one rounds down and, for .10 and .11, one rounds up. The problem occurs for the 1/2 value for the fractional bits in s, which in this example is the .10 case. It is known (for example, in the field of numerical analysis) that rounding the 1/2 value requires special treatment. That is, although the .01 and .11 cases balance each other, there is nothing to balance the .10 case. This imbalance causes the mean error to be non-zero.
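Because each of these four cases is equally probable, the error mean and variance are

$$m = \tfrac{1}{4}\Bigl(0 - \tfrac{1}{4} + \tfrac{1}{2} + \tfrac{1}{4}\Bigr) = \tfrac{1}{8}$$

$$\sigma^2 = \tfrac{1}{4}\Bigl(0 + \tfrac{1}{16} + \tfrac{1}{4} + \tfrac{1}{16}\Bigr) = \tfrac{3}{32} \qquad (7)$$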
The error variance, 3/32, is close to the variance for the continuous case, 1/12. Because the error mean is non-zero, this is called "biased rounding." There is little that can be done to reduce the error variance, as a non-zero error variance is unavoidable with rounding. However, there are known solutions for reducing the mean error to zero. When the fraction is exactly 1/2, all of these solutions round up half the time and round down half the time. The decision to round up or down can be made in a number of ways, both deterministically and randomly. For example:
(a) Round to even: if the integer portion of s is odd, round r up; otherwise, round down
(b) Alternate: a one-bit counter is incremented at each rounding; if the counter is 1, round up; otherwise, round down
(c) Random: pick a random number in [0,1]; if this number is greater than 1/2, round up; otherwise, round down
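The three options listed above can be sketched in Python for a non-negative fixed-point integer s with M fractional bits. The names `_split`, `round_to_even`, `AlternatingRounder`, and `round_random` are hypothetical and chosen for this illustration; the sketch assumes unsigned inputs and M of at least 1.

```python
import random

def _split(s, m):
    """Return (integer part, fractional bits, code of the exact-1/2 fraction)."""
    return s >> m, s & ((1 << m) - 1), 1 << (m - 1)

def round_to_even(s, m):
    """(a) On an exact 1/2, round up only if the integer portion is odd."""
    q, frac, half = _split(s, m)
    if frac > half or (frac == half and q & 1):
        q += 1
    return q

class AlternatingRounder:
    """(b) A one-bit counter toggles at each rounding; 1 means round 1/2 up."""
    def __init__(self):
        self.counter = 0

    def round(self, s, m):
        q, frac, half = _split(s, m)
        self.counter ^= 1
        if frac > half or (frac == half and self.counter == 1):
            q += 1
        return q

def round_random(s, m, rng=random):
    """(c) On an exact 1/2, round up when a random draw exceeds 1/2."""
    q, frac, half = _split(s, m)
    if frac > half or (frac == half and rng.random() > 0.5):
        q += 1
    return q
```

All three behave identically to add-1/2-and-truncate except when the fraction is exactly 1/2. Round to even is stateless and deterministic, which may make it the easiest of the three to keep in step between an encoder and a decoder.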
With these methods, the possible outcomes shown in Table 2 become:
[Table image not reproduced in this text]
Table 3 Unbiased rounding
So that the mean error and variance are
[Equation image not reproduced in this text]
Since this reduces the mean error to zero, it is called unbiased rounding.
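The error statistics for M = 2 can be checked by enumerating every case. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, round to even stands in for the unbiased methods, and the inputs cover integer portions 0 and 1 so that both parities occur equally often. It reports the mean error and the mean of the squared error; the 3/32 figure quoted in the text corresponds to the latter.

```python
from fractions import Fraction

M = 2                      # number of fractional bits rounded away
HALF = 1 << (M - 1)        # fixed-point code of the exact 1/2 fraction

def biased(s):
    """Add 1/2 and truncate."""
    return (s + HALF) >> M

def to_even(s):
    """Round to even: an exact 1/2 goes up only when the integer part is odd."""
    q, frac = s >> M, s & ((1 << M) - 1)
    return q + 1 if frac > HALF or (frac == HALF and q & 1) else q

def error_stats(rounder, values):
    """Exact mean error and mean squared error over equally likely inputs."""
    errs = [Fraction(rounder(s)) - Fraction(s, 1 << M) for s in values]
    mean = sum(errs) / len(errs)
    mean_sq = sum(e * e for e in errs) / len(errs)
    return mean, mean_sq

values = range(8)          # integer parts 0 and 1, all four fractional patterns
biased_stats = error_stats(biased, values)
even_stats = error_stats(to_even, values)
```

Under these assumptions `biased_stats` is (1/8, 3/32) and `even_stats` is (0, 3/32): the mean error is removed while the squared-error term is unchanged.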
While this is generally how the term unbiased rounding is used, there are known examples where the term is used differently. By unbiased rounding is meant rounding with special attention to the 1/2 value for the fractional portion so that it is rounded up and down with equal frequency. An example of prior art that uses the term unbiased rounding in the same way is published U.S. Patent Application 2003/0055860 A1 by Giacalone et al. entitled "Rounding Mechanisms in Processors". This application describes circuitry for the implementation of the "round to even" form of unbiased rounding when rounding 32-bit integers to 16 bits. On the other hand, U.S. Patent 5,930,159 by Wong entitled "Right-Shifting an Integer Operand and Rounding a Fractional Intermediate Result to Obtain a Rounded Integer Result" describes what it characterizes as "unbiased" methods for "rounding" towards zero or towards infinity as described in the MPEG-1 and MPEG-2 standards. However, the methods Wong describes are more appropriately viewed as truncation methods rather than rounding. Furthermore, they are unbiased only for an equal mix of positive and negative values; they are highly biased (as all truncation methods are) for non-negative values. Unbiased rounding, as used herein, is unbiased for positive and negative values separately and not just in combination.

The magnitude of the error introduced by biased rounding depends on the number of fractional bits, M. In the example presented above, M is 2 and so the case that causes the bias occurs 25% of the time. If M is 1, this case occurs 50% of the time and so the mean error is twice as large. Analogously, if M is 3, this case occurs 12.5% of the time and so the mean error is half as much. Thus, in general, the mean error for biased rounding is
$$m = \frac{1}{2^{M+1}} \qquad (9)$$
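The dependence of the mean error on M can be derived in a few lines. The following is a sketch rather than a derivation taken from the source; the symbols f (the value of the M fractional bits, assumed equally likely) and e(f) (the rounding error for that fraction) are introduced here for illustration.

```latex
\begin{aligned}
e(f) &=
  \begin{cases}
    -\,f/2^{M},     & 0 \le f < 2^{M-1} \quad \text{(rounded down)}\\[2pt]
    1 - f/2^{M},    & 2^{M-1} \le f < 2^{M} \quad \text{(rounded up)}
  \end{cases}\\[4pt]
e(f) + e(2^{M}-f) &= -\frac{f}{2^{M}} + \frac{f}{2^{M}} = 0
  \qquad \text{for } 1 \le f < 2^{M-1}\\[4pt]
m &= \frac{1}{2^{M}} \sum_{f=0}^{2^{M}-1} e(f)
   = \frac{1}{2^{M}}\, e\!\left(2^{M-1}\right)
   = \frac{1}{2^{M}} \cdot \frac{1}{2}
   = \frac{1}{2^{M+1}}
\end{aligned}
```

Every fraction other than 0 and exactly 1/2 is cancelled by its mirror image, so only the 1/2 case contributes. This gives 1/4, 1/8, and 1/16 for M = 1, 2, and 3, matching the doubling and halving described above.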
This result is somewhat counter-intuitive in that it shows that the mean error introduced by biased rounding is larger for less (i.e., smaller M) rounding.

For the tests whose results are shown in FIG. 6 and FIG. 7, 10-bit per sample video is encoded at 10 bits using a modified MPEG-2 encoder as described in connection with FIG. 10a and then decoded in three ways: (1) a 10-bit decoding using a modified MPEG-2 decoder, as described in connection with FIG. 10b (this decoding is used as a reference for the two 8-bit decodings next described, in the manner of the FIG. 3 test arrangement), (2) an 8-bit decoding using a conventional MPEG-2 8-bit decoder, as described in connection with FIG. 9b, and (3) an 8-bit decoding using an otherwise-conventional MPEG-2 8-bit decoder (as in FIG. 9b) but which is modified to employ unbiased rounding in accordance with aspects of the present invention. The MSE for the 8-bit decoder without unbiased rounding and for the 8-bit decoder with unbiased rounding are each computed with reference to the 10-bit decoding in the manner shown in FIG. 3. To bound the overall drift MSE, an I-frame is inserted by the modified MPEG-2 encoder every 48 frames. Comparing FIGS. 6 and 7 shows that unbiased rounding reduces the MSE by about a factor of four (75% reduction). Furthermore, the slightly quadratic growth in MSE (i.e., a positive second derivative) of FIG. 6 is replaced in FIG. 7 with a growth rate that is linear or even sub-linear. This is entirely due to using unbiased rounding to reduce to zero the mean error, the dominant (i.e., quadratic) term in equations (12) and (13).
Effect of Unbiased Rounding on Inter-Prediction (Motion Compensation)
In general, unbiased rounding is superior to biased rounding because the mean error is reduced to zero while the variance remains unchanged. We will show that the effects of biased rounding are particularly detrimental in motion compensation because the feedback loop causes error to accumulate. FIG. 5 shows the essential components of such a motion compensation feedback loop (the deblocking filter and adder for the coded residual shown in FIG. 4 have been removed for simplicity). The frame store in FIG. 5 is initialized by some initial image. In common practice, this initial image corresponds to an intra-macroblock or intra-frame picture. The motion compensation filter interpolates a portion of the frame store displaced by the integer portion of a motion vector. This filter has the overall linear form shown in equations (4) and (5). The filter coefficients themselves are generally a windowed sinc function with a phase determined by the fractional portion of the motion vector, and (x',y') is determined by the integer portion of the motion vector. Round-off error is unavoidable given the fractional coefficients c(i,j) or their integer version C(i,j). Only if every c(i,j) were an integer would there be no round-off error.
Because of the feedback loop in FIG. 5, the error variance adds incoherently from iteration to iteration, but the mean error adds coherently so that the mean error eventually dominates the total mean-squared error (MSE) in the frame store. Table 4 (below) tabulates the relative contributions of the mean error and variance error to the overall MSE from iteration to iteration. Each iteration corresponds to the next P-frame or P-macroblock, i.e., one that is predicted from a previous frame or macroblock. When B-frames are used as reference frames, they also constitute an iteration. At the Kth iteration the cumulative mean error is

$$m_K = K\left(\tfrac{1}{8}\right) \qquad (10)$$

the cumulative variance error is

$$\sigma_K^2 = K\left(\tfrac{3}{32}\right) \qquad (11)$$

and the resulting MSE is given by the well-known formula

$$\mathrm{MSE} = m^2 + \sigma^2 \qquad (12)$$

which, for the case of M=2 (two bits of rounding), exemplified by equations (10) and (11), becomes

$$\mathrm{MSE} = \tfrac{1}{64}K^2 + \tfrac{3}{32}K \qquad (13)$$

These equations show biased rounding is the asymptotically dominant (i.e., quadratic in K) contributor to the overall MSE.
[Table image not reproduced in this text]
Table 4 Error growth in the prediction loop
Examining Table 4, one can see that initially the contribution from the mean error is 1/6 the contribution from the variance error. However, they are equal at the sixth iteration, and by the 32nd iteration the mean error is over 5 times the variance error.
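The crossover described above can be tabulated directly from the per-iteration mean of 1/8 and variance of 3/32 used in the text for M = 2. This is a small sketch; the function name `error_growth` and the choice of iterations printed are assumptions for illustration and do not reproduce the rows of Table 4.

```python
from fractions import Fraction

MEAN_PER_STEP = Fraction(1, 8)    # mean error per iteration for M = 2
VAR_PER_STEP = Fraction(3, 32)    # variance per iteration for M = 2

def error_growth(k):
    """Return (mean-error term, variance term, total MSE) after k iterations."""
    mean_term = (k * MEAN_PER_STEP) ** 2   # grows as K**2 / 64
    var_term = k * VAR_PER_STEP            # grows as 3 * K / 32
    return mean_term, var_term, mean_term + var_term

for k in (1, 2, 4, 6, 8, 16, 32):
    mean_term, var_term, mse = error_growth(k)
    print(f"K={k:2d}  mean^2={float(mean_term):8.4f}  "
          f"variance={float(var_term):7.4f}  MSE={float(mse):8.4f}")
```

The exact fractions make the three statements in the text easy to confirm: the ratio of the two terms is 1/6 at K = 1, they are equal at K = 6, and the mean term is 16/3 times the variance term at K = 32.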
Because the actual filtering in motion compensation is 2-dimensional, and the number of fractional bits rounded depends on codec-specific details, the foregoing examples are only illustrative. The iteration at which the mean error begins to dominate can vary from this simple example, but regardless of the details, the mean error dominates after a small number of iterations.
By changing to unbiased rounding the contribution from mean error can be reduced to zero. FIG. 6 and FIG. 7 show the growth of the MSE or drift error with biased rounding as in the prior art and unbiased rounding in accordance with the present invention, respectively, for decoding at 8 bits a bitstream encoded from a 10-bit source using the modified version of MPEG-2 shown in FIG. 10a.
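The qualitative difference between the two kinds of drift can be illustrated with a toy Monte Carlo model. It is not a reproduction of FIG. 6 or FIG. 7 and does not model a real codec: it assumes that each iteration contributes an independent rounding error with two equally likely fractional bits (M = 2), and that the unbiased variant resolves the exact 1/2 case with a fair random choice. The function name `accumulated_mse` and its parameters are hypothetical.

```python
import random

def accumulated_mse(k, unbiased, trials=5000, seed=0):
    """Mean squared accumulated error after k iterations of a toy loop."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        err = 0.0
        for _ in range(k):
            frac = rng.randrange(4)       # fractional bits .00 .01 .10 .11
            if frac < 2:
                step = -frac / 4          # .00 and .01 round down
            elif frac == 3:
                step = 0.25               # .11 rounds up
            elif unbiased and rng.random() < 0.5:
                step = -0.5               # .10 rounded down half the time
            else:
                step = 0.5                # .10 rounded up
            err += step
        total += err * err
    return total / trials

biased_mse = accumulated_mse(32, unbiased=False)
unbiased_mse = accumulated_mse(32, unbiased=True)
```

In this model the biased accumulation should come out several times larger than the unbiased one at 32 iterations, because only the biased case carries a term that grows with the square of the iteration count.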
Effect of Unbiased Rounding on Intra-Prediction
H.264 and H.264 FRExt are unique among modern codecs in that they have many modes for intra-prediction. Most of these modes average a number of neighboring pixels (most commonly two or four) to arrive at an initial estimate for the given pixel. These averaging calculations have the same linear form shown in equations (4) and (5) with biased rounding. Because only a small number of values are combined, the error from biased rounding is particularly significant since this corresponds to M = 1 or 2 in equation (6).
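The two-pixel averaging case (M = 1) can be sketched as follows. The function names `avg2_biased`, `avg2_unbiased`, and `mean_error` are hypothetical, round to even is used as one possible unbiased method, and the exhaustive sweep over all pairs of 8-bit values assumes that every pair is equally likely, which real image data will not satisfy exactly.

```python
from fractions import Fraction

def avg2_biased(a, b):
    """Conventional average of two pixels: add 1/2 and truncate."""
    return (a + b + 1) >> 1

def avg2_unbiased(a, b):
    """Average of two pixels; an exact 1/2 is rounded to the even neighbor."""
    s = a + b
    q = s >> 1
    return q + (q & 1) if s & 1 else q

def mean_error(avg):
    """Exact mean of (rounded average - true average) over all 8-bit pairs."""
    total_twice = sum(2 * avg(a, b) - (a + b)
                      for a in range(256) for b in range(256))
    return Fraction(total_twice, 2 * 256 * 256)
```

For instance, `avg2_biased(2, 3)` gives 3 while `avg2_unbiased(2, 3)` gives 2; both give 2 for the pair (1, 2). Over all pairs, the biased average has a mean error of 1/4 and the round-to-even average has a mean error of zero.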
FIG. 8 shows the blocks (in white) that can influence the intra-predicted values for a given block (in black) in the H.264 and H.264 FRExt systems. Because these predictions can take place on blocks as small as 4x4 pixels, the error propagation for intra-prediction can recur many times. For example, at the HDTV resolution of 1080x1920, there can be hundreds of iterations in both the horizontal and vertical directions. By comparison, the error propagation for inter-prediction shown in FIG. 6 and FIG. 7 was only for 16 iterations, and Table 4 only went up to 32 iterations.

When one attempts to use a conventional 8-bit H.264 FRExt decoder to decode a bitstream generated by a 10-bit FRExt encoder, the resultant images are recognizable but the colors are different. Even the very first I-frame illustrates this because of rounding errors in intra-prediction. Furthermore, if one subtracts the 8-bit decoded image from the reference 10-bit decoded image, the error can be seen to propagate down and to the right as FIG. 8 suggests. Because the error for intra-prediction grows in a complex fashion over the two-dimensional image, there is no simple plot of increasing error analogous to FIG. 6 and FIG. 7. However, the effects of unbiased rounding are the same. For example, unbiased rounding can improve the initial I-frame (which has only intra-prediction) from a low PSNR of around 20 dB to a high PSNR close to 50 dB.
Video compression techniques, such as MPEG-2, are widely deployed today. FIGS. 9a and 9b, respectively, show prior art implementations of an MPEG-2 encoder and decoder. In most commonly-used MPEG-2 video compression configurations, called profiles, video data having an input precision, or bit depth, of 8 bits is applied. This input precision subsequently determines the minimum precision of various internal variables used in compression. Thus, typically, input video with a precision, or bit depth, of 8 bits is applied to a subtractor ("-"). The integer output of the subtractor also has 8 bits of precision, but since it can be negative, it requires a sign bit for a total of 9 bits, which is shown as "s8" (signed 8). The difference output of the subtractor is called the "residual." This integer output is then applied to a 2-D DCT whose output requires three additional bits, or 12 bits in a signed 11-bit ("s11") format. These 12 bits are quantized and then entropy coded (variable length coding, "VLC") with other parameters to produce an encoded bitstream. The quantized, transformed coefficients are also inverse quantized ("IQ"), inverse transformed ("IDCT"), and added (with saturation) to the same prediction used in the original subtraction. Note that this portion of the encoder mimics the decoder shown in FIG. 9b. Because the entropy coding ("VLC") and decoding ("VLD") are lossless, the quantized DCT coefficients input to the VLC are identical to those output from the VLD block. If the IDCTs in the decoder and encoder are identical, the decoded residual in the encoder and decoder are identical. The decoded residual is an approximation to the raw residual. By adding this decoded residual to the prediction and saturating to the original range ([0,255] for MPEG-2), one creates a decoded frame that is an approximation of the input frame.
Such decoded frames are stored in a frame store ("FS") whose contents are the same (within IDCT error tolerances) in the encoder and decoder. The decoded frames are then used for creating a prediction to use in the original subtraction. Thus, in summary, a prior art MPEG-2 system has bit-depth precisions of
Input                               8 bits (unsigned)
Frame store (for prediction)        8 bits (unsigned)
Residual (input minus prediction)   9 bits (signed)
Transformed residual                12 bits (signed)
Quantized data                      12 bits (signed)
In the MPEG-2 modifications shown in FIGS. 10a and 10b, video sequences are encoded at a higher precision than in conventional MPEG-2 while maintaining compatibility with nominal 8-bit streams. This is achieved by increasing the precision used to perform calculations so as to make optimal use of the precision carried by the transformed and quantized residuals. This is particularly applicable to MPEG-2, which uses 12 bits for the transformed and quantized residuals while the input video is only 8 bits. In the modifications of FIGS. 10a and 10b, the precision of all internal encoder and decoder calculations is increased by two bits, the input source has a bit depth that is two bits greater, and the quantized data precision remains the same, that is:
Input                               10 bits (unsigned)
Frame store (for prediction)        10 bits (unsigned)
Residual (input minus prediction)   11 bits (signed)
Transformed residual                14 bits (signed)
Quantized data                      12 bits (signed)
Those portions of the encoder and decoder that are altered are enclosed by a dotted line in each of FIGS. 10a and 10b.
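The two precision summaries above follow one pattern, which can be written as a small helper. This is only a restatement of the listed figures under the assumption that the same pattern (one sign bit for the residual, three more bits for the transform, a fixed 12-bit quantized output) holds for the 8-bit and 10-bit cases described; the function name `mpeg2_precisions` and the dictionary keys are hypothetical.

```python
def mpeg2_precisions(input_bits):
    """Bit-depth precisions for the MPEG-2 style pipeline described in the text.

    Intended only for the 8-bit (conventional) and 10-bit (modified) cases.
    """
    return {
        "input": input_bits,                       # unsigned
        "frame_store": input_bits,                 # unsigned
        "residual": input_bits + 1,                # signed: one extra sign bit
        "transformed_residual": input_bits + 4,    # signed: 2-D DCT adds 3 bits
        "quantized_data": 12,                      # signed: unchanged by bit depth
    }
```

Calling `mpeg2_precisions(8)` and `mpeg2_precisions(10)` returns the two sets of values listed above, which makes it visible that only the quantized data precision is shared between the two configurations.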
In addition, the quantization and inverse quantization (indicated by the *) are altered so that the scale of the quantized values does not change. Since the internal variables in the 10-bit encoder have two extra bits of precision, this change is an additional right shift of 2, or a division by 4, for quantization and an additional left shift of 2, or a multiplication by 4, for dequantization. Since 8-bit quantization is simply a division by the quantization scale, QS, the equivalent 10-bit quantization is simply a division by four times the quantization scale, or 4*QS. Similarly, since inverse quantization at 8 bits is basically a multiplication by the quantization scale QS, at 10 bits we simply multiply by four times the quantization scale. Thus the changes required for Q* and IQ* are simply to alter the quantization scale, QS, according to the bit depth.

Another modification of MPEG-2 encoders and decoders is described in International Publication Number WO 03/063491 A2, "Improved Compression Techniques," by Cotton and Knee of Snell & Wilcox Limited. According to the Cotton and Knee publication, the calculation precision in a video compression encoder and decoder is increased except for the precision of the frame store. Such an arrangement may also be useful for encoding when unbiased rounding is employed in an otherwise-conventional MPEG-2 decoder.
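The quantization-scale adjustment for Q* and IQ* described above can be sketched as follows. This follows the text's simplified description of quantization as a division by the quantization scale; real MPEG-2 quantization also involves weighting matrices and rounding rules that are omitted here. The function names, the `extra_bits` parameter, and the truncation toward zero are assumptions made for the illustration.

```python
def quantize(coeff, qs, extra_bits=0):
    """Divide a transformed coefficient by the quantization scale.

    With extra_bits=2 the divisor becomes 4*QS, so a 10-bit coefficient
    yields a level on the same scale as its 8-bit counterpart.
    Truncation toward zero is an assumption of this sketch.
    """
    scale = qs << extra_bits
    sign = -1 if coeff < 0 else 1
    return sign * (abs(coeff) // scale)

def dequantize(level, qs, extra_bits=0):
    """Multiply a quantized level by the (bit depth adjusted) scale."""
    return level * (qs << extra_bits)
```

For example, with QS = 8, the 8-bit coefficient 100 and the 10-bit coefficient 403 both quantize to level 12, and that level dequantizes to 96 at 8 bits and 384 at 10 bits, which differ by exactly the factor of 4 between the two scales.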
Summary
Unbiased rounding has a significant effect on the error between high and low bit depth decoding of the same bitstream. Biased rounding creates both a mean and variance error. The mean error is coherent, grows rapidly (MSE growth is quadratic in K as shown by equations (12) and (13)) from prediction to prediction, and is quite visible. The variance error grows more slowly (MSE growth is linear) and is much less visible because it is random and has lower amplitude. Unbiased rounding is more accurate when rounding is required. In accordance with aspects of the present invention, in order to make lower bit depth calculations closer to the same calculations at a higher bit depth, unbiased rounding may be applied to calculations in the prediction loop, particularly inter- and intra-prediction.
Implementation
The invention may be implemented in hardware or software, or a combination of both (e.g., programmable logic arrays). Unless otherwise specified, the algorithms included as part of the invention are not inherently related to any particular computer or other apparatus. In particular, various general-purpose machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct more specialized apparatus (e.g., integrated circuits) to perform the required method steps. Thus, the invention may be implemented in one or more computer programs executing on one or more programmable computer systems each comprising at least one processor, at least one data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device or port, and at least one output device or port. Program code is applied to input data to perform the functions described herein and generate output information. The output information is applied to one or more output devices, in known fashion.
Each such program may be implemented in any desired computer language (including machine, assembly, or high level procedural, logical, or object oriented programming languages) to communicate with a computer system. In any case, the language may be a compiled or interpreted language.
Each such computer program is preferably stored on or downloaded to a storage media or device (e.g., solid state memory or media, or magnetic or optical media) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage media or device is read by the computer system to perform the procedures described herein. The inventive system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer system to operate in a specific and predefined manner to perform the functions described herein.
A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention.

Claims

1. A method for decoding a digital bitstream representing data-compressed video encoded at a first bit depth, comprising decoding at a second, lower, bit depth, said decoding including the unbiased rounding of unsigned data in intermediate processing.
2. A method according to claim 1 wherein the decoding includes processing in a prediction loop and said processing includes said unbiased rounding of unsigned data.
3. A method according to claim 1 or claim 2 wherein the data-compressed video is represented in frames and said unbiased rounding of unsigned data includes the unbiased rounding of inter-frame and/or intra-frame data.
4. A method for encoding a digital bitstream representing data-compressed video, wherein the encoding includes the unbiased rounding of unsigned data in intermediate processing.
5. A method according to claim 4 wherein the encoding includes processing in a prediction loop and said processing includes said unbiased rounding of unsigned data.
6. A method according to claim 4 or claim 5 wherein the data-compressed video is represented in frames and said unbiased rounding of unsigned data includes the unbiased rounding of inter-frame and/or intra-frame data.
7. A method for encoding and decoding a digital bitstream representing data-compressed video, comprising encoding at a first bit depth, said encoding including the unbiased rounding of unsigned data in intermediate processing, and decoding at a second, lower, bit depth, said decoding including the unbiased rounding of unsigned data in intermediate processing.
8. A method according to claim 7 wherein the encoding includes processing in a prediction loop and said processing includes said unbiased rounding of unsigned data and wherein the decoding includes processing in a prediction loop and said processing includes said unbiased rounding of unsigned data.
9. A method according to claim 7 or claim 8 wherein the data-compressed video is represented in frames and said unbiased rounding of unsigned data includes the unbiased rounding of inter-frame and/or intra-frame data.
10. Apparatus adapted to perform the methods of any one of claims 1 through 9.
11. A computer program, stored on a computer-readable medium for causing a computer to perform the methods of any one of claims 1 through 9.
12. A decoder for decoding a digital bitstream representing data- compressed video encoded at a first bit depth, comprising means for receiving the digital bitstream, and means for decoding at a second, lower, bit depth, which means includes means for the unbiased rounding of unsigned data in intermediate processing.
13. A decoder according to claim 12 wherein said means for decoding includes means for processing in a prediction loop and said means for processing includes said means for the unbiased rounding of unsigned data.
14. A decoder according to claim 13 or claim 14 wherein the data- compressed video is represented in frames and said means for unbiased rounding of unsigned data includes means for the unbiased rounding of inter- frame and/or intra-frame data.
15. An encoder for encoding a digital bitstream representing data- compressed video, comprising means for processing in a prediction loop, which processing includes the unbiased rounding of unsigned data in intermediate processing, and means for outputting said digital bitstream.
16. An encoder according to claim 15 wherein said means for encoding includes means for processing in a prediction loop and said means for processing includes said means for the unbiased rounding of unsigned data.
17. An encoder according to claim 15 or claim 16 wherein the data- compressed video is represented in frames and said means for unbiased rounding of unsigned data includes means for the unbiased rounding of inter- frame and/or intra-frame data.
18. A system for encoding and decoding a digital bitstream representing data-compressed video, comprising means for encoding at a first bit depth, said encoding including means for processing in a prediction loop, which means for processing includes means for the unbiased rounding of unsigned data in intermediate processing, and means for decoding at a second, lower, bit depth, said means for decoding including means for processing in a prediction loop, which means for processing includes means for the unbiased rounding of unsigned data in intermediate processing.
19. A system according to claim 18 wherein the means for encoding includes means for processing in a prediction loop and said means for processing includes said unbiased rounding of unsigned data and wherein the decoding includes processing in a prediction loop and said processing includes said means for the unbiased rounding of unsigned data.
20. A system according to claim 18 or claim 19 wherein the data-compressed video is represented in frames and said means for unbiased rounding of unsigned data includes means for the unbiased rounding of inter-frame and/or intra-frame data.
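As an illustration of what the claims above mean by unbiased rounding of unsigned data, the following Python sketch contrasts conventional add-half-and-shift rounding with round-half-to-even when low-order bits are dropped. Round-half-to-even is only one well-known unbiased scheme and is an assumption here; the claims do not state which scheme is used. The function names `round_shift_biased`, `round_shift_unbiased`, and `total_error` are hypothetical and chosen for illustration only.

```python
def round_shift_biased(x: int, n: int) -> int:
    """Conventional rounding: add half an output step, then drop n low bits.

    Exact halves always round up, so the mean error is positive.
    """
    if n == 0:
        return x
    return (x + (1 << (n - 1))) >> n


def round_shift_unbiased(x: int, n: int) -> int:
    """Round-half-to-even for unsigned x when dropping n low bits.

    Assumption: this is one possible unbiased scheme, not necessarily
    the one intended by the claims.
    """
    if n == 0:
        return x
    half = 1 << (n - 1)
    quotient = x >> n                    # truncated result
    remainder = x & ((1 << n) - 1)       # the n bits being discarded
    # Round up when above half, or when exactly half and the quotient is odd.
    if remainder > half or (remainder == half and (quotient & 1)):
        quotient += 1
    return quotient


def total_error(round_fn, n: int, count: int) -> int:
    """Sum of (rounded value scaled back up) minus input over 0..count-1."""
    return sum((round_fn(x, n) << n) - x for x in range(count))


if __name__ == "__main__":
    # Drop 2 bits from every 4-bit unsigned value and compare accumulated error.
    print("biased total error:  ", total_error(round_shift_biased, 2, 16))
    print("unbiased total error:", total_error(round_shift_unbiased, 2, 16))
```

In a prediction loop the rounded result is fed back as input to later predictions, so a systematic offset of this kind can accumulate from frame to frame, while a zero-mean rounding error does not drift in the same way.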
PCT/US2005/024552 2004-07-13 2005-07-12 Unbiased rounding for video compression WO2006017230A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
KR1020067025385A KR20070033343A (en) 2004-07-13 2005-07-12 Unbiased rounding for video compression
JP2007521538A JP2008507206A (en) 2004-07-13 2005-07-12 Uneven rounding for image compression
CA002566349A CA2566349A1 (en) 2004-07-13 2005-07-12 Unbiased rounding for video compression
EP05770121A EP1766995A1 (en) 2004-07-13 2005-07-12 Unbiased rounding for video compression
US11/632,365 US20080075166A1 (en) 2004-07-13 2005-07-12 Unbiased Rounding for Video Compression

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US58769904P 2004-07-13 2004-07-13
US60/587,699 2004-07-13

Publications (1)

Publication Number Publication Date
WO2006017230A1 (en) 2006-02-16

Family

ID=34975183

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/024552 WO2006017230A1 (en) 2004-07-13 2005-07-12 Unbiased rounding for video compression

Country Status (7)

Country Link
US (1) US20080075166A1 (en)
EP (1) EP1766995A1 (en)
JP (1) JP2008507206A (en)
KR (1) KR20070033343A (en)
CN (1) CN100542289C (en)
CA (1) CA2566349A1 (en)
WO (1) WO2006017230A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7949044B2 (en) * 2005-04-12 2011-05-24 Lsi Corporation Method for coefficient bitdepth limitation, encoder and bitstream generation apparatus
JP2008544598A (en) * 2005-06-10 2008-12-04 エヌエックスピー ビー ヴィ Alternate up and down motion vectors
KR100813258B1 (en) * 2005-07-12 2008-03-13 삼성전자주식회사 Apparatus and method for encoding and decoding of image data
KR101045205B1 (en) * 2005-07-12 2011-06-30 삼성전자주식회사 Apparatus and method for encoding and decoding of image data
KR101365597B1 (en) * 2007-10-24 2014-02-20 삼성전자주식회사 Video encoding apparatus and method and video decoding apparatus and method
US9338475B2 (en) * 2008-04-16 2016-05-10 Intel Corporation Tone mapping for bit-depth scalable video codec
US9378751B2 (en) * 2008-06-19 2016-06-28 Broadcom Corporation Method and system for digital gain processing in a hardware audio CODEC for audio transmission
BRPI1008372A2 (en) * 2009-02-11 2018-03-06 Thomson Licensing methods and apparatus for bit depth scalable video encoding and decoding using tone mapping and reverse tone mapping
KR101510108B1 (en) 2009-08-17 2015-04-10 삼성전자주식회사 Method and apparatus for encoding video, and method and apparatus for decoding video
US9521434B2 (en) 2011-06-09 2016-12-13 Qualcomm Incorporated Internal bit depth increase in video coding
KR101307257B1 (en) * 2012-06-28 2013-09-12 숭실대학교산학협력단 Apparatus for video intra prediction
US20140301447A1 (en) * 2013-04-08 2014-10-09 Research In Motion Limited Methods for reconstructing an encoded video at a bit-depth lower than at which it was encoded
US9674538B2 (en) * 2013-04-08 2017-06-06 Blackberry Limited Methods for reconstructing an encoded video at a bit-depth lower than at which it was encoded
WO2014165960A1 (en) * 2013-04-08 2014-10-16 Blackberry Limited Methods for reconstructing an encoded video at a bit-depth lower than at which it was encoded
CN109417629B (en) * 2016-07-12 2023-07-14 韩国电子通信研究院 Image encoding/decoding method and recording medium therefor

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998046025A1 (en) * 1997-04-04 1998-10-15 Snell & Wilcox Limited Digital video signal processing for signals of low amplitude resolution
US20030055860A1 (en) * 1998-10-06 2003-03-20 Jean-Pierre Giacalone Rounding mechanisms in processors
US6728317B1 (en) * 1996-01-30 2004-04-27 Dolby Laboratories Licensing Corporation Moving image compression quality enhancement using displacement filters with negative lobes

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08256341A (en) * 1995-03-17 1996-10-01 Sony Corp Image signal coding method, image signal coder, image signal recording medium and image signal decoder
US5696710A (en) * 1995-12-29 1997-12-09 Thomson Consumer Electronics, Inc. Apparatus for symmetrically reducing N least significant bits of an M-bit digital signal
JPH1022836A (en) * 1996-07-02 1998-01-23 Sony Corp Bit-rounding device
DE69808519T2 (en) * 1997-06-09 2003-06-26 Hitachi, Ltd. Image sequence coding method
JP2998741B2 (en) * 1997-06-09 2000-01-11 株式会社日立製作所 Moving picture encoding method, computer-readable recording medium on which the method is recorded, and moving picture encoding apparatus
JPH1169345A (en) * 1997-06-11 1999-03-09 Fujitsu Ltd Inter-frame predictive dynamic image encoding device and decoding device, inter-frame predictive dynamic image encoding method and decoding method
US6038576A (en) * 1997-12-02 2000-03-14 Digital Equipment Corporation Bit-depth increase by bit replication
US6334189B1 (en) * 1997-12-05 2001-12-25 Jamama, Llc Use of pseudocode to protect software from unauthorized use
JP2000023195A (en) * 1998-06-26 2000-01-21 Sony Corp Image encoding device and method, image decoding device and method and encoded data providing medium
US7162080B2 (en) * 2001-02-23 2007-01-09 Zoran Corporation Graphic image re-encoding and distribution system and method
JP4917724B2 (en) * 2001-09-25 2012-04-18 株式会社リコー Decoding method, decoding apparatus, and image processing apparatus
JP4082025B2 (en) * 2001-12-18 2008-04-30 日本電気株式会社 Method and apparatus for re-encoding compressed video
US8009739B2 (en) * 2003-09-07 2011-08-30 Microsoft Corporation Intensity estimation/compensation for interlaced forward-predicted fields
US7623574B2 (en) * 2003-09-07 2009-11-24 Microsoft Corporation Selecting between dominant and non-dominant motion vector predictor polarities
US7440633B2 (en) * 2003-12-19 2008-10-21 Sharp Laboratories Of America, Inc. Enhancing the quality of decoded quantized images

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DANIAN GONG, YUN HE, ZHIGANG CAO, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, vol. 14, no. 4, April 2004 (2004-04-01), pages 405 - 415, XP002346475 *
KAR-LIK WONG ET AL: "High performance IDCT realization using complex arithmetic", 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING. PROCEEDINGS. (ICASSP). HONG KONG, APRIL 6 - 10, 2003, IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), NEW YORK, NY : IEEE, US, vol. 1 of 6, 6 April 2003 (2003-04-06), pages 313 - 316, XP010640944, ISBN: 0-7803-7663-3 *
WALTER GISH, HAOPING YU: "Extended Sample Depth: Implementation and Characterization", JOINT VIDEO TEAM OF ISO/IEC MPEG & ITU-T VCEG 8TH MEETING, 23 May 2003 (2003-05-23) - 27 May 2003 (2003-05-27), Geneva, Switzerland, pages 1 - 14, XP002346476 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012191642A (en) * 2006-03-30 2012-10-04 Toshiba Corp Image decoding apparatus and method
JP5254004B2 (en) * 2006-03-30 2013-08-07 株式会社東芝 Image coding apparatus and method
US8606028B2 (en) 2006-03-30 2013-12-10 Kabushiki Kaisha Toshiba Pixel bit depth conversion in image encoding and decoding
WO2008130367A1 (en) * 2007-04-19 2008-10-30 Thomson Licensing Adaptive reference picture data generation for intra prediction
JP2010525658A (en) * 2007-04-19 2010-07-22 トムソン ライセンシング Adaptive reference image data generation for intra prediction
WO2011127964A3 (en) * 2010-04-13 2012-05-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus for intra predicting a block, apparatus for reconstructing a block of a picture, apparatus for reconstructing a block of a picture by intra prediction
US9344744B2 (en) 2010-04-13 2016-05-17 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus for intra predicting a block, apparatus for reconstructing a block of a picture, apparatus for reconstructing a block of a picture by intra prediction
WO2014165958A1 (en) 2013-04-08 2014-10-16 Blackberry Limited Methods for reconstructing an encoded video at a bit-depth lower than at which it was encoded
EP2984835A4 (en) * 2013-04-08 2016-11-16 Blackberry Ltd Methods for reconstructing an encoded video at a bit-depth lower than at which it was encoded

Also Published As

Publication number Publication date
CN1973549A (en) 2007-05-30
US20080075166A1 (en) 2008-03-27
KR20070033343A (en) 2007-03-26
CN100542289C (en) 2009-09-16
CA2566349A1 (en) 2006-02-16
EP1766995A1 (en) 2007-03-28
JP2008507206A (en) 2008-03-06

Similar Documents

Publication Publication Date Title
US20080075166A1 (en) Unbiased Rounding for Video Compression
CN104811714B (en) Use the enhancing intraframe predictive coding of plane expression
WO2018061588A1 (en) Image encoding device, image encoding method, image encoding program, image decoding device, image decoding method, and image decoding program
US10735746B2 (en) Method and apparatus for motion compensation prediction
WO2010001999A1 (en) Dynamic image encoding/decoding method and device
WO2010001614A1 (en) Video image encoding method, video image decoding method, video image encoding apparatus, video image decoding apparatus, program and integrated circuit
RU2665309C2 (en) Data encoding and decoding
WO2009087095A1 (en) Encoding filter coefficients
JPH11252573A (en) Hierarchical image coding system and hierarchical image decoding system
KR20050105222A (en) Video coding
US8194748B2 (en) Apparatus for scalable encoding/decoding of moving image and method thereof
JP2004032718A (en) System and method for processing video frame by fading estimation/compensation
JP2011166592A (en) Image encoding device, and image decoding device
JP6708211B2 (en) Moving picture coding apparatus, moving picture coding method, and recording medium storing moving picture coding program
CN113132731A (en) Video decoding method, device, equipment and storage medium
JP2022093657A (en) Encoding device, decoding device, and program
JP7444599B2 (en) Intra prediction device, image encoding device, image decoding device, and program
KR20240089011A (en) Video coding using optional neural network-based coding tools
AU2019203981A1 (en) Method, apparatus and system for encoding and decoding a block of video samples
ALVAR Intra prediction with 3-tap filters for lossless and lossy video coding

Legal Events

Date Code Title Description
AK Designated states; Kind code of ref document: A1; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW
AL Designated countries for regional patents; Kind code of ref document: A1; Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase; Ref document number: 2566349; Country of ref document: CA
WWE Wipo information: entry into national phase; Ref document number: 1020067025385; Country of ref document: KR
WWE Wipo information: entry into national phase; Ref document number: 200580020485.X; Country of ref document: CN
WWE Wipo information: entry into national phase; Ref document number: 2005770121; Country of ref document: EP
WWE Wipo information: entry into national phase; Ref document number: 2007521538; Country of ref document: JP
NENP Non-entry into the national phase; Ref country code: DE
WWW Wipo information: withdrawn in national office; Country of ref document: DE
WWP Wipo information: published in national office; Ref document number: 1020067025385; Country of ref document: KR
WWP Wipo information: published in national office; Ref document number: 2005770121; Country of ref document: EP
WWE Wipo information: entry into national phase; Ref document number: 11632365; Country of ref document: US