AU2008201106C1 - Method and bitstream for variable accuracy inter-picture timing specification for digital video encoding

Info

Publication number: AU2008201106C1
Application number: AU2008201106A
Authority: AU
Inventors: Adriana Dumitras; Barin G. Haskell; Atul Puri; David W. Singer
Original assignee: Apple Inc
Current assignee: Apple Inc
Priority date: 2002-07-15
Filing date: 2008-03-07
Publication date: 2011-06-09
Anticipated expiration: 2023-07-11
Also published as: AU2011202000B2; AU2008201106A1; AU2008201106B2; AU2011202000A1

Description

AUSTRALIA
FB RICE & CO
Patent and Trade Mark Attorneys
Patents Act 1990
APPLE COMPUTER, INC.
COMPLETE SPECIFICATION
STANDARD PATENT

Invention Title: Method and bitstream for variable accuracy inter-picture timing specification for digital video encoding

The following statement is a full description of this invention including the best method of performing it known to us:

FIELD OF THE INVENTION

The present invention relates to the field of multimedia compression systems. In particular the present invention discloses methods and bitstreams for specifying variable accuracy inter-picture timing.

BACKGROUND OF THE INVENTION

Digital based electronic media formats are finally on the cusp of largely replacing analog electronic media formats. Digital compact discs (CDs) replaced analog vinyl records long ago. Analog magnetic cassette tapes are becoming increasingly rare. Second and third generation digital audio systems such as Mini-discs and MP3 (MPEG Audio - layer 3) are now taking market share from the first generation digital audio format of compact discs.

The video media has been slower to move to digital storage and transmission formats than audio. This has been largely due to the massive amounts of digital information required to accurately represent video in digital form. The massive amounts of digital information needed to accurately represent video require very high capacity digital storage systems and high-bandwidth transmission systems.

However, video is now rapidly moving to digital storage and transmission formats. Faster computer processors, high-density storage systems, and new efficient compression and encoding algorithms have finally made digital video practical at consumer price points. The DVD (Digital Versatile Disc), a digital video system, has been one of the fastest selling consumer electronic products in years.
DVDs have been rapidly supplanting Video-Cassette Recorders (VCRs) as the pre-recorded video playback system of choice due to their high video quality, very high audio quality, convenience, and extra features. The antiquated analog NTSC (National Television System Committee) video transmission system is currently in the process of being replaced with the digital ATSC (Advanced Television Systems Committee) video transmission system.

Computer systems have been using various different digital video encoding formats for a number of years. Among the best digital video compression and encoding systems used by computer systems have been the digital video systems backed by the Moving Picture Experts Group, commonly known by the acronym MPEG. The three most well known and highly used digital video formats from MPEG are known simply as MPEG-1, MPEG-2, and MPEG-4. VideoCDs (VCDs) and early consumer-grade digital video editing systems use the early MPEG-1 digital video encoding format. Digital Versatile Discs (DVDs) and the Dish Network brand Direct Broadcast Satellite (DBS) television broadcast system use the higher quality MPEG-2 digital video compression and encoding system. The MPEG-4 encoding system is rapidly being adopted by the latest computer based digital video encoders and associated digital video players.

The MPEG-2 and MPEG-4 standards compress a series of video frames or video fields and then encode the compressed frames or fields into a digital bitstream. When encoding a video frame or field with the MPEG-2 and MPEG-4 systems, the video frame or field is divided into a rectangular grid of macroblocks. Each macroblock is independently compressed and encoded.

When compressing a video frame or field, the MPEG-4 standard may compress the frame or field into one of three types of compressed frames or fields: Intra-frames (I-frames), Unidirectional Predicted frames (P-frames), or Bi-Directional Predicted frames (B-frames).
Intra-frames encode a video frame completely independently, with no reference to other video frames. P-frames define a video frame with reference to a single previously displayed video frame. B-frames define a video frame with reference to both a video frame displayed before the current frame and a video frame to be displayed after the current frame. Due to their efficient usage of redundant video information, P-frames and B-frames generally provide the best compression.

Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.

SUMMARY OF THE INVENTION

Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.

A bitstream is provided comprising: an encoded first video picture; an encoded second video picture; and an integer value that is based on an exponent of a representation of an order value, said order value representative of a position of the second video picture with reference to the first video picture in a sequence of video pictures.

A further bitstream is provided comprising: a first video picture comprising a B-frame; and a second video picture that is based on the first video picture.

A still further bitstream is provided comprising: a first video picture comprising a P-frame; and a second video picture that is based on the first video picture, wherein the second video picture comprises a P-frame.

A still further method is provided comprising: receiving a first video picture and a second video picture, wherein the first video picture comprises a B-frame; and decoding the second video picture based on the first video picture.

A still further method is provided comprising: receiving a first video picture and a second video picture, wherein the first and second video pictures comprise a P-frame; and decoding the second video picture based on the first video picture.

A method of encoding a plurality of video pictures is provided, the method comprising: encoding a first video picture and a second video picture; and encoding an integer value that is based on an exponent of a representation of an order value, the order value representative of a position of the second video picture with reference to the first picture in a sequence of video pictures.

A still further method is provided comprising: receiving a first video picture; receiving an order value relating said first video picture to a second video picture; and calculating a first motion vector for the first video picture by using said order value.

A computer readable medium storing a computer program that is executable by at least one processor is provided, the computer program comprising sets of instructions for implementing the method according to any one of the fourth, fifth, sixth and seventh aspects of the invention, or any preferred embodiments of each of the aspects.

A computer system is provided comprising means for implementing steps according to any one of the fifth, sixth and seventh aspects of the invention, or any one of the preferred embodiments of each of the aspects.

A method for decoding a plurality of video pictures is provided, the method comprising: receiving an encoded first video picture, an encoded second video picture and an integer value for the second video picture, said integer value based on an exponent of a representation of an order value, the order value representative of a position of the second video picture with reference to the first video picture in a sequence of video pictures; and by a decoder, decoding the second video picture by using the order value.

A method for encoding a plurality of video pictures is provided, the method comprising: encoding a plurality of video pictures; encoding a plurality of order values, each order value specifying a position of a video picture in a sequence of video pictures; and storing the encoded video pictures and the encoded order values in a bitstream, wherein a particular order value is a power of two value, wherein an integer value of the power of two value is stored in the bitstream to represent the particular order value.

A further method for decoding a plurality of video pictures is provided, the method comprising: receiving a bitstream comprising a plurality of encoded video pictures and a plurality of encoded values, each encoded value representing an order value, wherein a particular order value is a power of two value, wherein an integer value of the power of two value is stored in the bitstream to represent the power of two value; and decoding the bitstream.

A storage medium storing a bitstream is provided, the bitstream comprising: a plurality of encoded video pictures; and a plurality of encoded order values, each order value specifying a position of a video picture in a sequence of video pictures, wherein a particular order value is a power of two value, wherein an integer value of the power of two value is stored in the bitstream to represent the power of two value.

A method is provided comprising: encoding a first video picture in a bitstream; encoding a second video picture in the bitstream, the encoded second video picture comprising a plurality of slices, each slice associated with a slice header; and encoding an order value in each of the slice headers of the encoded second video picture, the order value representative of a position of the second video picture with reference to the first video picture in a sequence of video pictures, wherein the second video picture is encoded based on the order value.

A storage medium for storing a bitstream is provided, the bitstream comprising: an encoded first video picture; an encoded second video picture, the encoded second video picture comprising a plurality of slices, each slice associated with a slice header; and an encoded order value, the order value encoded in each of the encoded second video picture slice headers, the order value specifying a positional relationship between said second video picture and said first video picture.
A method for decoding a plurality of video pictures of a video sequence is provided, the method comprising: receiving a bitstream comprising an encoded first video picture, an encoded second video picture, and an encoded order value for the second video picture, the order value representative of a position of the second video picture with reference to the first video picture in the video sequence, the encoded second video picture comprising a plurality of slices, each slice associated with a slice header, said order value encoded in each of the encoded second video picture slice headers; and decoding the second video picture by using said order value.

Methods and apparatus for variable accuracy inter-picture timing specification for digital video encoding are disclosed. A system is disclosed that allows the relative timing of nearby video pictures to be encoded in a very efficient manner. In one embodiment, the display time difference between a current video picture and a nearby video picture is determined. The display time difference is then encoded into a digital representation of the video picture. In a preferred embodiment, the nearby video picture is the most recently transmitted stored picture.

For coding efficiency, the display time difference may be encoded using a variable length coding system or arithmetic coding. In an alternate embodiment, the display time difference is encoded as a power of two to reduce the number of bits transmitted.

Other features and advantages of the present invention will be apparent from the accompanying drawings and from the following detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

The features and advantages of the present invention will be apparent to one skilled in the art, in view of the following detailed description in which:

Figure 1 illustrates a high-level block diagram of one possible digital video encoder system.
Figure 2 illustrates a series of video pictures in the order that the pictures should be displayed, wherein the arrows connecting different pictures indicate inter-picture dependency created using motion compensation.

Figure 3 illustrates the video pictures from Figure 2 listed in a preferred transmission order of pictures, wherein the arrows connecting different pictures indicate inter-picture dependency created using motion compensation.

Figure 4 graphically illustrates a series of video pictures wherein the distances between video pictures that reference each other are chosen to be powers of two.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A method and system for specifying Variable Accuracy Inter-Picture Timing in a multimedia compression and encoding system is disclosed. In the following description, for purposes of explanation, specific nomenclature is set forth to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that these specific details are not required in order to practice the present invention. For example, the present invention has been described with reference to the MPEG-4 multimedia compression and encoding system. However, the same techniques can easily be applied to other types of compression and encoding systems.

Multimedia Compression and Encoding Overview

Figure 1 illustrates a high-level block diagram of a typical digital video encoder 100 as is well known in the art. The digital video encoder 100 receives an incoming video stream of video frames 105 at the left of the block diagram. Each video frame is processed by a Discrete Cosine Transformation (DCT) unit 110. The frame may be processed independently (an intra-frame) or with reference to information from other frames received from the motion compensation unit (an inter-frame). Next, a Quantizer (Q) unit 120 quantizes the information from the Discrete Cosine Transformation unit 110.
Finally, the quantized video frame is then encoded with an entropy encoder (H) unit 180 to produce an encoded bitstream. The entropy encoder (H) unit 180 may use a variable length coding (VLC) system.

Since an inter-frame encoded video frame is defined with reference to other nearby video frames, the digital video encoder 100 needs to create a copy of how each decoded frame will appear within a digital video decoder such that inter-frames may be encoded. Thus, the lower portion of the digital video encoder 100 is actually a digital video decoder system. Specifically, an inverse quantizer (Q⁻¹) unit 130 reverses the quantization of the video frame information and an inverse Discrete Cosine Transformation (DCT⁻¹) unit 140 reverses the Discrete Cosine Transformation of the video frame information. After all the DCT coefficients are reconstructed from iDCT, the motion compensation unit will use the information, along with the motion vectors, to reconstruct the encoded frame, which is then used as the reference frame for the motion estimation of the next frame.

The decoded video frame may then be used to encode inter-frames (P-frames or B-frames) that are defined relative to information in the decoded video frame. Specifically, a motion compensation (MC) unit 150 and a motion estimation (ME) unit 160 are used to determine motion vectors and generate differential values used to encode inter-frames.

A rate controller 190 receives information from many different components in a digital video encoder 100 and uses the information to allocate a bit budget for each video frame. The rate controller 190 should allocate the bit budget in a manner that will generate the highest quality digital video bit stream that complies with a specified set of restrictions.
Specifically, the rate controller 190 attempts to generate the highest quality compressed video stream without overflowing buffers (exceeding the amount of available memory in a decoder by sending more information than can be stored) or underflowing buffers (not sending video frames fast enough such that a decoder runs out of video frames to display).

Multimedia Compression and Encoding Overview

In some video signals the time between successive video pictures (frames or fields) may not be constant. (Note: This document will use the term video pictures to generically refer to video frames or video fields.) For example, some video pictures may be dropped because of transmission bandwidth constraints. Furthermore, the video timing may also vary due to camera irregularity or special effects such as slow motion or fast motion. In some video streams, the original video source may simply have non-uniform inter-picture times by design. For example, synthesized video such as computer graphic animations may have non-uniform timing since no arbitrary video timing is created by a uniform video capture system such as a video camera system. A flexible digital video encoding system should be able to handle non-uniform timing.

Many digital video encoding systems divide video pictures into a rectangular grid of macroblocks. Each individual macroblock from the video picture is independently compressed and encoded. In some embodiments, sub-blocks of macroblocks known as 'pixelblocks' are used. Such pixelblocks may have their own motion vectors that may be interpolated. This document will refer to macroblocks although the teachings of the present invention may be applied equally to both macroblocks and pixelblocks.

Some video coding standards, e.g., ISO MPEG standards or the ITU H.264 standard, use different types of predicted macroblocks to encode video pictures. In one scenario, a macroblock may be one of three types:

1. I-macroblock - An Intra (I) macroblock uses no information from any other video pictures in its coding (it is completely self-defined);
2. P-macroblock - A unidirectionally predicted (P) macroblock refers to picture information from one preceding video picture; or
3. B-macroblock - A bi-directional predicted (B) macroblock uses information from one preceding picture and one future video picture.

If all the macroblocks in a video picture are Intra-macroblocks, then the video picture is an Intra-frame. If a video picture only includes unidirectional predicted macroblocks or intra-macroblocks, then the video picture is known as a P-frame. If the video picture contains any bi-directional predicted macroblocks, then the video picture is known as a B-frame. For simplicity, this document will consider the case where all macroblocks within a given picture are of the same type.
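The picture-type rules just stated can be sketched in a few lines of Python. This is an illustrative sketch only: the names `MacroblockType` and `classify_picture` are hypothetical and do not come from the specification or from any standard.

```python
from enum import Enum


class MacroblockType(Enum):
    """The three macroblock types of the scenario described in the text."""
    I = "intra"
    P = "unidirectionally predicted"
    B = "bi-directionally predicted"


def classify_picture(macroblock_types):
    """Return 'I', 'P' or 'B' for a picture, given the types of its macroblocks.

    Rules taken from the text: any B-macroblock makes a B-frame; otherwise any
    P-macroblock makes a P-frame; a picture of only I-macroblocks is an I-frame.
    """
    types = set(macroblock_types)
    if not types:
        raise ValueError("a picture needs at least one macroblock")
    if MacroblockType.B in types:
        return "B"
    if MacroblockType.P in types:
        return "P"
    return "I"
```

For example, a picture holding a mix of I- and P-macroblocks is classified as a P-frame, while adding a single B-macroblock turns it into a B-frame.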

An example sequence of video pictures to be encoded might be represented as

I1 B2 B3 B4 P5 B6 B7 B8 B9 P10 B11 P12 B13 I14 ...

where the letter (I, P, or B) represents if the video picture is an I-frame, P-frame, or B-frame and the number represents the camera order of the video picture in the sequence of video pictures. The camera order is the order in which a camera recorded the video pictures and thus is also the order in which the video pictures should be displayed (the display order).

The previous example series of video pictures is graphically illustrated in Figure 2. Referring to Figure 2, the arrows indicate that macroblocks from a stored picture (I-frame or P-frame in this case) are used in the motion compensated prediction of other pictures.

In the scenario of Figure 2, no information from other pictures is used in the encoding of the intra-frame video picture I1. Video picture P5 is a P-frame that uses video information from previous video picture I1 in its coding such that an arrow is drawn from video picture I1 to video picture P5. Video picture B2, video picture B3, and video picture B4 all use information from both video picture I1 and video picture P5 in their coding such that arrows are drawn from video picture I1 and video picture P5 to video picture B2, video picture B3, and video picture B4. As stated above, the inter-picture times are, in general, not the same.

Since B-pictures use information from future pictures (pictures that will be displayed later), the transmission order is usually different than the display order. Specifically, video pictures that are needed to construct other video pictures should be transmitted first. For the above sequence, the transmission order might be

I1 P5 B2 B3 B4 P10 B6 B7 B8 B9 P12 B11 I14 B13 ...

Figure 3 graphically illustrates the above transmission order of the video pictures from Figure 2. Again, the arrows in the figure indicate that macroblocks from a stored video picture (I or P in this case) are used in the motion compensated prediction of other video pictures.
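The reordering from display order to transmission order can be sketched as follows. This is a minimal Python sketch under the assumption that every B-picture depends only on the nearest stored picture (I or P) on each side, as in the example; the function name `transmission_order` and the string labels are illustrative, not part of the specification.

```python
def transmission_order(display_order):
    """Reorder picture labels such as 'I1', 'B2', 'P5' for transmission.

    Each stored picture (I or P) is sent as soon as it is reached, followed by
    the B-pictures that were waiting for it as their future reference.
    """
    held_b = []  # B-pictures waiting for their future reference picture
    out = []
    for pic in display_order:
        if pic.startswith("B"):
            held_b.append(pic)
        else:
            out.append(pic)     # stored picture goes first
            out.extend(held_b)  # then the B-pictures that depend on it
            held_b = []
    # Assumption: trailing B-pictures with no later stored picture are simply
    # appended; a real encoder would have to handle them differently.
    out.extend(held_b)
    return out


display = ["I1", "B2", "B3", "B4", "P5", "B6", "B7", "B8", "B9",
           "P10", "B11", "P12", "B13", "I14"]
print(" ".join(transmission_order(display)))
```

Applied to the example display order, the sketch reproduces the transmission order listed above.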

Referring to Figure 3, the system first transmits I-frame I1, which does not depend on any other frame. Next, the system transmits P-frame video picture P5 that depends upon video picture I1. Next, the system transmits B-frame video picture B2 after video picture P5 even though video picture B2 will be displayed before video picture P5. The reason for this is that when it comes time to decode B2, the decoder will have already received and stored the information in video pictures I1 and P5 necessary to decode video picture B2. Similarly, video pictures I1 and P5 are ready to be used to decode subsequent video picture B3 and video picture B4. The receiver/decoder reorders the video picture sequence for proper display. In this operation, I and P pictures are often referred to as stored pictures.

The coding of the P-frame pictures typically utilizes Motion Compensation, wherein a Motion Vector is computed for each macroblock in the picture. Using the computed motion vector, a prediction macroblock (P-macroblock) can be formed by translation of pixels in the aforementioned previous picture. The difference between the actual macroblock in the P-frame picture and the prediction macroblock is then coded for transmission.

Each motion vector may also be transmitted via predictive coding. For example, a motion vector prediction may be formed using nearby motion vectors. In such a case, the difference between the actual motion vector and the motion vector prediction is coded for transmission.

Each B-macroblock uses two motion vectors: a first motion vector referencing the aforementioned previous video picture and a second motion vector referencing the future video picture. From these two motion vectors, two prediction macroblocks are computed. The two predicted macroblocks are then combined together, using some function, to form a final predicted macroblock. As above, the difference between the actual macroblock in the B-frame picture and the final predicted macroblock is then encoded for transmission.

As with P-macroblocks, each motion vector (MV) of a B-macroblock may be transmitted via predictive coding. Specifically, a predicted motion vector is formed using nearby motion vectors. Then, the difference between the actual motion vector and the predicted motion vector is coded for transmission.
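Predictive coding of a motion vector can be illustrated with a short Python sketch. The text only says that the prediction is "formed using nearby motion vectors"; the component-wise median used here is an assumption chosen for illustration, and the function names are hypothetical.

```python
from statistics import median_low


def predict_mv(neighbors):
    """Predict a motion vector from nearby motion vectors.

    Assumption: a component-wise median is used as the predictor; the
    specification does not fix the prediction function.
    """
    xs = [mv[0] for mv in neighbors]
    ys = [mv[1] for mv in neighbors]
    return (median_low(xs), median_low(ys))


def encode_mv(actual, neighbors):
    """Return the difference that would be coded for transmission."""
    px, py = predict_mv(neighbors)
    return (actual[0] - px, actual[1] - py)


def decode_mv(difference, neighbors):
    """Rebuild the motion vector from the received difference."""
    px, py = predict_mv(neighbors)
    return (difference[0] + px, difference[1] + py)


neighbors = [(4, 0), (6, -2), (5, 1)]
diff = encode_mv((7, 1), neighbors)
print(diff, decode_mv(diff, neighbors))
```

Because the encoder and the decoder form the same prediction from the same neighbors, only the (typically small) difference has to be transmitted.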

However, with B-macroblocks the opportunity exists for interpolating motion vectors from motion vectors in the nearest stored picture macroblock. Such interpolation is carried out both in the digital video encoder and the digital video decoder.

This motion vector interpolation works particularly well on video pictures from a video sequence where a camera is slowly panning across a stationary background. In fact, such motion vector interpolation may be good enough to be used alone. Specifically, this means that no differential information needs to be calculated or transmitted for these B-macroblock motion vectors encoded using interpolation.

To illustrate further, in the above scenario let us represent the inter-picture display time between pictures i and j as $D_{i,j}$, i.e., if the display times of the pictures are $T_i$ and $T_j$, respectively, then

$$D_{i,j} = T_i - T_j$$

from which it follows that

$$D_{i,k} = D_{i,j} + D_{j,k}$$

$$D_{i,k} = -D_{k,i}$$

Note that $D_{i,j}$ may be negative in some cases.

Thus, if $MV_{5,1}$ is a motion vector for a P5 macroblock as referenced to I1, then for the corresponding macroblocks in B2, B3 and B4 the motion vectors as referenced to I1 and P5, respectively, would be interpolated by

$$MV_{2,1} = MV_{5,1} \cdot D_{2,1} / D_{5,1}$$

$$MV_{5,2} = MV_{5,1} \cdot D_{5,2} / D_{5,1}$$

$$MV_{3,1} = MV_{5,1} \cdot D_{3,1} / D_{5,1}$$

$$MV_{5,3} = MV_{5,1} \cdot D_{5,3} / D_{5,1}$$

$$MV_{4,1} = MV_{5,1} \cdot D_{4,1} / D_{5,1}$$

$$MV_{5,4} = MV_{5,1} \cdot D_{5,4} / D_{5,1}$$

Note that since ratios of display times are used for motion vector prediction, absolute display times are not needed. Thus, relative display times may be used for $D_{i,j}$ display time values.

This scenario may be generalized, as for example in the H.264 standard. In the generalization, a P or B picture may use any previously transmitted picture for its motion vector prediction. Thus, in the above case picture B3 may use picture I1 and picture B2 in its prediction. Moreover, motion vectors may be extrapolated, not just interpolated. Thus, in this case we would have:

$$MV_{3,1} = MV_{2,1} \cdot D_{3,1} / D_{2,1}$$

Such motion vector extrapolation (or interpolation) may also be used in the prediction process for predictive coding of motion vectors.

In any event, the problem in the case of non-uniform inter-picture times is to transmit the relative display time values of $D_{i,j}$ to the receiver, and that is the subject of the present invention. In one embodiment of the present invention, for each picture after the first picture we transmit the display time difference between the current picture and the most recently transmitted stored picture. For error resilience, the transmission could be repeated several times within the picture, e.g., in the so-called slice headers of the MPEG or H.264 standards. If all slice headers are lost, then presumably other pictures that rely on the lost picture for decoding information cannot be decoded either.

Thus, in the above scenario we would transmit the following:

$$D_{5,1} \quad D_{2,5} \quad D_{3,5} \quad D_{4,5} \quad D_{10,5} \quad D_{6,10} \quad D_{7,10} \quad D_{8,10} \quad D_{9,10} \quad D_{12,10} \quad D_{11,12} \quad D_{14,12} \quad D_{13,14}$$

For the purpose of motion vector estimation, the accuracy requirements for $D_{i,j}$ may vary from picture to picture. For example, if there is only a single B-frame picture B6 halfway between two P-frame pictures P5 and P7, then it suffices to send only:

$$D_{7,5} = 2 \quad \text{and} \quad D_{6,7} = -1$$

where the $D_{i,j}$ display time values are relative time values. If, instead, video picture B6 is only one quarter the distance between video picture P5 and video picture P7, then the appropriate $D_{i,j}$ display time values to send would be:

$$D_{7,5} = 4 \quad \text{and} \quad D_{6,7} = -1$$

Note that in both of the two preceding examples, the display time between video picture B6 and video picture P7 is being used as the display time "unit"; in the second example, the display time difference between video picture P5 and video picture P7 is four display time "units".

In general, motion vector estimation is less complex if divisors are powers of two. This is easily achieved in our embodiment if $D_{i,j}$ (the inter-picture time) between two stored pictures is chosen to be a power of two, as graphically illustrated in Figure 4. Alternatively, the estimation procedure could be defined to truncate or round all divisors to a power of two.

In the case where an inter-picture time is to be a power of two, the number of data bits can be reduced if only the integer power (of two) is transmitted instead of the full value of the inter-picture time. Figure 4 graphically illustrates a case wherein the distances between pictures are chosen to be powers of two. In such a case, the $D_{3,1}$ display time value of 2 between video picture P1 and video picture P3 is transmitted as 1 (since $2^1 = 2$) and the $D_{7,3}$ display time value of 4 between video picture P7 and video picture P3 can be transmitted as 2 (since $2^2 = 4$).

In some cases, motion vector interpolation may not be used. However, it is still necessary to transmit the display order of the video pictures to the receiver/player system such that the receiver/player system will display the video pictures in the proper order. In this case, simple signed integer values for $D_{i,j}$ suffice irrespective of the actual display times. In some applications only the sign may be needed.

The inter-picture times $D_{i,j}$ may simply be transmitted as simple signed integer values. However, many methods may be used for encoding the $D_{i,j}$ values to achieve additional compression. For example, a sign bit followed by a variable length coded magnitude is relatively easy to implement and provides coding efficiency.
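The display time differences, the motion vector interpolation, and the power-of-two shortcut discussed in this section can be tied together in a small Python sketch. It is illustrative only: the function names are hypothetical, the example assumes uniformly spaced display times (picture n is displayed at time n), and exact fractions are used instead of whatever rounding a real codec would apply.

```python
from fractions import Fraction


def transmitted_differences(transmission, display_time):
    """List (current, reference, D) for each picture after the first.

    `transmission` holds (type, number) pairs in transmission order. The
    reference is the most recently transmitted stored picture (I or P).
    """
    diffs = []
    last_stored = None
    for kind, n in transmission:
        if last_stored is not None:
            d = display_time[n] - display_time[last_stored]
            diffs.append((n, last_stored, d))
        if kind in ("I", "P"):
            last_stored = n
    return diffs


def interpolate_mv(mv, d_num, d_den):
    """Scale a motion vector by the ratio of two display time differences."""
    scale = Fraction(d_num, d_den)
    return (mv[0] * scale, mv[1] * scale)


def power_of_two_exponent(d):
    """Return e such that 2**e == d, the integer sent instead of d itself."""
    if d <= 0 or d & (d - 1):
        raise ValueError("not a positive power of two")
    return d.bit_length() - 1


# Assumed uniform timing for the example: picture n is displayed at time n.
display_time = {n: n for n in range(1, 15)}
transmission = [("I", 1), ("P", 5), ("B", 2), ("B", 3), ("B", 4), ("P", 10)]
print(transmitted_differences(transmission, display_time))

# MV(2,1) = MV(5,1) * D(2,1) / D(5,1), with D(2,1) = 1 and D(5,1) = 4.
print(interpolate_mv((8, -4), 1, 4))
print(power_of_two_exponent(4))
```

Because only ratios of differences enter the interpolation, rescaling every display time by the same factor leaves the interpolated vectors unchanged, which is why relative time units suffice.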

One such variable length coding system that may be used is known as UVLC (Universal Variable Length Code). The UVLC variable length coding system is given by the code words:

1 = 1
2 = 0 1 0
3 = 0 1 1
4 = 0 0 1 0 0
5 = 0 0 1 0 1
6 = 0 0 1 1 0
7 = 0 0 1 1 1
8 = 0 0 0 1 0 0 0 ...

Another method of encoding the inter-picture times may be to use arithmetic coding. Typically, arithmetic coding utilizes conditional probabilities to effect a very high compression of the data bits.

Thus, the present invention introduces a simple but powerful method of encoding and transmitting inter-picture display times. The encoding of inter-picture display times can be made very efficient by using variable length coding or arithmetic coding. Furthermore, a desired accuracy can be chosen to meet the needs of the video decoder, but no more.

The foregoing has described a system for specifying variable accuracy inter-picture timing in a multimedia compression and encoding system. It is contemplated that changes and modifications may be made by one of ordinary skill in the art, to the materials and arrangements of elements of the present invention without departing from the scope of the invention.
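The UVLC code words listed in the description follow a regular pattern: the binary form of the value, preceded by one zero for every bit after the first. The following Python sketch encodes and decodes that pattern; the function names are hypothetical, bit strings are used instead of packed bits for readability, and the handling of truncated input is an assumption.

```python
def uvlc_encode(n):
    """Return the UVLC code word for a positive integer as a bit string."""
    if n < 1:
        raise ValueError("UVLC as listed here starts at 1")
    binary = format(n, "b")
    # One leading zero per bit after the first tells the decoder the length.
    return "0" * (len(binary) - 1) + binary


def uvlc_decode(bits):
    """Decode one code word from the front of `bits`.

    Returns (value, remaining_bits). Raises ValueError on truncated input.
    """
    zeros = 0
    while zeros < len(bits) and bits[zeros] == "0":
        zeros += 1
    end = 2 * zeros + 1  # leading zeros, then zeros + 1 value bits
    if end > len(bits):
        raise ValueError("truncated code word")
    return int(bits[zeros:end], 2), bits[end:]


for value in range(1, 9):
    print(value, "=", uvlc_encode(value))
```

Since no code word is a prefix of another, code words can be concatenated in the bitstream and decoded one after another without separators.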

Claims

1. A bitstream comprising: an encoded first video picture; an encoded second video picture; and an integer value that is based on an exponent of a representation of an order value, said order value representative of a position of the second video picture with reference to the first video picture in a sequence of video pictures.

2. The bitstream of claim 1 further comprising a slice header that is associated with said second video picture, said slice header comprises an encoded value that is based on the order value.

3. The bitstream of claim 1, wherein the exponent is an exponent of a power of two integer.

4. The bitstream of claim 1, wherein said order value is for computing a motion vector for the second video picture.

5. The bitstream of claim 1, wherein said order value is a compressed order value.

6. The bitstream of claim 5, wherein said compressed order value is compressed in said bitstream by using variable length coding.

7. The bitstream of claim 5, wherein said compressed order value is compressed in said bitstream by using arithmetic coding.

8. The bitstream of claim 1, wherein the order value is representative of an order difference value between said second video picture and said first video picture.

9. A method of encoding a plurality of video pictures, the method comprising: encoding a first video picture; encoding a second video picture; and encoding an integer value that is based on an exponent of a representation of an order value, the order value representative of a position of the second video picture with reference to the first video picture in a sequence of video pictures.

10. The method of claim 9, wherein said encoded integer value, encoded first video picture and encoded second video picture are stored in a bitstream.

11. The method of claim 9, wherein encoding the second video picture comprises using the order value to compute a motion vector for the second video picture based on a motion vector of another video picture.

12. The method of claim 9, wherein the order value is representative of an order difference value between said second video picture and said first video picture.

13. The method of claim 9 further comprising encoding the order value in a slice header in the bitstream.

14. The method of claim 13, wherein the slice header is associated with the second video picture.

15. The method of claim 9, wherein the exponent is an exponent of a power of two integer.

16. The method of claim 9 further comprising calculating a motion vector for the second video picture by using said order value.

17. The method of claim 16, wherein encoding the second video picture comprises using the calculated motion vector for encoding a macroblock.

18. The method of claim 16 further comprising calculating another motion vector for the second video picture by using said order value.

19. The method of claim 16, wherein calculating the motion vector for the second video picture comprises calculating a particular value that is based on (i) a first order value difference between an order value of a third video picture and an order value of the first video picture, and (ii) a second order value difference between the order value of the second video picture and the order value of the first video picture.

20. The method of claim 19, wherein calculating the motion vector for the second video picture further comprises multiplying a motion vector for the third video picture with the particular value.

21. The method of claim 16, wherein the calculated motion vector is not encoded in a bitstream.

22. The method of claim 9, wherein said encoded integer value is stored in a slice header associated with the encoded second video picture.

23. A method for decoding a plurality of video pictures, the method comprising: receiving an encoded first video picture, an encoded second video picture and an integer value for the second video picture, said integer value based on an exponent of a representation of an order value, the order value representative of a position of the second video picture with reference to the first video picture in a sequence of video pictures; and by a decoder, decoding the second video picture by using the order value.

24. The method of claim 23 further comprising receiving a slice header associated with the second video picture, wherein the slice header comprises an encoded value that is based on the order value.

25. The method of claim 23 further comprising calculating a motion vector for the second video picture by using said order value.

26. The method of claim 25 further comprising calculating another motion vector for the second video picture by using said order value.

27. The method of claim 25 wherein calculating the motion vector for the second video picture comprises calculating a particular value that is based on (i) a first order value difference between an order value of a third video picture and an order value of the first video picture, and (ii) a second order value difference between the order value of the second video picture and the order value of the first video picture.

28. The method of claim 27, wherein calculating the motion vector for the second video picture further comprises multiplying a motion vector for the third video picture with the particular value.

29. The method of claim 23, wherein the exponent is an exponent of a power of two integer.

30. The method of claim 23, wherein the order value is a power of two integer.

31. The method of claim 23, wherein the position of the second video picture with reference to the first video picture is a temporal position.

32. A method for encoding a plurality of video pictures, the method comprising: encoding a plurality of video pictures; encoding a plurality of order values, each order value specifying a position of a video picture in a sequence of video pictures; and storing the encoded video pictures and the encoded order values in a bitstream, wherein a particular order value is a power of two value, wherein an integer value of the power of two value is stored in the bitstream to represent the particular order value.

33. The method of claim 32, wherein encoding the plurality of video pictures comprises using the plurality of order values to calculate a plurality of motion vectors.

34. A method for decoding a plurality of video pictures, the method comprising: receiving a bitstream comprising a plurality of encoded video pictures and a plurality of encoded values, each encoded value representing an order value, wherein a particular order value is a power of two value, wherein an integer value of the power of two value is stored in the bitstream to represent the power of two value; and decoding the bitstream.

35. The method of claim 34, wherein decoding the bitstream comprises decoding an order value by decoding one of said encoded values using said integer value.

36. The method of claim 35, wherein decoding the bitstream further comprises using the decoded order value to calculate a motion vector.

37. A storage medium storing a bitstream, the bitstream comprising: a plurality of encoded video pictures; and a plurality of encoded order values, each order value specifying a position of a video picture in a sequence of video pictures, wherein a particular order value is a power of two value, wherein an integer value of the power of two value is stored in the bitstream to represent the power of two value.

38. The storage medium of claim 37, wherein said integer value is stored in a slice header of the bitstream.

39. A method comprising: encoding a first video picture in a bitstream; encoding a second video picture in the bitstream, the encoded second video picture comprising a plurality of slices, each slice associated with a slice header; and encoding an order value in each of the slice headers of the encoded second video picture, the order value representative of a position of the second video picture with reference to the first video picture in a sequence of video pictures, wherein the second video picture is encoded based on the order value.

40. The method of claim 39, wherein the sequence of video pictures is a sequence for displaying the video pictures.

41. The method of claim 39, wherein encoding the second video picture comprises using the order value to compute a motion vector for the second video picture based on a motion vector of another video picture.

42. The method of claim 39, wherein the order value is representative of an order difference value between said second video picture and said first video picture.

43. A storage medium for storing a bitstream, the bitstream comprising: an encoded first video picture; an encoded second video picture, the encoded second video picture comprising a plurality of slices, each slice associated with a slice header; and an encoded order value, the order value encoded in each of the encoded second video picture slice headers, the order value specifying a positional relationship between said second video picture and said first video picture.

44. The storage medium of claim 43, wherein the order value is representative of a position of said second video picture in a sequence of video pictures.

45. The storage medium of claim 43, wherein said order value is for computing a motion vector for the second video picture, wherein the motion vector is not in said bitstream.

46. The storage medium of claim 43, wherein said order value is compressed in said bitstream.

47. The storage medium of claim 46, wherein said compressed order value is compressed in said bitstream by using variable length coding.

48. The storage medium of claim 46, wherein said compressed order value is compressed in said bitstream by using arithmetic coding.

49. The storage medium of claim 43, wherein the order value is representative of an order difference value between said second video picture and said first video picture.

50. A method for decoding a plurality of video pictures of a video sequence, the method comprising: receiving a bitstream comprising an encoded first video picture, an encoded second video picture, and an encoded order value for the second video picture, the order value representative of a position of the second video picture with reference to the first video picture in the video sequence, the encoded second video picture comprising a plurality of slices, each slice associated with a slice header, said order value encoded in each of the encoded second video picture slice headers; and decoding the second video picture by using said order value.

51. The method of claim 50, wherein the sequence of video pictures is a sequence for displaying the video pictures.

52. The method of claim 51, wherein decoding the second video picture comprises using the order value to compute a motion vector for the second video picture based on a motion vector of another video picture.

53. The method of claim 52, wherein the computed motion vector is not in the bitstream.

54. A computer readable medium storing a computer program that is executable by at least one processor, the computer program comprising sets of instructions for implementing the method according to any one of claims 9 to 22, 32, 33 and 39 to 42.

55. A computer readable medium storing a computer program that is executable by at least one processor, the computer program comprising sets of instructions for implementing the method according to any one of claims 23 to 31, 34 to 36 and 50 to 53.

56. A computer system comprising means for implementing steps according to any one of claims 9 to 22, 32, 33 and 39 to 42.

57. A computer system comprising means for implementing steps according to any one of claims 23 to 31, 34 to 36 and 50 to 53.
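By way of illustration only, the following Python sketch shows two of the quantities the claims refer to: an integer exponent that stands in for a power of two order value, and a motion vector scaled by a value derived from two order value differences. The claims state only that the particular value is "based on" the two differences; taking it as the ratio of the second difference to the first is an assumption of this sketch, and all function names are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def exponent_of_power_of_two(order_value: int) -> int:
    """Return k such that order_value == 2**k (the integer that would be stored)."""
    if order_value < 1 or order_value & (order_value - 1):
        raise ValueError("order value is not a power of two")
    return order_value.bit_length() - 1


def order_value_from_exponent(exponent: int) -> int:
    """Recover the power of two order value from its stored exponent."""
    return 1 << exponent


def scale_motion_vector(
    mv_third: Tuple[int, int],
    order_first: int,
    order_second: int,
    order_third: int,
) -> Tuple[Fraction, Fraction]:
    """Scale the third picture's motion vector for use with the second picture.

    Assumption: the particular value is (second - first) / (third - first).
    """
    d_third = order_third - order_first    # first order value difference
    d_second = order_second - order_first  # second order value difference
    if d_third == 0:
        raise ValueError("third and first pictures share the same order value")
    ratio = Fraction(d_second, d_third)
    return (mv_third[0] * ratio, mv_third[1] * ratio)
```

Under these assumptions, an order value of 8 would be carried as the exponent 3, and a motion vector derived this way need not itself be stored in the bitstream, since the decoder can recompute it from the order values.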