EP0838113A1 - Video coding - Google Patents
Video coding
- Publication number
- EP0838113A1
- Authority
- EP
- European Patent Office
- Prior art keywords
- bit
- plane
- bit plane
- planes
- different
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/39—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability involving multiple description coding [MDC], i.e. with separate layers being structured as independently decodable descriptions of input picture data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
Definitions
- the present invention is related to methods and devices for compressing and coding digitally represented moving pictures, grey scale as well as colour, which are to be transmitted on a channel, in particular a channel which has a relatively small capacity or bandwidth, as well as to a system for transferring moving pictures using compression of the pictures.
- In many applications involving transmission of video signals, the capacity of the channel used is limited.
- a digitalized video image consists of a very large number of bits.
- transmission times for most applications become unacceptably long, if every bit of the image has to be transmitted. This is especially true in the case of moving pictures, where strict real time constraints exist.
- Lossless methods, i.e. methods exploiting the redundancy in the image in such a manner that the image can be reconstructed by the receiver without any loss of information; the reconstructed image coincides exactly with the original image.
- Lossy methods, i.e. methods exploiting the fact that not all bits are equally important to the receiver.
- the received image is not identical to the original, but looks, e.g. to the human eye, sufficiently like the original image.
- DCT Discrete Cosine Transform
- JPEG Joint Photographic Experts Group
- MPEG I/II Motion Picture Experts Group
- CCITT Recommendation H.261 Px64
- CCITT Recommendation H.263, which is related to H.261 but is developed for lower bit rates (16-64 kilobits per second, kbps).
- the existing coding methods are based on computationally complex and expensive systems, comprising frequency or fractal transformations, filtering stages and vector quantization processes.
- processors such as DCT processors, zigzag processors, blocking processors etc. are required.
- the bit plane coding technique maps the pixels of a digitalized image into a number of binary bit planes, the first of which usually consists of the most significant bits of the pixels.
- the image, which consists of pixels that in turn consist of a number of bits, is mapped into a number of bit planes, where the number of bit planes is equal to the number of bits per pixel (bpp).
- the purpose of mapping the bits into bit planes is to exploit the spatial redundancy of the digitalized video image.
- when exploiting these redundancies no information is lost, and thus the images compressed and transmitted using this technique can be recreated exactly bit by bit, i.e. the technique is lossless.
- This kind of technique has been introduced and successfully applied in cases of lossless coding of still pictures, such as X-ray medical images, satellite and space images and facsimile images.
- bit plane coding is quite efficient compared to other existing lossless methods for coding still images.
- bit plane datasets comprise a bogus bit plane, most significant bit plane, next-to-most significant bit plane, least significant bit plane, and next-to-least significant bit plane and a significance of each said bit plane corresponds to a printing time length.
- U.S. patent 5,142,619 discloses techniques using the XOR operation.
- a device described in the document has means provided for comparing the contents of two bit planes in order to compare the respective pixel locations and exclusive OR-ing each pair of corresponding pixels to set a corresponding pixel location in a third bit plane to reflect similarity or dissimilarity between the compared pair of pixels of two bit planes of the same image.
- U.S. patent 4,653,112 concerns image data management, where image data are organized in bit planes. Data comprising the most to the least significant bits are arranged in the first to the last bit planes respectively.
- U.S. patent 4,546,385 relates to data compression of graphic images.
- a graphic image has at least first (most significant bit) and second (least significant bit) bit planes. The most significant bit of a pixel and the successive pixel are compared using an exclusive OR-operation on a spatial dimension, i.e. the XOR operation is performed between pixels of the same image.
- EP-A1 0 547 528 discloses coding of binary bit planes eliminating the need for forming a Gray code bit plane representation.
- the invention uses different significances for different bit planes.
- a method for coding both grey scale and colour images, in particular moving pictures is also based on bit plane coding, and can transmit moving pictures of relatively good quality only requiring bit rates of approximately 10 kbps.
- the method for achieving this can be divided into the following five substeps:
- V. Coding the output sequence plane-by-plane by means of a specially developed mono dimensional run length encoding (extended RLE1D) technique which is designed to exploit the fact that the binary sequence to be coded consists of long runs of the same symbol and also has relatively many isolated symbols of the other kind, e.g. long runs of binary zeroes are interrupted by isolated binary ones.
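
The extended RLE1D scheme is not specified in this text, so it is not reproduced here. As a point of reference only, a conventional one-dimensional run length coder for a binary sequence can be sketched as follows; the function names and the convention that the first run counts zeroes are assumptions made for illustration. The example shows why isolated ones are costly for the conventional scheme: each of them produces a run of length 1.

```python
def rle1d_encode(bits):
    """Return run lengths of alternating symbols, starting with symbol 0.

    A sequence that starts with 1 therefore begins with a run of length 0.
    """
    runs = []
    current, length = 0, 0
    for bit in bits:
        if bit == current:
            length += 1
        else:
            runs.append(length)
            current, length = bit, 1
    runs.append(length)
    return runs


def rle1d_decode(runs):
    """Inverse of rle1d_encode."""
    bits = []
    symbol = 0
    for length in runs:
        bits.extend([symbol] * length)
        symbol ^= 1
    return bits


# Long runs of zeroes interrupted by one isolated binary one.
sequence = [0] * 5 + [1] + [0] * 4
runs = rle1d_encode(sequence)
print(runs)
```

Decoding the run lengths with `rle1d_decode` returns the original sequence, so this step is lossless.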
- the coding techniques above have several advantages compared to the techniques involving the use of the frequency domain.
- the technique works directly in the spatial domain. This avoids the problems of introduction of low-pass effects, blocking effects and undesired frequencies, which commonly affect other techniques based on transformation.
- the method only comprises simple manipulations of the bits of the digitalized image, and only elementary operations are involved. This makes the method easy to implement, and particularly suitable for VLSI implementations, and the realization of low-cost systems.
- the method above can also be simplified and made less computationally expensive by introducing a transmit/not transmit (TX/NT) procedure for the different regions of the segmented images before a possible application of ME/MC only to the transmitted regions.
- TX/NT transmit/not transmit
- the technique as described above can be used for lossless compression of moving pictures. In that case all substeps leading to a degradation of the image are not used, i.e. no plane skipping is performed and the TX/NT procedure is also not applied.
- Fig. 1 is a picture illustrating the bits of the pixels representing a digitally coded image, using the bit plane coding method.
- Figs. 2a and 2b are pictures similar to that of Fig. 1, and illustrating the bit planes before and after some bit planes have been skipped by means of the plane skipping process, respectively.
- Fig. 3 is a picture illustrating a shift of the bits of one of the pixels in Fig. 2b, and the padding with zeroes thereof.
- Fig. 4 is a block diagram of a system for transmitting successive video images, illustrating some of the various procedural steps made in the image compression.
- Fig. 5 is a block diagram of a system similar to that of Fig. 4, which also has means for performing transmission/not transmission.
- Fig. 6 is a block diagram similar to that of Fig. 4 illustrating a system intended for a more advanced compression.
- Fig. 7a and 7b illustrate the use of a conventional RLE1D technique and an extended RLE1D technique for sequences comprising many isolated binary ones.
- the achieved lossy method relies on the experimental fact that the human eye is relatively less sensitive to the type of distortion introduced by the coarser quantization caused by the skipping of the least significant bit planes. Moreover, these bits are often corrupted by noise, in particular thermal noise introduced by the sensors used for the image acquisition, cameras, CCDs, etc. These bits are therefore usually useless from a visual point of view.
- MSE Mean Square Error
- the method as described above does not only involve a reduction in the redundancy, because of the reduction of the planes to be coded. The most significant bit planes are also characterized by highly structured information. Taking this into account makes it possible to achieve very high compression ratios through bit plane coding techniques.
- an encoding of the remaining bit planes is performed.
- the encoding aims at maximally exploiting the redundancy of the bit planes.
- the spatial redundancy is addressed.
- Gray code method is well known in literature. It was developed for and is used in applications involving lossless coding and absence of loss of information.
- the Gray code is applied in a method associated with loss of information, i.e. the plane skipping mechanism.
- when the Gray code is applied in association with the plane skipping technique, experiments have revealed that, on the one hand, when applying the Gray code before making use of the plane skipping technique, the compression achieved is substantial but the visual quality of the received images is poor. In particular, the experiments showed that this method results in a severe visual loss of detail. On the other hand, the experiments showed that when first applying the plane skipping technique on the binary coded information and then applying the Gray code, the quality is very good, i.e. the same as without the use of the Gray code. However, the compression achieved is no longer as high as when applying the Gray code before the plane skipping technique.
- a solution to the problem with an increased dynamic range is to introduce a shifting mechanism to the bit planes.
- the sequences forming binary words which correspond to different pixels and the values thereof, and to which the plane skipping technique has been applied, are shifted to the lowest positions of the sequence, which have been left empty after the plane skipping step of the method, before being mapped using the Gray code.
- the positions made vacant by the plane shifting step are filled by means of padding the positions with zeroes. It is simple to carry out this operation, only comprising a bit shift operation of a predetermined number of steps and the normal zero padding executed in a shifting operation. Thanks to this operation, the data dynamic is not increased during the step involving the Gray code, and therefore the same compression as in the procedure of first Gray coding the binary plane and then applying the plane skipping technique is achieved, without affecting the quality of the compressed image.
- the Gray code operation can be carried out by means of a simple Look Up Table (LUT) .
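
A minimal sketch of such a look-up table, assuming the standard reflected binary Gray code (`v ^ (v >> 1)`) and a hypothetical word length of 5 bits, as would remain after skipping three of eight planes. The names `gray_lut` and `inverse_gray_lut` are introduced here for illustration.

```python
BITS = 5  # assumed number of bits left per pixel after plane skipping

# Forward table: binary value -> reflected binary Gray code.
gray_lut = [v ^ (v >> 1) for v in range(1 << BITS)]

# Inverse table: Gray code -> binary value, built from the forward table.
inverse_gray_lut = [0] * (1 << BITS)
for value, code in enumerate(gray_lut):
    inverse_gray_lut[code] = value

# Neighbouring values differ in exactly one bit once Gray coded.
one_bit_steps = all(
    bin(gray_lut[v] ^ gray_lut[v + 1]).count("1") == 1
    for v in range((1 << BITS) - 1)
)
print(gray_lut[:8], one_bit_steps)
```

Because the table has only `2**BITS` entries, both coding and decoding reduce to a single indexed read per pixel.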
- Fig. 1 illustrates the mapping of m by n (m x n) pixels of an image into k bit planes, where k is equal to the number of bits per pixel (bpp) .
- the bit planes are arranged in such a manner that the (k - 1)-th bit plane consists of the most significant bit of each pixel of the image, the (k - 2)-th bit plane consists of the next-to-most significant bits of the pixels, and so on until the last (0-th) bit plane, which consists of the least significant bits of the pixels.
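
The mapping can be sketched as follows for a small image held as a list of rows. The helper names and the 2 x 2 test image are hypothetical; the plane numbering follows the description, with plane k - 1 holding the most significant bits and plane 0 the least significant bits.

```python
def to_bit_planes(image, k=8):
    """Split an image (rows of ints in [0, 2**k - 1]) into k binary planes.

    planes[b] holds bit b of every pixel, so planes[k - 1] is the most
    significant plane and planes[0] the least significant one.
    """
    return [[[(pixel >> b) & 1 for pixel in row] for row in image]
            for b in range(k)]


def from_bit_planes(planes):
    """Rebuild the pixel values from the planes (inverse of to_bit_planes)."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[b][r][c] << b for b in range(len(planes)))
             for c in range(cols)] for r in range(rows)]


image = [[200, 13], [255, 0]]  # hypothetical 2 x 2 image with 8 bpp
planes = to_bit_planes(image)
print(planes[7])  # most significant plane
print(planes[0])  # least significant plane
```

Rebuilding the image with `from_bit_planes(planes)` gives back the original pixels, which is the sense in which the mapping itself loses no information.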
- Figs. 2a and 2b illustrate the use of the plane skipping technique.
- Fig. 2a is a picture illustrating the same as Fig. 1, i.e. the mapping of an m by n pixel image into k bit planes, which are arranged in such a manner that the number of a bit plane corresponds to the significance of its bits.
- Fig. 2b illustrates the same as Fig. 2a, but with the l bit planes holding the least significant bits skipped, l being the number of skipped planes.
- the original image of Fig. 2a consisting of m by n pixels, where each pixel is built up of k bits, is reduced to the image illustrated in Fig. 2b having m by n pixels, which are built up of only k - l bits.
- Fig. 3 illustrates by way of an example the complete procedure of skipping the least significant bits of a pixel, shifting the remaining bits, and zero padding the remaining bits of the pixel.
- the bits of a pixel are then arranged as shown at 2. These five bits, k7 to k3, are shifted and then padded with zeroes in the positions left vacant after the shift operation.
- the bits of a pixel are then arranged as shown at 3.
- the new sequence formed in this manner is hence identical to the original pixel sequence formed by the k bits in the respective bit planes in Fig. 2b, except for the appended zeroes in the beginning of the sequence, i.e. the remaining most significant bits have been shifted to the positions originally occupied by the least significant bits and the positions made vacant when doing this are padded with zeroes. This is done in order to maximally exploit the spatial redundancy.
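
For the 8-bit example of Fig. 3, where the three least significant bits are skipped and five bits remain, the skip, shift and zero padding together amount to a single logical right shift. The sketch below assumes unsigned 8-bit pixels; the function name and the sample pixel value are hypothetical.

```python
def skip_and_shift(pixel, skipped=3):
    """Drop the `skipped` least significant bits and move the rest down.

    For a non-negative integer the right shift fills the vacated upper
    positions with zeroes, which is the zero padding described above.
    """
    return pixel >> skipped


pixel = 0b10110101                      # hypothetical 8-bit pixel value
shifted = skip_and_shift(pixel)
print(format(pixel, "08b"), "->", format(shifted, "08b"))
```

The five surviving bits `10110` now occupy the lowest positions, so the value range shrinks from [0, 255] to [0, 31] before Gray coding.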
- a suitable mechanism for exploitation of the temporal redundancy for moving pictures in connection with a bit plane coding technique is also provided. The mechanism has been proved to remarkably increase the achieved compression ratios.
- This technique is based on a comparison of two corresponding bit planes of two successive images.
- the comparison is in the preferred embodiment carried out by means of an XOR-operation involving the two corresponding bit planes.
- the result of this XOR-operation is a third bit plane, consisting of binary zeroes in every position where the bit value remains unchanged, and binary ones in the positions where the value of the bit has changed. That is, the third, new bit plane has binary zeroes in every position where the two compared bit planes are equal and has binary ones in every position where the two compared bit planes differ.
- instead of coding all the elements in the new bit plane, only the elements that have changed from the previous bit plane are considered. These elements are termed variations.
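
A small sketch of this comparison for one pair of corresponding bit planes. The two 2 x 4 planes and the helper name `xor_planes` are hypothetical; the point is that the resulting plane is zero wherever nothing changed, and that applying the same XOR at the receiver restores the current plane.

```python
def xor_planes(plane_a, plane_b):
    """Element-wise XOR of two equally sized binary planes."""
    return [[a ^ b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(plane_a, plane_b)]


previous_plane = [[0, 0, 1, 1],
                  [0, 1, 1, 0]]
current_plane = [[0, 0, 1, 0],
                 [0, 1, 1, 0]]

variation_plane = xor_planes(current_plane, previous_plane)

# Positions (row, column) of the variations, i.e. the bits that changed.
variations = [(r, c)
              for r, row in enumerate(variation_plane)
              for c, bit in enumerate(row) if bit]
print(variation_plane, variations)

# Receiver side: XOR with the previous plane gives the current plane back.
restored = xor_planes(variation_plane, previous_plane)
```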
- the method differs from the normal methods applied to exploit the temporal redundancy, which are based on the difference between the pixels (pixel values) of successive frames.
- a plane- by-plane technique is applied instead by means of the XOR operation.
- the number of bit planes, i.e. the dynamic range, is thereby preserved, which is advantageous for the compression obtained.
- with a traditional difference method the dynamic range is not preserved, resulting in one more bit plane to code. For example, if sequences having 8 bits per pixel are considered, i.e. formed by pixels in the range [0, 255], the application of a traditional difference method would lead to a dynamic range of [-255, 255], i.e. an increase of the number of bit planes to 9 bit planes. However, by instead using the XOR operation the dynamic range remains [0, 255], i.e. 8 bit planes.
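
The difference in dynamic range can be checked on a few 8-bit samples. The pixel values below are made up for illustration and include the two extreme transitions 0 to 255 and 255 to 0.

```python
previous = [0, 255, 100, 37]   # hypothetical pixels of the previous frame
current = [255, 0, 100, 40]    # corresponding pixels of the current frame

# Traditional approach: signed pixel differences.
differences = [c - p for c, p in zip(current, previous)]

# Approach described here: bitwise XOR, equivalent to XOR-ing each plane.
xored = [c ^ p for c, p in zip(current, previous)]

print(min(differences), max(differences))  # spans [-255, 255]
print(min(xored), max(xored))              # stays within [0, 255]

# The receiver recovers the current pixels from the previous ones.
recovered = [x ^ p for x, p in zip(xored, previous)]
```

Representing every value in [-255, 255] takes 9 bits, whereas the XOR result always fits in the original 8 bits.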
- Fig. 4 the various procedural steps and corresponding components of a system for transmitting video coded images are illustrated, the system comprising means for carrying out the various steps as described above.
- the image is captured and digitalized in a block 1, e.g. by a video camera. Then the image is represented by means of conventional bit plane representation in a block 2, and then the unwanted planes are skipped in a block 3. Thereafter the planes are shifted and padded with zeroes in a block 4, and then coded by means of the conventional Gray code in a block 5. After that the temporal redundancy of the corresponding bit planes of successive images is exploited by means of an XOR operation in a block 6. The images compressed in this manner are then coded by means of a conventional entropy coding technique in a block 7 and transmitted on a channel 8, which is usually corrupted by noise 9. The received information is decoded in a block 10, after which the compressed image is available for the intended user 11, for decompression, visualization, digital signal processing, etc.
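
The chain of blocks 3 to 6 and its inverse can be sketched on a flat list of pixels as follows. This is an illustration under stated assumptions, not the system itself: entropy coding, the channel and the bit plane layout are left out, three skipped planes are assumed, the first frame is coded against an all-zero reference, and all names are hypothetical.

```python
def gray_to_binary(code):
    """Invert the reflected binary Gray code."""
    value = 0
    while code:
        value ^= code
        code >>= 1
    return value


def encode_frame(frame, previous_gray, skipped=3):
    """Plane skipping and shift, Gray coding, then XOR with the previous frame."""
    shifted = [p >> skipped for p in frame]                   # blocks 3 and 4
    gray = [v ^ (v >> 1) for v in shifted]                    # block 5
    variations = [g ^ q for g, q in zip(gray, previous_gray)]  # block 6
    return variations, gray


def decode_frame(variations, previous_gray):
    """Undo the XOR and the Gray code; returns the shifted pixel values."""
    gray = [x ^ q for x, q in zip(variations, previous_gray)]
    return [gray_to_binary(g) for g in gray], gray


frame1 = [200, 13, 255, 0]   # hypothetical successive frames, 8 bpp
frame2 = [201, 13, 240, 8]

reference = [0] * len(frame1)
variations1, gray1 = encode_frame(frame1, reference)
variations2, gray2 = encode_frame(frame2, gray1)

decoded1, dec_gray1 = decode_frame(variations1, reference)
decoded2, _ = decode_frame(variations2, dec_gray1)
print(variations2, decoded2)
```

The second frame yields mostly zeroes after the XOR step, and the decoder reproduces exactly the shifted pixel values; the only loss is the one introduced by plane skipping.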
- the images compressed in this manner still contain redundant information.
- in a system aiming at reducing the bit rate as much as possible while still maintaining the visual quality above a certain level, it is of course desirable to exploit these redundancies as much as possible without decreasing the visual quality too much.
- the transmit/not transmit (TX/NT) procedure exploits the redundant information in such manner that the frames are divided into blocks or segments. These blocks or segments are in the simplest and perhaps most common approaches square blocks of RxR pixels, but the blocks or segments can also have other proportions or have irregular shapes. This division or segmentation of the frames into smaller regions allows the system to identify and localize parts or regions inside the picture which are in some aspect not of interest for the final result.
- TX/NT transmission/not transmission
- the transmission/not transmission (TX/NT) procedure for two successive frames basically consists of the following steps.
- the regions can then have structures such as simple uniformly sized square blocks of RxR pixels, e.g. 16x16 pixels, differently sized blocks (also known as quad tree approach) or variously shaped regions.
- a comparison is then performed between the corresponding regions of the two frames.
- the aim of this comparison is to obtain some kind of visual distance parameter value.
- the visual distance parameter shall estimate the distance or difference between the two corresponding regions inside the two frames from a visual point of view. It is therefore desirable to get the visual distance parameter as correlated as possible to the visual interpretation of the two corresponding regions of the two frames, i.e. the more alike the two corresponding regions of the two frames look, the smaller the value of the visual distance parameter.
- HVS Human Visual System
- the value obtained by any of these well known quality measures is then compared to a preset threshold or distance threshold. Based on this comparison the system decides whether it is necessary to transmit (TX) this particular region or whether the region is not to be transmitted (NT), i.e. the region does not differ from the previously transmitted corresponding region so much that it is necessary to transmit the current region. That is, the old region can be used instead, without a heavy loss of quality. This procedure is carried out for every region of the frames, and thus only the regions that have changed more than the preset threshold will be transmitted, thereby reducing the information which must be transmitted.
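
A minimal sketch of the TX/NT decision, assuming square regions of R x R pixels, the mean square error as the visual distance measure, and frame dimensions that are multiples of R. The function names, the 4 x 4 frames and the threshold value are hypothetical.

```python
def region_mse(current, previous, top, left, size):
    """Mean square error between two corresponding size x size regions."""
    total = 0
    for r in range(top, top + size):
        for c in range(left, left + size):
            d = current[r][c] - previous[r][c]
            total += d * d
    return total / (size * size)


def tx_nt_map(current, previous, size, threshold):
    """Label each region 'TX' (transmit) or 'NT' (not transmit)."""
    decisions = {}
    for top in range(0, len(current), size):
        for left in range(0, len(current[0]), size):
            mse = region_mse(current, previous, top, left, size)
            decisions[(top, left)] = "TX" if mse > threshold else "NT"
    return decisions


previous = [[10] * 4 for _ in range(4)]   # previously transmitted frame
current = [row[:] for row in previous]
for r in range(0, 2):                     # only the top-right region changes
    for c in range(2, 4):
        current[r][c] = 50

decisions = tx_nt_map(current, previous, size=2, threshold=4.0)
print(decisions)
```

Only the region whose content changed by more than the threshold is marked for transmission; for the other three the receiver keeps the old region.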
- the information content of the frames is different from the traditional compression systems.
- the coding described so far avoids small transitions between the pixel values, and the objects in the picture therefore become very well defined from the background.
- the dynamic reduction introduced by the shifting process which is described in more detail below, does not only reduce the dynamic but also makes it possible to separate the visually important information from the unimportant information by means of a simple Mean Square Error (MSE) approach, which will be described below.
- the Mean Square Error (MSE) approach is usually not very efficient because of the low correlation between the MSE and the visual perception, but in the system described, due to the characteristics of the dynamic, the correlation between the visual perception and the MSE increases.
- the Mean Square Error (MSE) mechanism is performed very simply.
- a first frame is segmented, i.e. it is divided into regions of e.g. 16x16 pixels (other segmentations such as e.g. the ones described above can of course also be used) .
- the pixels inside one region are then compared to the corresponding pixels of the previous frame, where the comparison is carried out for determining the distance between the corresponding pixels, i.e. the following calculation is performed:
- the operation is repeated for all the corresponding pixels within the two corresponding regions of the frames in order to compute the distances between them and their average distance is determined, i.e. a Mean Square Error operation is carried out for the corresponding pixels within all the regions of the frames. For two corresponding regions, consisting of N pixels each, the following calculation is performed:
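
The formulas themselves did not survive in this text. Assuming the conventional definition of the mean square error, with $x_i$ and $y_i$ (symbols introduced here for illustration) denoting the i-th pixel of the current region and of the corresponding region of the previous frame, the per-pixel distance and its average over the N pixels of a region would presumably read:

```latex
d_i = (x_i - y_i)^2, \qquad
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} d_i
             = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2
```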
- the regions of a frame which the algorithm described above decides not to transmit are not made subject to any further steps of compression, since they need not be transmitted.
- the introduction of this non-transmission step will reduce the amount of information that has to be transmitted and thereby the bit rate required will be lowered.
- the regions that are decided to be transmitted are however subject to additional compression steps, i.e. the Gray code, XOR operation and the entropy coding, as described above.
- this procedure is to be introduced between the shifting step and the Gray code step of the compression. This is of course made in order to make the MSE-algorithm work properly and to make the correlation between the distance evaluated by the MSE-algorithm and the visual perception high. At the same time this reduces the computational cost. It is further to be noted that the compression system will work very well even without the introduction of the TX/NT procedure.
- Fig. 5 shows a block diagram of a more advanced embodiment of the system, which also comprises the transmitted/not transmitted (TX/NT) procedure described above and in addition a motion estimation/motion compensation procedure (ME/MC).
- TX/NT transmitted/not transmitted
- ME/MC motion estimation/motion compensation procedure
- a motion estimation/motion compensation (ME/MC) procedure is a method for increasing the exploitation of the temporal redundancy, which is well known in literature. Such a procedure is used for reducing the information that must be transmitted, in particular in the case of a moving sequence in which the frames contain moving objects. The application of a ME/MC procedure then allows transmitting only the motion vectors of the moving object inside a region or a block of the frame. The methods for performing the motion estimation and the motion compensation are numerous and are well known. The object of the MC procedure is thus to reduce the information that has to be transmitted, in this case the number of variations.
- the motion compensation procedure can advantageously be used in association with the transmitted/not transmitted (TX/NT) procedure.
- the TX/NT procedure is applied. This results in that some of the regions of the segmented frame are decided not to be transmitted (NT) . Thereafter some known motion estimation procedure is applied to predict the motion of the regions or blocks of the frames, which by the comparison between the used quality factor and the used threshold have been decided to be transmitted.
- the regions which based on a motion estimation have been motion compensated (MC) will then be made subject to the further compression in the following steps of the system, i.e. the XOR operation and the plane-by-plane entropy coding.
- the information associated with the motion will be transmitted as motion vectors according to some suitable known method described in the literature.
- the introduction of motion compensation (MC) will only reduce the information to be transmitted. This is due to the lossless compression of the following steps of the compression. Hence, this operation will not modify the visual quality at all, i.e. the quality will be exactly the same as for the case including only the transmitted/not transmitted procedure.
- the introduction of the ME/MC will however increase the computational load on the system, but since the regions of the frames that are subject to motion estimation are only the transmitted ones (TX) , the computational load will not increase heavily. A low-cost approach to the ME/MC is then used.
- a system using all of the steps described above is hence illustrated in fig. 5. That is, the bit plane represented frames that remain after the plane skipping procedures indicated in a block 51, are put into a block 52 where the shift operation as described in association with fig. 3 is performed. The frames are then segmented into suitable blocks or regions in a block 53. Then the Gray code is applied in a block 54. The result of the operation in block 53 is also fed both to a block 55 and to a motion estimation block 56. Then a transmit/not transmit (TX/NT) operation is performed in block 55, preferably by means of a MSE algorithm as a measure of the visual distance. The decision is taken based on the similarity between a corresponding region of a previously transmitted frame stored in a memory 59 and the current region as fed from the block 53.
- the remaining blocks or regions i.e. the blocks determined to be transmitted (TX) are subjected to a motion estimation (ME) , resulting in predicted motion compensated (MC) blocks.
- ME motion estimation
- MC predicted motion compensated
- This is carried out in block 56 by means of feeding a previously transmitted corresponding block to the ME/MC block 56 from the memory 59, which has been provided with a decoded, shifted and segmented version of the previously transmitted frame(s) from a decoder comprising an XOR block 50, which performs an XOR operation between the output from a Gray code block 49 connected to the motion compensation block 56 and the output from block 57, and a block 48 performing an inverse Gray coding, thereby reconstructing the shifted and segmented image as received by a receiver.
- the XOR operation is performed in the block 57 between the current region and a motion compensated, and in block 49 Gray coded region of a previously transmitted corresponding region, and finally the information is coded by means of a plane-by-plane entropy coder in block 58.
- the block matching method which is described in A.N. Netravali and B.G. Haskell, "Digital pictures", 2nd ed. Plenum Press 1995 p. 340 and which is also employed and described in ITU-T Recommendation H.261, Geneva, August 1990, can be used.
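
As a rough illustration of block matching in general, and not of the specific procedure in the cited references, the sketch below searches exhaustively over a small window for the displacement that minimises the sum of absolute differences (SAD). The SAD criterion, the search range, the sign convention of the vector and all names are assumptions made for this example.

```python
def block_sad(current, reference, top, left, dy, dx, size):
    """SAD between a block of `current` and a displaced block of `reference`."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(current[top + r][left + c]
                         - reference[top + dy + r][left + dx + c])
    return total


def best_motion_vector(current, reference, top, left, size, search):
    """Return ((dy, dx), cost) of the best match within +/- `search` pixels.

    (dy, dx) points from the block position in `current` to the matching
    block in `reference`; candidates outside the frame are ignored.
    """
    rows, cols = len(reference), len(reference[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if not (0 <= top + dy and top + dy + size <= rows
                    and 0 <= left + dx and left + dx + size <= cols):
                continue
            cost = block_sad(current, reference, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost


# Hypothetical 6 x 6 reference frame with distinct pixel values; the current
# frame is the same content moved one pixel down and one pixel to the left.
reference = [[6 * r + c for c in range(6)] for r in range(6)]
current = [[reference[r - 1][c + 1] if r >= 1 and c + 1 < 6 else 0
            for c in range(6)] for r in range(6)]

vector, cost = best_motion_vector(current, reference, top=2, left=2,
                                  size=2, search=2)
print(vector, cost)
```

In the system described, only the regions marked TX would be passed to such a search, which keeps the added computational load limited.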
- the method can also incorporate more advanced procedures such as a transmit/not transmit (TX/NT) procedure and/or a motion compensation (MC) procedure as illustrated in fig. 5, which will increase the computational load and complexity somewhat, but in return will greatly lower the necessary bit rate.
- The various substeps of the method comprise only elementary operations such as shifts, table look-ups and XOR operations, which keeps the construction cost of a transmitting system using some or all of the substeps low. Moreover, the low complexity makes the method very well suited for real time applications.
- This is achieved by introducing an exhaustive approach to the motion estimation/motion compensation (ME/MC).
- The system comprises all the elements described above, with some additional blocks added.
- The bit plane represented frames that have been subjected to the plane skipping mechanism enter the system at 61.
- The frames are then shifted in a block 62 according to the procedure described in association with fig. 3.
- The frames are divided into regions, also called blocks or segments, in a block 63.
- In the simplest and perhaps most common approaches the blocks or segments are square blocks of RxR pixels, but they can also have other proportions or irregular shapes. This division or segmentation of the frames into smaller regions allows the system to identify and localize parts or regions of the picture which in some respect are not of interest for the final result, or to exploit the redundancies in a separate block more efficiently than in the whole image, e.g. by means of motion compensation.
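The simplest case mentioned above, square blocks of RxR pixels, can be sketched as below. The function name and the frame representation (a list of pixel rows) are assumptions made for illustration, and the sketch assumes the frame dimensions are multiples of R. Irregular segment shapes are not covered.

```python
def split_into_blocks(frame, r):
    """Split a 2-D frame (list of rows) into r x r blocks in raster order.

    Assumes the frame height and width are multiples of r.
    Returns a list of ((top, left), block) pairs.
    """
    height, width = len(frame), len(frame[0])
    if height % r or width % r:
        raise ValueError("frame dimensions must be multiples of r")
    blocks = []
    for top in range(0, height, r):
        for left in range(0, width, r):
            block = [row[left:left + r] for row in frame[top:top + r]]
            blocks.append(((top, left), block))
    return blocks


# A hypothetical 4x4 frame split into four 2x2 blocks.
frame = [[x + 4 * y for x in range(4)] for y in range(4)]
blocks = split_into_blocks(frame, 2)
```

Keeping the `(top, left)` position with each block lets later stages address the corresponding region of the previous frame.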
- The segmented image is then subjected to further processing in the blocks 67 and 66.
- The segmented image is coded with the Gray code in a block 65 as described above.
- The whole picture is scanned block by block or segment by segment in order to perform a motion estimation (ME) of the objects inside the picture.
- This motion estimation is performed according to one of the many well known procedures described in the literature, using the previously decoded picture, which has been stored in the memory block 64.
- The decoding procedure is performed in the XOR and inverse Gray code blocks 72 and 70, producing a reconstructed shifted and segmented version of the received image in the same manner as described above in conjunction with the blocks 48 and 50 in fig. 5.
- The outcome of the motion estimation is used to form a predicted motion compensated picture, which reduces the number of blocks or segments to be transmitted and the number of variations inside the blocks.
- As regards the motion information, only the motion parameters (motion vectors) of these motion compensated blocks in relation to the previous blocks need to be transmitted.
- The motion estimation method used can for instance be the block matching method, described in A.N. Netravali and B.G. Haskell, "Digital pictures", 2nd ed., Plenum Press 1995, p. 340, and also employed and described in ITU-T Recommendation H.261, Geneva, August 1990.
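Block matching with an exhaustive search can be sketched as follows. This is a generic illustration, not the exact procedure of the cited references: the function names, the search range parameter and the use of the mean square error as matching criterion are assumptions.

```python
def mse(block_a, block_b):
    """Mean square error between two equally sized 2-D blocks."""
    total, count = 0, 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


def get_block(frame, top, left, r):
    """Extract the r x r block whose top-left corner is (top, left)."""
    return [row[left:left + r] for row in frame[top:top + r]]


def block_match(current, previous, top, left, r, search):
    """Exhaustive search over displacements within +/- search pixels.

    Returns ((dy, dx), cost), where (dy, dx) points from the current block
    to its best match in the previous frame.
    """
    height, width = len(previous), len(previous[0])
    target = get_block(current, top, left, r)
    best_vector = (0, 0)
    best_cost = mse(target, get_block(previous, top, left, r))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + r > height or x + r > width:
                continue  # candidate block falls outside the frame
            cost = mse(target, get_block(previous, y, x, r))
            if cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


# Hypothetical frames: a 2x2 patch moves by one pixel down and to the right.
patch = [[9, 8], [7, 6]]
previous = [[0] * 6 for _ in range(6)]
current = [[0] * 6 for _ in range(6)]
for i in range(2):
    for j in range(2):
        previous[1 + i][1 + j] = patch[i][j]
        current[2 + i][2 + j] = patch[i][j]
vector, cost = block_match(current, previous, top=2, left=2, r=2, search=2)
```

Here the search finds the displacement `(-1, -1)` with zero cost, so only the motion vector would need to be sent for this block.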
- A comparison is then performed between the corresponding regions of the two frames, i.e. the current one and the previous one.
- The decoded previous frame is, as mentioned above, available in the block 64.
- The aim of this comparison is to obtain some kind of visual distance parameter value.
- The visual distance parameter shall estimate the distance between the two corresponding regions of the two frames from a visual point of view. As mentioned above, it is of interest to have the visual distance parameter correlate as closely as possible with the visual perception of the two corresponding regions, i.e. the more alike the two corresponding regions look, the smaller (or, depending on the measure chosen, the higher) the value the visual distance parameter takes.
- The system decides, for a particular segment, whether all of the information of the segment in question, i.e. motion parameters and variations, is to be transmitted (TX), whether it is sufficient to transmit only the motion parameters of the motion compensated (MC) region, or whether it is unnecessary to transmit the region at all, i.e. neither motion parameters nor variations, so that the region is not transmitted (NT).
- The comparison and the decision are made in block 67 between the output from block 63 and the motion compensated version of a previously transmitted segment output from block 66.
- The motion parameters of the segments that are decided to be transmitted (TX) and of those decided to be motion compensated (MC) are then transmitted according to some known technique.
- The frames that are decided to be transmitted are then subjected to the above described XOR operation in block 68, to which the current segment and a Gray coded version of the output from block 66 are fed.
- The Gray coding of the output segment from block 66 is performed in a block 71.
- The result of this operation is, as described in detail above, that bits that remain unchanged are coded as zeroes and bits that have changed are coded as ones. Where few changes occur, a condition made more likely by the motion estimation procedure, the output binary sequence will consist of many binary zeroes and some, often isolated, binary ones.
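One possible way to organise the three-way decision is sketched below. The text only states that the decision rests on a visual distance, preferably the MSE. The order of the tests, the use of a single threshold for both tests, and the names used here are assumptions made for illustration.

```python
def mse(a, b):
    """Mean square error between two equally long pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def classify_segment(current, previous, compensated, threshold):
    """Return 'NT', 'MC' or 'TX' for one segment.

    current     -- pixels of the current segment
    previous    -- co-located pixels of the previously decoded frame
    compensated -- motion compensated prediction of the segment
    threshold   -- largest visual distance still accepted (assumed single value)
    """
    if mse(current, previous) <= threshold:
        return "NT"  # close enough to the previous frame: send nothing
    if mse(current, compensated) <= threshold:
        return "MC"  # prediction is good enough: send motion parameters only
    return "TX"      # send motion parameters and variations


# Hypothetical flattened 2x2 segments.
unchanged = classify_segment([10, 10, 10, 10], [10, 10, 11, 10], [10, 10, 10, 10], 1.0)
moved = classify_segment([10, 50, 10, 10], [10, 10, 10, 50], [10, 50, 10, 10], 1.0)
new_content = classify_segment([90, 50, 10, 10], [10, 10, 10, 50], [10, 50, 10, 10], 1.0)
```

With these inputs the three segments are classified as NT, MC and TX respectively.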
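How the XOR output decomposes into sparse bit planes can be sketched as follows. The names and the 8-bit word length are assumptions, and the input values are taken to be already Gray coded.

```python
def xor_residual(current_gray, predicted_gray):
    """Bitwise XOR of Gray coded current and predicted pixels."""
    return [c ^ p for c, p in zip(current_gray, predicted_gray)]


def split_bit_planes(values, bits=8):
    """Return the bit planes of the values, most significant plane first."""
    return [[(v >> plane) & 1 for v in values]
            for plane in range(bits - 1, -1, -1)]


# Hypothetical Gray coded pixels: only two of six pixels have changed.
current_gray = [10, 11, 10, 172, 5, 5]
predicted_gray = [10, 10, 10, 164, 5, 5]
residual = xor_residual(current_gray, predicted_gray)
planes = split_bit_planes(residual)
```

Every plane is all zeroes except plane 3 (the fifth entry of `planes`) and plane 0 (the last entry), each of which contains a single isolated one. Sequences of this kind are what the run length stage is meant to exploit.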
- RLE1D: mono-dimensional run length encoding.
- The conventional RLE1D encoding is illustrated in fig. 7a.
- This method is well known and can in short be described as a method for exploiting binary sequences having relatively few transitions between binary ones and binary zeroes, i.e. long runs. Two separate alphabets are used, one for the zeroes and one for the ones, e.g. B (Black) for zeroes and W (White) for ones.
- The sequence is then coded as runs of ones and zeroes.
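Conventional run length coding with the B/W alphabets can be sketched as follows. The function name and the representation of a run as a `(length, symbol)` pair are assumptions, and the input sequence is a made-up example, not the one of fig. 7a.

```python
def rle1d(bits):
    """Conventional run length coding: one (length, symbol) pair per run.

    'B' denotes a run of zeroes and 'W' a run of ones.
    """
    runs = []
    for bit in bits:
        symbol = "W" if bit else "B"
        if runs and runs[-1][1] == symbol:
            runs[-1] = (runs[-1][0] + 1, symbol)  # extend the current run
        else:
            runs.append((1, symbol))              # start a new run
    return runs


# Hypothetical sequence with three isolated ones.
bits = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
runs = rle1d(bits)
```

For this sequence six runs are needed (3B 1W 4B 1W 6B 1W), and every isolated one costs a run of its own.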
- The output sequence of the XOR block 68 is expected to contain a relatively large number of isolated ones.
- An extension of the RLE1D procedure has been developed, which is called extended RLE1D.
- This procedure exploits the presence of isolated ones in such a manner that, unlike in the conventional RLE1D procedure, each run of ones or zeroes includes one transition at the end. That is, each run of ones has one zero as its last symbol, each run of zeroes has one one as its last symbol, and the entire sequence is coded with one symbol.
- The technique is shown as an example in fig. 7b, where the same sequence that is coded by means of the conventional RLE1D in fig. 7a is coded with the extended RLE1D technique.
- A run consisting of three binary zeroes followed by a binary one can be coded as 4B, i.e. a count of the length of the run and an indication of the first binary symbol in the run.
- The number of isolated ones is relatively high, and the number of runs that need to be coded has decreased from 20 in fig. 7a to 11 in fig. 7b.
- A substantial reduction of information has been achieved. It is also easily seen that the extended RLE1D will always be superior or, at worst, in the case of no isolated ones or zeroes at all, equal to the conventional RLE1D encoding technique.
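The extended run length coding can be sketched as below. The representation of a run as a `(length, symbol)` pair and the `terminated` flag are assumptions that are not taken from the text. The flag is added here only so that the sketch can be decoded unambiguously when the sequence ends without a closing transition.

```python
def extended_rle1d(bits):
    """Extended run length coding: each run absorbs the transition after it.

    Returns (runs, terminated). 'terminated' is False when the last run
    reached the end of the sequence without a closing transition.
    """
    runs, terminated = [], True
    i, n = 0, len(bits)
    while i < n:
        first = bits[i]
        j = i
        while j < n and bits[j] == first:
            j += 1
        terminated = j < n
        if terminated:
            j += 1  # absorb the single opposite symbol that ends the run
        runs.append((j - i, "W" if first else "B"))
        i = j
    return runs, terminated


def extended_rle1d_decode(runs, terminated):
    """Inverse of extended_rle1d."""
    bits = []
    for index, (length, symbol) in enumerate(runs):
        first = 1 if symbol == "W" else 0
        if index == len(runs) - 1 and not terminated:
            bits.extend([first] * length)
        else:
            bits.extend([first] * (length - 1) + [1 - first])
    return bits


def count_conventional_runs(bits):
    """Number of runs a conventional run length coder would emit."""
    return sum(1 for k, b in enumerate(bits) if k == 0 or b != bits[k - 1])


# Hypothetical sequence with three isolated ones.
bits = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
runs, terminated = extended_rle1d(bits)
restored = extended_rle1d_decode(runs, terminated)
```

For this sequence the extended coder emits three runs (4B 5B 7B) where the conventional coder needs six, because every isolated one is absorbed into the preceding run of zeroes.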
- The extended RLE1D procedure is performed plane-by-plane in block 69, before the sequence is transmitted onto a channel.
- Experiments have shown that the described compression system is able to obtain moving pictures of very good quality at 10 - 16 kbps, and under some conditions bit rates down to 6 - 7 kbps have been obtained.
- The method as described can easily be extended to lossless applications.
- No planes are skipped, i.e. no bit planes are removed before the compression.
- The threshold for the TX/NT operation is set to zero, i.e. all blocks which have changed are compressed and transmitted. Thus, no distortions are introduced and the compression becomes lossless.
- The Gray code is used. Since no planes are skipped, there is no need for a shift operation, i.e. the Gray code is applied directly under its proper lossless conditions. The XOR operation and an entropy coding of all of the planes are then applied.
- Different coding schemes can be applied for different bit planes according to their different characteristics. For example, the most significant bit planes have lower numbers of ones (isolated ones) and are more structured, whereas the least significant bit planes have an almost equal number of ones and zeroes, and this can be taken into account when choosing a proper coding scheme. Also, as an option, an ME/MC scheme can be introduced. However, in most cases the improvement would be quite small.
- The lossless extension of the method is of interest for some specific applications.
- The quality is good enough for normal video communication, but is usually not good enough in some specific cases, such as the transmission of documents, graphics and drawings, or when transmission is made for medical or legal reasons.
- A freeze of a particular image can be made and a lossless transmission can be set up for that specific image before switching back to the normal conditions of real-time lossy communication, merely by changing a few parameters in the compression algorithms, i.e. those for the plane skipping and for the threshold of the transmit/not transmit (TX/NT) procedure.
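The switch between lossy and lossless operation through two parameters can be sketched as a small settings object. The class, its field names and the lossy example values are hypothetical. Only the lossless values (no skipped planes, zero threshold) and the remark that no shift is needed without plane skipping come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoderSettings:
    skipped_planes: int    # number of bit planes removed before compression
    txnt_threshold: float  # visual distance threshold of the TX/NT decision

    @property
    def lossless(self) -> bool:
        """Lossless operation: no planes skipped and a zero TX/NT threshold."""
        return self.skipped_planes == 0 and self.txnt_threshold == 0

    @property
    def needs_shift(self) -> bool:
        """The shift operation is only needed when planes are skipped."""
        return self.skipped_planes > 0


# Illustrative values only; the lossy numbers are not taken from the text.
LOSSY = CoderSettings(skipped_planes=3, txnt_threshold=20.0)
LOSSLESS = CoderSettings(skipped_planes=0, txnt_threshold=0.0)
```

Freezing an image and sending it without loss then amounts to coding that frame with `LOSSLESS` and returning to `LOSSY` afterwards.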
- Another field of application for the lossless coding is storage. Huge amounts of memory are required for storing sequences, for example by television companies, multimedia producers, and network companies. The MPEG I and MPEG II standards are usually applied for reducing the memory occupation, but these techniques are lossy and introduce a degradation in the sequences. In the case where no distortions are allowed and where sequences need to be reproduced with exactly the same quality, lossless compression systems are required.
- Lossless still-image compression techniques can then be applied to each frame separately, but no temporal redundancy, i.e. correlation between the frames, is taken into account in this case.
- The method and system as described herein provide a solution to the problem which not only guarantees flexibility, including the possibility to introduce smaller or larger amounts of distortion if higher compression ratios are required, but which is also simple and allows exploitation of temporal redundancy inside sequences without an increase in the signal to be processed.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE9502557 | 1995-07-11 | | |
SE9502557A SE9502557D0 (en) | 1995-07-11 | 1995-07-11 | Video coding |
SE9503736A SE9503736D0 (en) | 1995-10-24 | 1995-10-24 | Advanced video coding |
SE9503735 | 1995-10-24 | | |
SE9503735A SE9503735D0 (en) | 1995-10-24 | 1995-10-24 | Low-coast video codin |
SE9503736 | 1995-10-24 | | |
PCT/SE1996/000943 WO1997003516A1 (en) | 1995-07-11 | 1996-07-11 | Video coding |
Publications (1)
Publication Number | Publication Date |
---|---|
EP0838113A1 (en) | 1998-04-29 |
Family
ID=27355782
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP96923178A Withdrawn EP0838113A1 (en) | 1995-07-11 | 1996-07-11 | Video coding |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP0838113A1 (en) |
JP (1) | JPH11513205A (en) |
AU (1) | AU6376296A (en) |
WO (1) | WO1997003516A1 (en) |
Families Citing this family (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7381407B1 (en) | 1996-03-25 | 2008-06-03 | Medarex, Inc. | Monoclonal antibodies specific for the extracellular domain of prostate-specific membrane antigen |
AU714202B2 (en) * | 1997-01-22 | 1999-12-23 | Canon Kabushiki Kaisha | A method for digital image compression |
SE511186C2 (en) | 1997-04-11 | 1999-08-16 | Ericsson Telefon Ab L M | Method and apparatus for encoding data sequences |
GB2327003A (en) * | 1997-07-04 | 1999-01-06 | Secr Defence | Image data encoding system |
US6615335B1 (en) * | 1999-07-02 | 2003-09-02 | Koninklijke Philips Electronics N.V. | Compressed storage of information |
ES2155408B1 (en) * | 1999-08-09 | 2001-10-16 | Telefonica Sa | DEVICE APPLICABLE TO THE CAPTURE AND TRANSMISSION IN REAL TIME OF VIDEO SEQUENCES THROUGH NETWORKS OF LOW WIDTH DATA. |
US6501397B1 (en) | 2000-05-25 | 2002-12-31 | Koninklijke Philips Electronics N.V. | Bit-plane dependent signal compression |
WO2002073973A2 (en) * | 2001-03-13 | 2002-09-19 | Loronix Information Systems, Inc. | Method and apparatus for temporal wavelet compression |
US8824553B2 (en) * | 2003-05-12 | 2014-09-02 | Google Inc. | Video compression method |
KR101409526B1 (en) * | 2007-08-28 | 2014-06-20 | 한국전자통신연구원 | Apparatus and Method for keeping Bit rate of Image Data |
US8325796B2 (en) | 2008-09-11 | 2012-12-04 | Google Inc. | System and method for video coding using adaptive segmentation |
US8385404B2 (en) | 2008-09-11 | 2013-02-26 | Google Inc. | System and method for video encoding using constructed reference frame |
US8326075B2 (en) | 2008-09-11 | 2012-12-04 | Google Inc. | System and method for video encoding using adaptive loop filter |
US9172967B2 (en) | 2010-10-05 | 2015-10-27 | Google Technology Holdings LLC | Coding and decoding utilizing adaptive context model selection with zigzag scan |
US9154799B2 (en) | 2011-04-07 | 2015-10-06 | Google Inc. | Encoding and decoding motion via image segmentation |
US8638854B1 (en) | 2011-04-07 | 2014-01-28 | Google Inc. | Apparatus and method for creating an alternate reference frame for video compression using maximal differences |
US8891616B1 (en) | 2011-07-27 | 2014-11-18 | Google Inc. | Method and apparatus for entropy encoding based on encoding cost |
US8885706B2 (en) | 2011-09-16 | 2014-11-11 | Google Inc. | Apparatus and methodology for a video codec system with noise reduction capability |
US9247257B1 (en) | 2011-11-30 | 2016-01-26 | Google Inc. | Segmentation based entropy encoding and decoding |
US9262670B2 (en) | 2012-02-10 | 2016-02-16 | Google Inc. | Adaptive region of interest |
US9131073B1 (en) | 2012-03-02 | 2015-09-08 | Google Inc. | Motion estimation aided noise reduction |
US11039138B1 (en) | 2012-03-08 | 2021-06-15 | Google Llc | Adaptive coding of prediction modes using probability distributions |
WO2013162980A2 (en) | 2012-04-23 | 2013-10-31 | Google Inc. | Managing multi-reference picture buffers for video data coding |
US9609341B1 (en) | 2012-04-23 | 2017-03-28 | Google Inc. | Video data encoding and decoding using reference picture lists |
US9014266B1 (en) | 2012-06-05 | 2015-04-21 | Google Inc. | Decimated sliding windows for multi-reference prediction in video coding |
US8819525B1 (en) | 2012-06-14 | 2014-08-26 | Google Inc. | Error concealment guided robustness |
US9774856B1 (en) | 2012-07-02 | 2017-09-26 | Google Inc. | Adaptive stochastic entropy coding |
US9344729B1 (en) | 2012-07-11 | 2016-05-17 | Google Inc. | Selective prediction signal filtering |
US9509998B1 (en) | 2013-04-04 | 2016-11-29 | Google Inc. | Conditional predictive multi-symbol run-length coding |
US9756331B1 (en) | 2013-06-17 | 2017-09-05 | Google Inc. | Advance coded reference prediction |
US9392288B2 (en) | 2013-10-17 | 2016-07-12 | Google Inc. | Video coding using scatter-based scan tables |
US9179151B2 (en) | 2013-10-18 | 2015-11-03 | Google Inc. | Spatial proximity context entropy coding |
US10102613B2 (en) | 2014-09-25 | 2018-10-16 | Google Llc | Frequency-domain denoising |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5442458A (en) * | 1991-12-18 | 1995-08-15 | Eastman Kodak Company | Method and associated apparatus for encoding bitplanes for improved coding efficiency |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4110795A (en) * | 1976-12-20 | 1978-08-29 | Litton Systems, Inc. | Method of graphic data redundancy reduction in an optical facsimile system |
CA1128645A (en) * | 1978-07-31 | 1982-07-27 | Yasuhiro Yamazaki | Transmission method and system for facsimile signal |
EP0220110A3 (en) * | 1985-10-07 | 1990-04-25 | Fairchild Semiconductor Corporation | Bit plane area correlator |
JP2506332B2 (en) * | 1986-03-04 | 1996-06-12 | 国際電信電話株式会社 | High-efficiency coding method for moving image signals |
GB2260236A (en) * | 1991-10-04 | 1993-04-07 | Sony Broadcast & Communication | Data encoder |
US5157489A (en) * | 1991-10-28 | 1992-10-20 | Virgil Lowe | Apparatus and method for reducing quantizing distortion |
US5325126A (en) * | 1992-04-01 | 1994-06-28 | Intel Corporation | Method and apparatus for real time compression and decompression of a digital motion video signal |
1996
- 1996-07-11 EP EP96923178A (EP0838113A1): Withdrawn
- 1996-07-11 WO PCT/SE1996/000943 (WO1997003516A1): Application Discontinuation
- 1996-07-11 AU AU63762/96A (AU6376296A): Abandoned
- 1996-07-11 JP JP9505759A (JPH11513205A): Pending
Also Published As
Publication number | Publication date |
---|---|
JPH11513205A (en) | 1999-11-09 |
AU6376296A (en) | 1997-02-10 |
WO1997003516A1 (en) | 1997-01-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6208761B1 (en) | | Video coding |
WO1997003516A1 (en) | | Video coding |
US4969040A (en) | | Apparatus and method for differential sub-band coding of video signals |
US6091777A (en) | | Continuously adaptive digital video compression system and method for a web streamer |
US4821119A (en) | | Method and apparatus for low bit-rate interframe video coding |
EP0903042B1 (en) | | Quantization matrix for still and moving picture coding |
US5235419A (en) | | Adaptive motion compensation using a plurality of motion compensators |
US5473377A (en) | | Method for quantizing intra-block DC transform coefficients using the human visual characteristics |
US6393060B1 (en) | | Video coding and decoding method and its apparatus |
Chin et al. | | A software-only videocodec using pixelwise conditional differential replenishment and perceptual enhancements |
CA2037444C (en) | | Video signal hybrid-coding systems |
EP0734164B1 (en) | | Video signal encoding method and apparatus having a classification device |
Schiller et al. | | Efficient coding of side information in a low bitrate hybrid image coder |
WO1996014711A1 (en) | | Method and apparatus for visual communications in a scalable network environment |
KR0181032B1 (en) | | Object-based encoding and apparatus using an interleaving technique |
US20050036549A1 (en) | | Method and apparatus for selection of scanning mode in dual pass encoding |
WO1991014295A1 (en) | | Digital image coding using a random scanning of image frames |
US8064516B2 (en) | | Text recognition during video compression |
EP0892557A1 (en) | | Image compression |
US6445823B1 (en) | | Image compression |
KR19980017213A (en) | | Image Decoding System with Compensation Function for Degraded Image |
US6956973B1 (en) | | Image compression |
Shyu et al. | | Detection and correction of transmission errors in facsimile images |
Netravali et al. | | A high quality digital HDTV codec |
EP0848557A2 (en) | | Subband image encoding method |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 19980127 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): DE FR GB NL SE |
17Q | First examination report despatched | Effective date: 19991216 |
APAB | Appeal dossier modified | Free format text: ORIGINAL CODE: EPIDOS NOAPE |
APAB | Appeal dossier modified | Free format text: ORIGINAL CODE: EPIDOS NOAPE |
APAD | Appeal reference recorded | Free format text: ORIGINAL CODE: EPIDOS REFNE |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL) |
APBX | Invitation to file observations in appeal sent | Free format text: ORIGINAL CODE: EPIDOSNOBA2E |
APBT | Appeal procedure closed | Free format text: ORIGINAL CODE: EPIDOSNNOA9E |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
APAF | Appeal reference modified | Free format text: ORIGINAL CODE: EPIDOSCREFNE |
18D | Application deemed to be withdrawn | Effective date: 20050604 |