WO2014041355A1 - Multi-view high dynamic range imaging - Google Patents
- Publication number
- WO2014041355A1 (PCT/GB2013/052390)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dynamic range
- video stream
- high dynamic
- relatively high
- relatively low
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/59—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
Definitions
- This invention relates to multi-view high dynamic range imaging, and more particularly to methods of producing compressed multiview, for example stereoscopic, images, either still or moving.
- Preferred implementations of the invention are concerned with compression techniques which are backwards compatible with established methods such as JPEG, MPEG or similar.
- Stereoscopic imagery and high dynamic range (HDR) imagery are relatively new imaging methods that both provide advantages over traditional monocular low dynamic range (LDR) imagery. Possibly due to their novelty and perhaps due to technical complications, these methods have not been combined together as yet; however, there is no reason to envisage that they cannot be complementary.
- SHDR imagery has the potential of providing richer fidelity when compared to the traditional LDR methods by combining the advantages of HDR and stereoscopic methods.
- compression methods that are not only efficient but also provide backward compatibility with the traditional JPEG LDR and JPEG HDR (as in WARD G.: JPEG-HDR: A backwards-compatible, high dynamic range extension to JPEG. ACM SIGGRAPH 2006 Courses (2006), 8) to improve the possibilities of early adoption.
- Stereoscopic (frequently called 3D) imaging is a technology that captures and displays images representing the views of the two human eyes.
- The advantages of such a method rely on enabling or improving the illusion of depth. It can also improve task performance and provide a strong cue for distance judgements.
- Stereoscopic imagery recently became more widely available as it has now become popular in the consumer market. Consumer products that both capture and display stereoscopic images and video can be purchased at reasonable price and are slowly becoming standard.
- The entertainment industry is also adopting these techniques, with more movies being released in 3D format and video games allowing 3D visualisation. Even some of the latest smartphones implement this technology.
- HDR imagery is a novel technology that has seen several advances over the past few years.
- HDR enables the capture, storage and display of real world luminance, thus producing images that are more representative of the real world. While a traditional LDR image is limited to around eight exposure stops, i.e. eight successive doublings of the amount of light, HDR does not have such limitations and is able to handle any luminance values that the human eye can see and more.
- HDR capture of static images has now become widespread, and is available on standard smartphones. HDR video is less common at present although some cameras exist.
- SHDR stereoscopic content in one of its stronger areas, depth perception, as it can provide further depth information via contrast.
- SHDR content thus has the potential of achieving the best of both worlds, resulting in images with higher fidelity than current imaging methods.
- One drawback of SHDR may be the size of images. SHDR images can be quite large if left uncompressed.
- A raw SHDR image at an HD resolution of 1,920 × 1,080 can be just short of 48 MB as it stores 96 bits per pixel. With the emergence of higher resolutions, such as super HD's 7,680 × 4,320 resolution, this would balloon to 759 MB.
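By way of illustration only, the figures quoted above can be reproduced with the short calculation below. It is a sketch under two assumptions that the text does not state explicitly: that an SHDR image holds two views, and that "MB" means 2^20 bytes. The function name `raw_size_bytes` is hypothetical.

```python
MIB = 1024 * 1024  # assumption: "MB" in the text means 2**20 bytes


def raw_size_bytes(width, height, views=2, bits_per_pixel=96):
    """Uncompressed size of a multi-view image with the given bit depth."""
    return width * height * views * bits_per_pixel // 8


hd = raw_size_bytes(1920, 1080)        # stereo pair at HD resolution
super_hd = raw_size_bytes(7680, 4320)  # stereo pair at 7,680 x 4,320

print(f"HD stereo pair:       {hd / MIB:.2f} MB")
print(f"Super HD stereo pair: {super_hd / MIB:.2f} MB")
```

Under these assumptions the HD pair comes to about 47.46 MB and the super HD pair to about 759.38 MB, which is consistent with the values given in the text.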
- Embodiments of the present invention provide the ability to efficiently store SHDR images.
- a number of novel methods are disclosed which make it possible to store SHDR images more efficiently, in some cases with sizes which are only marginally larger than traditional LDR images or videos.
- Some embodiments provide the ability of opening any SHDR image in a traditional LDR viewer and in a traditional HDR viewer, and some embodiments are backwards compatible with LDR stereo viewers.
- the invention provides a method of compressing first and second relatively high dynamic range original images, representing different views of the same scene, comprising the steps of tone mapping the first relatively high dynamic range original image to create a first relatively low dynamic range image; creating difference data which represents the difference between the first relatively low dynamic range image and the first relatively high dynamic range image; tone mapping the second relatively high dynamic range original image to create a second relatively low dynamic range image; using the first and second relatively low dynamic range images to create encoded image data representing the first and second relatively low dynamic range images; and creating a package comprising the encoded image data and the difference data.
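The steps of this method may be sketched as follows. This is a minimal illustration and not the claimed implementation: images are single-channel luminance grids held as nested lists, the tone mapping operator is a simple global curve L/(1+L) chosen only as a placeholder, the difference data is taken to be a per-pixel ratio, and the "encoding" step is a stand-in for a real codec such as JPEG. All function names are hypothetical.

```python
def tone_map(hdr):
    """Placeholder global operator L/(1+L), quantised to 8 bits (1..255)."""
    return [[max(1, round(255 * v / (1.0 + v))) for v in row] for row in hdr]


def difference_data(hdr, ldr):
    """Per-pixel ratio between the HDR original and its tone mapped version."""
    return [[h / l for h, l in zip(hr, lr)] for hr, lr in zip(hdr, ldr)]


def compress_pair(hdr_first, hdr_second):
    # Tone map the first view and record how it differs from its original.
    ldr_first = tone_map(hdr_first)
    diff_first = difference_data(hdr_first, ldr_first)
    # Tone map the second view; no difference data is created for it.
    ldr_second = tone_map(hdr_second)
    # Stand-in for the codec that encodes both low dynamic range images.
    encoded = {"first": ldr_first, "second": ldr_second}
    return {"encoded_image_data": encoded, "difference_data": diff_first}
```

Note that the package carries difference data for the first view only, which is what keeps its size close to that of an LDR stereo pair.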
- the first and second images may represent stereoscopic views, i.e. left and right images of the same scene.
- the method is carried out on a series of first original images constituting a first relatively high dynamic range original video stream, and a series of second original images constituting a second relatively high dynamic range original video stream, the first and second video streams representing different views of the same scene.
- a method of compressing first and second relatively high dynamic range original video streams, representing different views of the same scene comprising the steps of tone mapping the first relatively high dynamic range original video stream to create a first relatively low dynamic range video stream; creating difference data which represents the difference between the first relatively low dynamic range video stream and the first relatively high dynamic range video stream; tone mapping the second relatively high dynamic range original video stream to create a second relatively low dynamic range video stream; using the first and second relatively low dynamic range video streams to create encoded video data representing the first and second relatively low dynamic range video streams; and creating a package comprising the encoded video data and the difference data.
- the first and second relatively low dynamic range video streams are used to create a tone mapped multiview video encoded stream; and the package comprises the multiview video encoded stream and the difference data.
- the multiview video encoded stream may be created by using the first and second relatively low dynamic range video streams together to create a single encoded video stream using for example MVC (Multiview Video Coding) based on the H.264/MPEG-4 AVC, a block-oriented motion-compensation-based codec standard.
- the invention provides a decoding method for producing first and second original high dynamic range images each of which represents a different view of the same scene, from a package comprising encoded image data representing first and second relatively low dynamic range tone mapped images and difference data which represents the difference between the first relatively low dynamic range image and the first original relatively high dynamic range image, comprising the steps of decoding the encoded video data to generate first relatively low dynamic range image data representing a first view of the scene, and second relatively low dynamic range image data representing a second view of the scene; using the difference data and the first relatively low dynamic range image data to generate the first high dynamic range image data representing the first view of the scene; using the second relatively low dynamic image data and the difference data to create second difference data; and using the second relatively low dynamic range image data and the second difference data to generate the second high dynamic range image representing a second view of the scene.
- the first and second images may represent stereoscopic views, i.e. left and right images of the same scene.
- the method is carried out on a package comprising encoded video stream data representing first and second relatively low dynamic range tone mapped video streams and difference data which represents the difference between the first relatively low dynamic range video stream and a first original relatively high dynamic range video stream, to produce a series of first original images constituting the first relatively high dynamic range original video stream, and a series of second original images constituting a second relatively high dynamic range original video stream, the first and second video streams representing different views of the same scene.
- a method of producing first and second original high dynamic range video streams each of which represents a different view of the same scene from a package comprising a tone mapped encoded stream of relatively low dynamic range video stream data and difference data which represents the difference between one of the multiview streams and the
- the multiview video encoded stream to be decoded may have been created by using the first and second relatively low dynamic range video streams together to create a single encoded video stream using for example MVC (Multiview Video Coding) based on the H.264/MPEG-4 AVC.
- Various methods disclosed herein share the same concept of exploiting the natural spatial coherence that exists between the left and right views, so most methods either store both views or as much of one view as is possible and some form of extra information to construct the other view.
- When stored as raw data, considering by way of example only the situation when using RGB colour space, HDR images are generally considered to be composed of three floating point values, one for each of the red, green and blue channels, for a total of 96 bits per pixel (bpp); this results in 24 MB per frame at HD resolution. Similarly large image sizes will result if using other colour spaces such as HSV or any form of spectral data. Due to such prohibitively large sizes, a number of methods for storing HDR images exist. Ward (WARD G.: Real pixels. Graphics Gems II (1991), 80-83) introduced the RGBE format for use with the Radiance software.
- This format stores each of the red, green and blue values as an 8-bit mantissa and furthermore stores a single shared 8-bit exponent per pixel, for a total of 32 bpp.
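A shared-exponent encoding of this kind may be sketched as follows. The sketch follows the commonly described RGBE scheme (three 8-bit mantissas plus one exponent biased by 128). The function names are hypothetical, and the decoder omits the half-step rounding offset that some implementations add.

```python
import math


def float_to_rgbe(r, g, b):
    """Pack three non-negative floats into 8-bit mantissas and a shared exponent."""
    v = max(r, g, b)
    if v < 1e-32:
        return (0, 0, 0, 0)
    mantissa, exponent = math.frexp(v)  # v == mantissa * 2**exponent, 0.5 <= mantissa < 1
    scale = mantissa * 256.0 / v
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)


def rgbe_to_float(r, g, b, e):
    """Inverse of float_to_rgbe, up to quantisation of the mantissas."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e - (128 + 8))  # 2**(e - 128) / 256
    return (r * f, g * f, b * f)


packed = float_to_rgbe(1.0, 0.5, 0.25)
print(packed, rgbe_to_float(*packed))
```

The brightest channel fixes the exponent, so darker channels in the same pixel lose precision; this is the trade-off that reduces 96 bpp to 32 bpp.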
- The LogLuv format, also proposed by Ward, stores colour and luminance separately.
- LogLuv supports two formats, a 24-bpp format and a 32-bpp format.
- The 24-bpp format stores 10 bits for the luminance channel and 14 bits for the colour channels, but only supports just short of 5 orders of luminance magnitude.
- The 32-bpp format can achieve 38 orders of magnitude with 16 bits for chroma and 16 for luminance.
- The OpenEXR format is a common HDR format used by the entertainment industry, and stores each of the channels as 16-bit half precision floating point values for a total of 48 bpp. It is frequently further compressed using lossless methods.
- The above methods are considered storage methods for representing HDR.
- This method stores a tone mapped version of the HDR image which is encoded using JPEG and the extra information used to reconstruct the HDR image, termed the ratio image, is stored in a sub-band of the JPEG format.
- The ratio image can be downsampled significantly as the human visual system is not very sensitive to low frequency changes in luminance.
- A traditional JPEG viewer ignores the sub-band and displays the tone mapped image.
- The decoding process combines the tone mapped image with the sub-band ratio image (which is upsampled back to the original size) to recover the HDR image.
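The downsampling of the ratio image and its use on decoding may be illustrated as below. This is a sketch only: box-filter downsampling and nearest-neighbour upsampling are assumptions chosen for brevity, image dimensions are assumed to be divisible by the factor, and all function names are hypothetical.

```python
def downsample(img, f):
    """Average each f x f block (dimensions assumed divisible by f)."""
    h, w = len(img), len(img[0])
    return [[sum(img[y + dy][x + dx] for dy in range(f) for dx in range(f)) / (f * f)
             for x in range(0, w, f)]
            for y in range(0, h, f)]


def upsample(img, f):
    """Nearest-neighbour upsampling back to the original size."""
    return [[v for v in row for _ in range(f)] for row in img for _ in range(f)]


def recover_hdr(ldr, small_ratio, f):
    """Multiply the tone mapped image by the upsampled ratio image."""
    ratio = upsample(small_ratio, f)
    return [[l * r for l, r in zip(lr, rr)] for lr, rr in zip(ldr, ratio)]
```

A viewer that only understands the tone mapped image never calls `recover_hdr`, which is what gives the format its backward compatibility.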
- HDR-JPEG2000 (XU R., PATTANAIK S., HUGHES C.: High-dynamic-range still-image encoding in JPEG 2000. Computer Graphics and Applications, IEEE 25, 6 (2005), 57-64).
- HDR-JPEG2000 transforms the HDR data into the 16-bit unsigned shorts supported natively by JPEG2000.
- Okuda and Adami (OKUDA M., ADAMI N.: Two-layer coding algorithm for high dynamic range images based on luminance compensation. Journal of Visual Communication and Image
- Stereoscopic images in their most basic form are stored as two separate images. Perhaps the most popular stereo image format is JPEG stereo (JPS), which supports a number of layouts, storing images side by side, potentially at half the resolution, or interleaved.
- the JFIF JPEG File Interchange Format
- the main entry is used to store a tone mapped version of one of the stereo pairs (and for one embodiment a side by side version).
- This format provides storage channels for metadata. These are used to store the additional information, such as ratio images, disparity maps and motion compensation information, depending on the technique proposed, in order to restore the full SHDR content, as shall be discussed below. While the number (16) of metadata channels and their size (64 KB) are limited, they are sufficient for the proposed methods.
- this limitation can be overcome by using more storage channels which have the same identifier.
- Some embodiments aim to achieve a balance between image size, quality and backward compatibility. For videos there could be stored separate streams of data within a single media container, such as AVI.
- JPEG and MPEG are the coding methods used but the invention is applicable to other codecs. Where there are references to JPEG or MPEG below they extend to other still image or video codecs.
- a first technique aims to store images together before or after the compression process. Even though the preferred embodiment involves stereo, multi-view is achieved simply by increasing the number of images stored. While in the description of an embodiment of this technique images are stored side-by-side, the invention includes storing images in any predetermined spatiotemporal pattern which is used for reconstruction.
- images/frames can be stored next to each other horizontally (as in the SBS embodiment disclosed below), vertically (over-under), one after the other (time sequence) or interleaved in blocks of pixels (which could be of different sizes and even generated dynamically).
- A first embodiment of this technique (a "Side-by-Side" or SBS method) aims to preserve quality of the original image and minimise data loss at the expense of larger sizes compared to the other methods. It is also a good foundation for further examination as it provides an initial reference point in terms of quality which other approaches should try to attain.
- Both of the images in this case are coded using JPEG-HDR so that the quality of the restored image is kept high and further file size reduction is not considered.
- This method starts by appending the right HDR image of a stereo pair to the left one. The result is a single side-by-side HDR image. This image is then compressed using JPEG-HDR.
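The arrangement of two views into one image, and its reversal, may be sketched as follows. The example covers the side-by-side layout and, for comparison, a row-interleaved layout of the kind mentioned earlier. Images are nested lists of equal size and the function names are hypothetical.

```python
def pack_side_by_side(left, right):
    """Append each row of the right view to the matching row of the left view."""
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def unpack_side_by_side(packed):
    """Split a side-by-side image back into its two views."""
    half = len(packed[0]) // 2
    return [row[:half] for row in packed], [row[half:] for row in packed]


def pack_row_interleaved(left, right):
    """Alternate rows from the two views: left row 0, right row 0, left row 1, ..."""
    out = []
    for l_row, r_row in zip(left, right):
        out.append(l_row)
        out.append(r_row)
    return out


def unpack_row_interleaved(packed):
    """Even rows belong to the left view, odd rows to the right view."""
    return packed[0::2], packed[1::2]
```

Whichever pattern is used, it must be known to the decoder in advance so that the views can be separated again before reconstruction.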
- each image could be compressed first using JPEG-HDR, and then the results would get formatted as explained below.
- Data from the second image can be put into JPEG sub-bands making it available for traditional monocular image viewers which would show only one of the image pair.
- The tone-mapped images can be left side-by-side and saved as a stereo JPEG (JPS), which is then viewable in LDR stereo viewers. Leaving all or some (more than two) images like this in the multi-view case would not be compatible with LDR stereo viewers.
- This variant can also be opened in traditional viewers but all the views would be displayed. While such behaviour provides at least some insight into the content of the file, it may not be desirable for the user.
- SBSV side by side video
- SHDR MPEG stores double the number of streams, the one half containing streams of tone mapped images and the other streams containing ratio information. Encoding of the tone mapped streams will be conducted using MPEG.
- MPEG or other lossy or lossless methods can be used.
- the ratio streams can also be decreased in size before being compressed using filtering methods such as bilateral filters. This method is backward compatible with both stereo MPEG and HDR MPEG and consequently with traditional MPEG also.
- HSBS Half Side-by-Side
- The coding process involves resizing the images of the HDR pair as described and proceeds in the same manner as SBS.
- On decoding, the images are resized back up again.
- The image size is roughly halved but so is its resolution.
- This method is backward compatible with LDR stereo viewers. Images could be resized using any scaling factor, and not only 0.5.
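A half side-by-side arrangement may be sketched as below. This is an illustration under stated assumptions: the scaling factor is fixed at 0.5 in the horizontal direction, downscaling averages pixel pairs, upscaling repeats pixels, and image widths are even. The function names are hypothetical.

```python
def halve_width(img):
    """Average horizontal pixel pairs (width assumed even)."""
    return [[(row[x] + row[x + 1]) / 2 for x in range(0, len(row) - 1, 2)]
            for row in img]


def double_width(img):
    """Repeat each pixel horizontally to restore the original width."""
    return [[v for v in row for _ in range(2)] for row in img]


def encode_hsbs(left, right):
    """Half-width views placed side by side; output width equals one original view."""
    return [l + r for l, r in zip(halve_width(left), halve_width(right))]


def decode_hsbs(packed):
    """Split the packed image and resize each half back up."""
    half = len(packed[0]) // 2
    left = double_width([row[:half] for row in packed])
    right = double_width([row[half:] for row in packed])
    return left, right
```

The packed image has the pixel count of a single view, which is why the stored size is roughly halved relative to SBS at the cost of horizontal resolution.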
- HSBSV Half Side by Side Video
- one stream composed of side by side tone mapped images of the video stored at half vertical or horizontal resolution each and a second data stream with side by side ratio information also at the same half resolution.
- interlaced images in a side by side method by storing each line from each image alternately, or any of the image arrangements discussed earlier.
- the tone mapped image is encoded using MPEG and similar methods to what are described above are used for the ratio stream.
- HSBSV is backward compatible with stereo MPEG. It can also be viewed on traditional LDR displays but the images will be displayed side by side.
- any disparity map can be used with this method (calculated or captured); however the smooth low frequency ones are preferred because of better compression rates.
- The disparity maps are obtained employing the technique suggested by Mei et al. (MEI X., CUI C., SUN X., ZHOU M., WANG Q., WANG H.: On Building an Accurate Stereo Matching System on Graphics Hardware. In Computer Vision Workshops (ICCV Workshops) (2011), pp. 467-474), which is named AD-census (ADC).
- This method utilises the Graphics Processing Unit (GPU) during calculations leading to fast performance, and also the technique produces a low number of errors.
- An "Image Plus Disparity Video" (IPDV) method stores sources of data (in the preferred embodiment, at least three streams, although all the data can be stored in one or more streams arranged in any predefined manner as explained for the case of the SBS method) containing the tone mapped sequence representing one of the eye sequences, the ratio sequence of the same eye and the disparity.
- the disparity can be computed using any pixel or stereo correspondence methods or calculated as a depth map from hardware or computed using software.
- The tone mapped and ratio sequences are stored as above and the disparity data can be compressed using lossy or lossless compression, depending on the quality required.
- IPDC Image Plus Disparity with Corrections
- An alternative to the method used for IPD is to use a disparity map that maps only the closeness of the values, such as the sum of absolute differences (SAD) method (CYGANEK B., SIEBERT J. P.: An Introduction to 3D Computer Vision Techniques and Algorithms. John Wiley & Sons, Ltd, Chichester, UK, Jan. 2009. 4).
- Such disparity maps do without the smoothness condition that maps such as AD-census (ADC) use and are therefore higher frequency. The high frequency means that these disparity maps would not compress well if used by themselves.
- a method disclosed herein avoids the problems of both these methods by combining them and using SAD or other similar matching methods in regions with large differences only.
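A sum of absolute differences search of the kind referred to above may be sketched for a single scanline as follows. The convention that pixel x in the left view corresponds to pixel x - d in the right view, the window size and the search range are all assumptions made for illustration, and the function name is hypothetical. No smoothness constraint is applied, which is why the resulting maps tend to be high frequency.

```python
def sad_disparity(left_row, right_row, window=1, max_disp=4):
    """Per-pixel disparity minimising the sum of absolute differences."""
    n = len(left_row)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            cost = 0.0
            for k in range(-window, window + 1):
                xl = min(max(x + k, 0), n - 1)      # clamp at the borders
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left_row[xl] - right_row[xr])
            if cost < best_cost:                    # first minimum wins on ties
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


right_row = [0, 0, 9, 5, 2, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 9, 5, 2, 0, 0, 0]  # same feature shifted by two pixels
print(sad_disparity(left_row, right_row))
```

In flat regions every candidate disparity has the same cost, so the result there is arbitrary; only the textured pixels around the feature recover the two-pixel shift reliably.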
- a further method disclosed is based on the observation that views differ only by the camera position, which is not dissimilar from the temporal motion between subsequent frames in videos, and may thus be termed a "Motion Compensation" (MC) method.
- Known video coders are used to compress the SHDR image. Two views of a stereo pair (or multiple views in case of multiview) are treated as consecutive frames in the video which is then processed. Any video coder or camera motion compensation can be used here. In some embodiments H.264 codec is used.
- MMV Motion Compensation Video
- MPEG or similar codecs
- tone mapped streams are stored, one for each camera position; the first is compressed traditionally but the remainder use motion compensation information from the first also.
- Further streams are stored with the ratio sequence information, possibly using a similar method to the first.
- This method is backward compatible with HDR MPEG and traditional MPEG and also with stereo MPEG (in the case of two streams). It is possible to compress the first eye information by using the information in both sequences, such as is used by Multi View Codec in H.264, in which case the backward compatibility with HDR MPEG and traditional MPEG is compromised but it remains compatible with versions of stereo MPEG.
- any tone mapping and inverse tone mapping methods can be applied to these methods and would not affect the quality.
- the methods can be expanded for multiview HDR video by storing more than a single stream per view or using disparity similar to IPDV.
- SHDR Stereoscopic High Dynamic Range
- HDR High Dynamic Range
- Figure 1 illustrates a process for encoding a stereoscopic High Dynamic Range image using an Image Plus Disparity (IPD) technique
- Figure 2 illustrates a process for decoding the image using the Image Plus Disparity (IPD) technique
- Figure 3 illustrates part of a process for encoding a stereoscopic High Dynamic Range image using an Image Plus Disparity with Corrections (IPDC) technique
- Figure 4 illustrates a process for encoding a stereoscopic High Dynamic Range image using a Motion Compensation technique
- Figure 5 illustrates a process for decoding a stereoscopic High Dynamic Range image using the Motion Compensation technique
- Figure 6 illustrates a process for encoding a stereoscopic High Dynamic Range video stream using an Image Plus Disparity (IPD) technique
- Figure 7 illustrates part of a process for encoding a stereoscopic High Dynamic Range video stream using an Image Plus Disparity with Corrections (IPDC) technique.
- Figure 8 illustrates a process for decoding a stereoscopic High Dynamic Range video stream using an Image Plus Disparity (IPD) technique (or an Image Plus Disparity with Corrections (IPDC) technique)
- Figure 9 illustrates a process for encoding a stereoscopic High Dynamic Range video stream using a Multiview Video Coding technique;
- Figure 10 illustrates a process for decoding a stereoscopic High Dynamic Range video stream using the Multiview Video Coding technique
- Figure 11 illustrates a process for encoding a stereoscopic High Dynamic Range video stream using a two stream technique
- Figure 12 illustrates a process for decoding a stereoscopic High Dynamic Range video stream using the two stream technique
- Figure 13 illustrates a process for encoding a stereoscopic High Dynamic Range video stream using a Side by Side technique
- Figure 14 illustrates a process for decoding a stereoscopic High Dynamic Range video stream using the Side by Side technique
- Figure 15 illustrates a process for encoding a stereoscopic High Dynamic Range video stream using a Half Side by Side technique
- Figure 16 illustrates a process for decoding a stereoscopic High Dynamic Range video stream using the Half Side by Side technique.
- An encoding process using an IPD method is shown in Figure 1. It includes known HDR JPEG compression to generate a tone mapped image from one of a left and right image pair, in this case the left image, and a sub-band ratio image, which effectively will permit the original HDR left image to be retrieved from the tone mapped image if the equipment reading the coded image is compatible with HDR images.
- The other image of the pair, in this case the right image, is used together with the first image to generate a disparity map.
- This is compressed by lossless LZW and Huffman Coding, or another suitable method.
- This disparity data will be used to construct the right image from the left image, in stereoscopic compatible equipment.
- The disparity data is then formatted with the tone mapped left image and the ratio image to create the final coded image. Both the ratio image and the disparity data are in sub-bands.
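The lossless compression of the disparity map may be illustrated with a textbook LZW coder. The sketch below shows the LZW stage only and omits the subsequent Huffman coding of the output codes. Disparities are assumed to fit in one byte each and the function names are hypothetical.

```python
def lzw_compress(data):
    """Return a list of integer codes for the given bytes."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes from the code list."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):  # code refers to the entry being built
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


smooth_row = bytes([10] * 64 + [11] * 64)  # a smooth, low frequency disparity row
codes = lzw_compress(smooth_row)
print(len(smooth_row), "bytes ->", len(codes), "codes")
```

Long runs of identical values collapse into few codes, which illustrates why smooth, low frequency disparity maps are preferred for this method.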
- Figure 2 shows how the image is decoded.
- The ratio image is used together with the tone mapped image to create a restored HDR left image.
- The disparity data is decoded according to LZW and Huffman decoding to create a disparity map, which is used to warp the restored HDR left image data to create a restored HDR right image.
- This technique is therefore backwards compatible, as either the ratio image data or the disparity data, or both, can be used, or just the tone mapped image, depending on the capabilities of the display equipment.
- The output file size is not fixed but it is a fraction of the original image and can easily fit in JPEG sub-bands.
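The warping step used on decoding may be sketched for one scanline as follows. The sketch assumes integer disparities and the convention that pixel x of the left view maps to pixel x - d of the right view. Positions left unfilled by the warp are filled from the nearest filled pixel to their left, which is one simple choice among many. The function name is hypothetical.

```python
def warp_row(source_row, disparity_row):
    """Forward-warp one scanline of the restored left view into the right view."""
    n = len(source_row)
    target = [None] * n
    for x, (value, d) in enumerate(zip(source_row, disparity_row)):
        xt = x - d
        if 0 <= xt < n:
            target[xt] = value
    # Fill holes from the nearest filled pixel on the left.
    last = None
    for x in range(n):
        if target[x] is None:
            target[x] = last
        else:
            last = target[x]
    # Any holes at the start of the row take the first known value.
    first = next((v for v in target if v is not None), 0.0)
    return [first if v is None else v for v in target]


restored_left = [0, 0, 0, 0, 9, 5, 2, 0, 0, 0]
disparity = [2] * len(restored_left)
print(warp_row(restored_left, disparity))
```

With a constant disparity of two, the feature moves two pixels to the left and the two vacated positions at the end of the row are filled from their neighbour.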
- this method can be modified by using an IPDC method, in which the disparity map is generated differently.
- This process is shown in Figure 3.
- A high frequency disparity map is obtained from the original HDR left and right images using the SAD method or any other similar method.
- A low frequency disparity map is also obtained from the original HDR left and right images using the ADC method or other correspondence methods.
- One of the pair of images, in this case the right image, is warped using the low frequency disparity map to create a restored version of the other original image, in this case the left image.
- This is then used in conjunction with the original left image, for example by division, to create data representing the difference between the restored image and the original image, which may be termed a ratio image. Differences above an empirically obtained threshold are identified.
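The identification of regions needing correction may be sketched as below. The ratio is formed by division as described; the threshold value of 1.25, the symmetric test on the ratio and its reciprocal, and the small `eps` guard against division by zero are all assumptions for illustration, since the text states only that the threshold is obtained empirically. The function name is hypothetical.

```python
def correction_mask(original, restored, threshold=1.25, eps=1e-6):
    """Flag pixels whose original/restored ratio departs from 1 by more than the threshold."""
    mask = []
    for o_row, r_row in zip(original, restored):
        row = []
        for o, r in zip(o_row, r_row):
            ratio = (o + eps) / (r + eps)
            row.append(ratio > threshold or ratio < 1.0 / threshold)
        mask.append(row)
    return mask


original_left = [[1.0, 2.0, 4.0]]
restored_left = [[1.0, 2.1, 1.0]]  # last pixel was badly predicted by the warp
print(correction_mask(original_left, restored_left))
```

Only the flagged pixels would then need the high frequency disparity values, so most of the map stays smooth.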
- Figure 4 illustrates an MC process for creating a coded image from an original pair of left and right stereoscopic HDR images. Encoding starts by compressing the left and right images separately using JPEG-HDR. This creates a pair of tone mapped images, each with a sub-band ratio image. The two tone-mapped images are merged to create a two frame video, which is processed by the video encoder. One is treated as the key frame, in this case the left image.
- The other image, in this case the right image, is used to create predicted frame data.
- The second frame corresponds to the P-frame data.
- The left image is left stored in the original JPEG-HDR format and is not encoded in any other way, as are the sub-band ratio images for both the left and right images.
- The size of the extracted second frame data depends on the compression quality used, but for moderate compression values it is reasonably small and can fit in additional JPEG sub-bands together with the JPEG-HDR ratio images for both frames.
- The decoding process is the inverse of the coding technique, as shown in Figure 5.
- A video file consisting of two frames is generated using the tone-mapped image of the left frame, to which the second frame data is appended.
- This video is then decoded, which provides the second tone mapped image for the reconstruction using JPEG-HDR.
- Figure 6 is similar to the IPD embodiment of Figure 1, but shows the encoding of left and right video streams.
- Tone mapping is used to create tone mapped left video data from the left HDR video stream.
- The tone mapped stream is used together with the original left stream to create residual data representing the difference between the original and the tone mapped stream, and this is encoded.
- The right HDR video stream is used together with the original left HDR stream to create disparity data, which is also encoded.
- The resultant encoded SHDR stream comprises the encoded tone mapped left HDR data, the encoded residual data and the encoded disparity data.
- Figure 7 corresponds to the process of generating disparity data for an IPDC embodiment, described with reference to Figure 3, but for left and right HDR video streams, and thus shows in detail the steps marked by the dotted lines on Figure 6.
- Figure 8 shows the decoding method for an IPD or IPDC encoded video stream.
- The tone mapped stream is used together with decoded residual data to create decoded HDR left image data.
- Decoded disparity data is used together with the decoded left HDR data to create decoded right HDR data.
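- One way to picture how disparity data and the decoded left view yield the right view is a per-pixel horizontal shift. The sketch below assumes an integer disparity per pixel and the convention that the right-view pixel at column `x` is taken from column `x + d` of the left view, clamped at the image border; the sign convention, the clamping and the name `warp_left_to_right` are assumptions made for illustration.

```python
from typing import List


def warp_left_to_right(left: List[List[float]],
                       disparity: List[List[int]]) -> List[List[float]]:
    """Predict the right view by sampling the left view at x + disparity."""
    height, width = len(left), len(left[0])
    right = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Clamp the source column so border pixels are repeated.
            src = min(max(x + disparity[y][x], 0), width - 1)
            right[y][x] = left[y][src]
    return right


left_row = [[1.0, 2.0, 3.0, 4.0]]
shift_by_one = [[1, 1, 1, 1]]
right_row = warp_left_to_right(left_row, shift_by_one)
```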
- Figure 9 shows an encoding arrangement for a stereoscopic video stream with left and right HDR channels.
- The left and right HDR streams are both tone mapped and are then used together to create a single encoded video stream using, for example, MVC (Multiview Video Coding), which is based on H.264/MPEG-4 AVC, a block-oriented, motion-compensation-based codec standard.
- The tone mapped version of one stream, in this case the left, is compared with the original HDR stream to create residual data, which is then encoded so that the final encoded stream includes both the tone mapped video data and the residual data.
- The decoding process is shown in Figure 10.
- The tone mapped stream is decoded to create a left and a right tone mapped stream.
- The left tone mapped stream is then used together with the residual data stream to create the left HDR stream.
- The right tone mapped stream is first used with the residual data, which comes from the left stream, to create right residual data.
- The right residual data is used together with the right tone mapped stream to create the right HDR stream.
- Figure 11 shows an encoding arrangement for a stereoscopic video stream with left and right HDR channels.
- The left and right HDR streams are both tone mapped and are then used together to create an encoded video stream with separate left and right tone mapped data.
- The tone mapped version of one stream, in this case the left, is compared with the original HDR stream to create residual data, which is then encoded so that the final encoded stream includes both the tone mapped video data and the residual data.
- The decoding process is shown in Figure 12.
- The data is decoded to create a left and a right tone mapped stream.
- The left tone mapped stream is then used together with the residual data stream to create the left HDR stream.
- The right tone mapped stream is first used with the residual data, which comes from the left stream, to create right residual data.
- The right residual data is used together with the right tone mapped stream to create the right HDR stream.
- Figure 13 shows a Side by Side (SBS) encoding method for a stereoscopic video stream with left and right HDR channels.
- The decoding method is shown in Figure 14.
- The tone mapped stream is separated into left and right tone mapped streams, and the residual stream is separated into left and right residual streams.
- Each tone mapped stream is used with its residual stream, to create decoded left and right HDR streams.
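- The separation step of the SBS decoder can be sketched as splitting each frame down the middle. The sketch assumes the encoder placed the left frame in the left half and the right frame in the right half at full width; `pack_sbs` and `unpack_sbs` are hypothetical names, and the same split would apply to both the tone mapped frames and the residual frames.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of pixel values (illustrative)


def pack_sbs(left: Frame, right: Frame) -> Frame:
    """Append each right row to the matching left row (full width kept)."""
    return [lrow + rrow for lrow, rrow in zip(left, right)]


def unpack_sbs(frame: Frame) -> Tuple[Frame, Frame]:
    """Split a side-by-side frame back into its left and right halves."""
    half = len(frame[0]) // 2
    return [row[:half] for row in frame], [row[half:] for row in frame]


sbs_frame = pack_sbs([[1, 2], [3, 4]], [[5, 6], [7, 8]])
left_view, right_view = unpack_sbs(sbs_frame)
```

- Because no resizing is involved, the round trip is exact, at the cost of a frame twice as wide as a single view.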
- Figure 15 shows a Half Side by Side (HSBS) encoding method for a stereoscopic video stream with left and right HDR channels. The two channels are resized and one is appended to the other to create an HSBS HDR stream. This is then tone mapped to create a tone mapped HSBS stream, and a residual HSBS stream which are combined.
- Decoding is shown in Figure 16.
- The tone mapped HSBS stream and the residual HSBS stream are combined to create an HDR HSBS stream.
- The left and right images are separated and resized to create left and right HDR channels again.
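- The resize, append, separate and resize steps of the HSBS method can be sketched as follows. The resampling filter is not specified in the text, so this sketch assumes the crudest option: every second column is kept when halving, and each column is repeated when restoring. All function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[float]]  # rows of pixel values (illustrative)


def halve_width(img: Frame) -> Frame:
    """Keep every second column (assumed nearest-neighbour decimation)."""
    return [row[::2] for row in img]


def double_width(img: Frame) -> Frame:
    """Repeat every column once (assumed nearest-neighbour upscaling)."""
    return [[v for v in row for _ in range(2)] for row in img]


def pack_hsbs(left: Frame, right: Frame) -> Frame:
    """Halve both views and place them side by side in one frame."""
    return [l + r for l, r in zip(halve_width(left), halve_width(right))]


def unpack_hsbs(frame: Frame) -> Tuple[Frame, Frame]:
    """Split the frame and resize each half back to full width."""
    half = len(frame[0]) // 2
    left = double_width([row[:half] for row in frame])
    right = double_width([row[half:] for row in frame])
    return left, right


hsbs = pack_hsbs([[1.0, 2.0, 3.0, 4.0]], [[5.0, 6.0, 7.0, 8.0]])
left_out, right_out = unpack_hsbs(hsbs)
```

- The packed frame has the width of a single view, but the discarded columns cannot be recovered, so the decoded views only approximate the originals.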
- file size of the compressed SHDR image in kilobytes
- NRMSE normalised root-mean-square error
- PSNR peak signal-to-noise ratio
- The compression ratio gives the average compression ratio compared to a raw HD stereo image.
- The “quality × compression” measure is the average image size for each method multiplied by the average NRMSE. This value is presented to give an idea of which method gives the best “bang for the buck”, but should not be taken as a definitive measure, as the different methods have distinct qualities. For “quality × compression”, smaller values are considered better.
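- The measures named above can be computed as in the following sketch. The normalisation used for NRMSE is not stated in the text, so division by the range of the reference values is an assumption, as is the choice of `peak` for PSNR. The numbers in the usage lines are toy values for illustration only and are not results from the evaluation.

```python
import math
from typing import Sequence


def nrmse(ref: Sequence[float], test: Sequence[float]) -> float:
    """RMSE divided by the reference range (assumed normalisation)."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    return math.sqrt(mse) / (max(ref) - min(ref))


def psnr(ref: Sequence[float], test: Sequence[float], peak: float) -> float:
    """Peak signal-to-noise ratio in decibels for a given peak value."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def quality_x_compression(avg_size_kb: float, avg_nrmse: float) -> float:
    """Average file size multiplied by average NRMSE; smaller is better."""
    return avg_size_kb * avg_nrmse


reference = [0.0, 1.0, 2.0, 4.0]   # toy pixel values
decoded = [0.0, 1.0, 2.0, 2.0]     # toy decoded values
error = nrmse(reference, decoded)
score = quality_x_compression(200.0, error)  # 200 kB is a made-up size
```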
- The MC method achieves the best overall “quality × compression”; it also has the added advantage that it is backward compatible with JPEG-HDR and JPEG.
- The HSBS method achieves the second best overall score. This is primarily due to it having the highest compression ratio, but this comes at the expense of quality, for which it is second from the bottom. It is backward compatible with LDR stereo JPEG but not fully compatible with JPEG-HDR and traditional JPEG; the image shown on a traditional JPEG viewer would be both images side by side.
- The SBS method is third overall and is backward compatible with all possible formats; however, it is the largest in size, which may be too much of an obstacle, potentially hampering the uptake of SHDR.
- The disparity methods IPDC and IPD are second from last and last respectively. However, the results depend on the quality of the disparity maps produced, and as this is an active research area, the maps may improve and the scope of such disparity-based methods is likely to widen. In addition, these methods are backward compatible with traditional JPEG and HDR JPEG and produce relatively small image sizes, not much larger than JPEG-HDR images.
- SHDR as a combination of stereoscopic and high dynamic range imaging.
- SHDR compression methods based on the JPEG standard and the JPEG HDR compression method, and based on the MPEG standard.
- The MC, SBS, IPDC and IPD methods are all backward compatible with traditional JPEG and JPEG-HDR, and while the results demonstrate a better overall performance for MC, the disparity-based methods may improve with time as disparity methods improve.
- The SBS and HSBS methods are backward compatible with stereo JPEG and could be used on occasions where this is sufficient.
- The backward compatibility constraint that has been imposed may be removed, although removing it may be more appropriate once the technology is widely adopted and if substantial improvements are possible.
- The methods have been evaluated with NRMSE and PSNR, since no perceptual metric currently exists for evaluating SHDR.
- The invention is not limited to JPEG and MPEG, and the encoded files could be made compatible or backwards compatible with equipment designed to handle images or videos which have been encoded using other codecs.
- the images may be static images or frames of a video stream.
- a method of producing a plurality of relatively high dynamic range video streams, each of which represents a different view of the same scene, from relatively low dynamic range video stream data and difference data which represents the difference between the relatively low dynamic range video stream data and original relatively high dynamic range video stream data.
- first and second relatively high dynamic range original images representing different views of the same scene
- a method of compressing first and second relatively high dynamic range original images comprising the steps of generating disparity data representing the disparity between the first relatively high dynamic range original image and the second relatively high dynamic range original image; tone mapping the first relatively high dynamic range original image to create a relatively low dynamic range image; creating difference data which represents the difference between the relatively low dynamic range image and the first relatively high dynamic range image; and creating a package comprising the relatively low dynamic range image, the difference data, and the disparity data.
- the disparity data may be compressed and included in the package in compressed form.
- the difference data may be compressed and included in the package in compressed form.
- the images may be still images or image frames of a video stream.
- the invention provides a method of decoding the package to generate the first and second relatively high dynamic range images.
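- The package of a relatively low dynamic range image, difference data and disparity data, with the latter two optionally compressed, can be pictured with the sketch below. The container layout is purely hypothetical: the class `ShdrPackage`, the JSON serialisation and the use of zlib are assumptions made for illustration and are not the format described in the text, which refers to JPEG sub-bands.

```python
import json
import zlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ShdrPackage:
    """Hypothetical container for one encoded stereo HDR image."""
    ldr_image: bytes        # raw 8-bit tone mapped pixels, row by row
    difference_data: bytes  # zlib-compressed JSON
    disparity_data: bytes   # zlib-compressed JSON


def build_package(ldr: List[List[int]],
                  difference: List[List[float]],
                  disparity: List[List[int]]) -> ShdrPackage:
    """Serialise the three components, compressing the side data."""
    return ShdrPackage(
        ldr_image=bytes(v for row in ldr for v in row),
        difference_data=zlib.compress(json.dumps(difference).encode()),
        disparity_data=zlib.compress(json.dumps(disparity).encode()),
    )


def read_side_data(pkg: ShdrPackage) -> Tuple[list, list]:
    """Decompress and parse the difference and disparity data."""
    difference = json.loads(zlib.decompress(pkg.difference_data))
    disparity = json.loads(zlib.decompress(pkg.disparity_data))
    return difference, disparity


package = build_package([[10, 200]], [[0.5, 1.25]], [[1, -2]])
diff_back, disp_back = read_side_data(package)
```

- A viewer that only understands the low dynamic range image could read `ldr_image` and ignore the other two fields, which is the essence of the backward compatibility discussed in the description.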
- a preferred arrangement for generating disparity data representing the difference between the first relatively high dynamic range original images and another one of the relatively high dynamic range original images comprises the steps of:
- a method of compressing first and second relatively high dynamic range original video streams, representing different views of the same scene comprising the steps of generating disparity data representing the disparity between the first relatively high dynamic range original video stream and the second relatively high dynamic range original video stream; tone mapping the first relatively high dynamic range original video stream to create a relatively low dynamic range video stream; creating difference data which represents the difference between the relatively low dynamic range video stream and the first relatively high dynamic range video stream; and creating a package comprising the relatively low dynamic range video stream, the difference data, and the disparity data.
- the disparity data may be compressed and included in the package in compressed form.
- the difference data may be compressed and included in the package in compressed form.
- the invention provides a method of decoding the package to generate the first and second relatively high dynamic range video streams.
- a preferred arrangement for generating disparity data representing the difference between the first relatively high dynamic range original video stream and another one of the relatively high dynamic range original video streams comprises the steps of: generating a first, relatively high frequency, disparity map representing the differences between said first relatively high dynamic range original stream and said other of the relatively high dynamic range original video stream; generating a second, relatively low frequency, disparity map representing the differences between said first relatively high dynamic range original video stream and said other of the relatively high dynamic range original video streams; using the second disparity map and said other of the relatively high dynamic range original video streams to create a restored version of said first relatively high dynamic range original video stream; comparing said first relatively high dynamic range original video stream and the restored version of said first relatively high dynamic range original video stream to create ratio data; and using the first disparity map, the second disparity map and the ratio data to generate a combined disparity map.
- the invention provides a method of compressing first and second relatively high dynamic range original images, representing different views of the same scene, comprising the steps of : tone mapping the first relatively high dynamic range original image to create a first relatively low dynamic range image; creating first difference data which represents the difference between the first relatively low dynamic range image and the first relatively high dynamic range image; tone mapping the second relatively high dynamic range original image to create a second relatively low dynamic range image; creating second difference data which represents the difference between the relatively low dynamic range image and said one of the relatively high dynamic range images; using video encoding of the first relatively low dynamic range image and the second relatively low dynamic range image to create predicted image data in respect of the second relatively low dynamic range image; and creating a package comprising the first relatively low dynamic range image, the first difference data, the predicted image data, and the second difference data.
- the invention provides a method of decoding the package to generate the first and second relatively high dynamic range images.
- a method of compressing first and second relatively high dynamic range original video streams, representing different views of the same scene comprising the steps of tone mapping the first relatively high dynamic range original video stream to create a first relatively low dynamic range video stream; creating difference data which represents the difference between the first relatively low dynamic range video stream and the first relatively high dynamic range video stream; tone mapping the second relatively high dynamic range original video stream to create a second relatively low dynamic range video stream; using the first and second relatively low dynamic range video streams to create a tone mapped multiview video encoded stream; and creating a package comprising the multiview video encoded stream and the difference data.
- the difference data may be compressed and included in the package in compressed form.
- the invention provides a method of decoding the package to generate the first and second relatively high dynamic range video streams.
- the invention provides a method of producing first and second original high dynamic range video streams each of which represents a different view of the same scene, from a package comprising a tone mapped multiview video encoded stream of relatively low dynamic range video stream data and difference data which represents the difference between one of the multiview streams and the corresponding original relatively high dynamic range video stream data, comprising the steps of decoding the multiview video encoded stream to generate first relatively low dynamic range video stream data representing the first view of the scene, and second relatively low dynamic range video stream data representing the second view of the scene; using the difference data and the first relatively low dynamic range video stream data to generate the first high dynamic range video stream representing the first view of the scene; using the second relatively low dynamic range video stream data and the difference data to create second difference data; and using the second relatively low dynamic range video stream data and the second difference data to generate the second high dynamic range video stream representing a second view of the scene.
- a method of compressing first and second relatively high dynamic range original video streams, representing different views of the same scene comprising the steps of tone mapping the first relatively high dynamic range original video stream to create a first relatively low dynamic range video stream; creating difference data which represents the difference between the first relatively low dynamic range video stream and the first relatively high dynamic range video stream; tone mapping the second relatively high dynamic range original video stream to create a second relatively low dynamic range video stream; using the first and second relatively low dynamic range video streams to create tone mapped video encoded streams; and creating a package comprising the encoded streams and the difference data.
- second difference data which represents the difference between the second relatively low dynamic range video stream and the second relatively high dynamic range video stream
- the package comprises the encoded streams, the difference data and the second difference data
- the difference data may be compressed and included in the package in compressed form.
- the invention provides a method of decoding the package to generate the first and second relatively high dynamic range video streams.
- the method of encoding corresponds to the previous method, save that the package includes the separate streams rather than a multiview stream.
- the second relatively high dynamic range video stream is re-created using the second difference data.
- the invention provides a method of compressing first and second relatively high dynamic range original video streams, representing different views of the same scene, comprising the steps of tone mapping the first relatively high dynamic range original video stream to create a first relatively low dynamic range video stream; creating first difference data which represents the difference between the first relatively low dynamic range video stream and the first relatively high dynamic range video stream; tone mapping the second relatively high dynamic range original video stream to create a second relatively low dynamic range video stream; creating second difference data which represents the difference between the second relatively low dynamic range video stream and the second relatively high dynamic range video stream; appending the second relatively low dynamic range video stream to the first relatively low dynamic range video stream to create combined video data; appending the second difference data to the first difference data to create combined difference data; and creating a package comprising the combined video data and the combined difference data.
- the difference data may be compressed and included in the package in compressed form.
- the invention provides a method of decoding the package to generate the first and second relatively high dynamic range video streams.
- the combined video data is separated into first and second relatively low dynamic range video streams; and the combined difference data is separated into first and second difference data.
- the first and second relatively high dynamic range video streams can then be created.
- the invention provides a method of compressing first and second relatively high dynamic range original video streams, representing different views of the same scene, comprising the steps of resizing the first and second relatively high dynamic range original video streams and combining them to create a combined relatively high dynamic range video stream; tone mapping the combined relatively high dynamic range video stream to create a combined relatively low dynamic range video stream; creating difference data which represents the difference between the combined relatively low dynamic range video stream and the combined relatively high dynamic range video stream; and creating a package comprising the combined relatively low dynamic range video stream and the difference data.
- the difference data may be compressed and included in the package in compressed form.
- the invention provides a method of decoding the package to generate the first and second relatively high dynamic range video streams.
- the combined relatively low dynamic range video stream is used with the difference data to re-create the combined relatively high dynamic range video stream, and this is then separated into two streams which are re-sized to create the first and second relatively high dynamic range video streams.
- the invention provides a method of encoding; a method of decoding; an encoder; a decoder; data processing apparatus configured to carry out the method of encoding; data processing apparatus configured to carry out the method of decoding; computer software for configuring data processing apparatus to carry out the method of encoding; and computer software for configuring data processing apparatus to carry out the method of decoding.
- the computer program may be in tangible form, such as data encoded on a DVD or a solid state memory device, or may be intangible, in the form of signals transmitted over a network.
- the encoding process involves not only tone mapping of an HDR image or a video stream, but also compression of the tone mapped image or video stream.
- decoding involves de-compression of an image or a video stream, and not just reversal of the tone mapping process to provide an HDR image or video stream.
- Techniques can be applied to still images, video streams or frames of video streams, as appropriate for the method used. Whilst backwards compatibility is an advantage of the embodiments described, it is not an absolute requirement of the inventions in their broadest sense.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1604165.9A GB2534061B (en) | 2012-09-12 | 2013-09-12 | Multi-view high dynamic range imaging |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GBGB1216263.2A GB201216263D0 (en) | 2012-09-12 | 2012-09-12 | Multi-view high dynamic range imaging |
GB1216263.2 | 2012-09-12 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014041355A1 (en) | 2014-03-20 |
Family
ID=47137324
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/GB2013/052390 WO2014041355A1 (en) | 2012-09-12 | 2013-09-12 | Multi-view high dynamic range imaging |
Country Status (2)
Country | Link |
---|---|
GB (2) | GB201216263D0 (en) |
WO (1) | WO2014041355A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1997043863A1 (en) * | 1996-05-15 | 1997-11-20 | Deutsche Telekom Ag | Process for storage or transmission of stereoscopic video signals |
EP1827024A1 (en) * | 2006-02-24 | 2007-08-29 | Sharp Kabushiki Kaisha | High dynamic range video coding |
WO2011037933A1 (en) * | 2009-09-22 | 2011-03-31 | Panasonic Corporation | Image coding apparatus, image decoding apparatus, image coding method, and image decoding method |
WO2012004741A1 (en) * | 2010-07-06 | 2012-01-12 | Koninklijke Philips Electronics N.V. | Generation of high dynamic range images from low dynamic range images in multi-view video coding |
- 2012-09-12: GB application GBGB1216263.2A (GB201216263D0), not active (Ceased)
- 2013-09-12: WO application PCT/GB2013/052390 (WO2014041355A1), active (Application Filing)
- 2013-09-12: GB application GB1604165.9A (GB2534061B), not active (Expired - Fee Related)
Non-Patent Citations (3)
Title |
---|
ANTHONY VETRO ET AL: "Overview of the Stereo and Multiview Video Coding Extensions of the H.264/MPEG-4 AVC Standard", PROCEEDINGS OF THE IEEE, IEEE. NEW YORK, US, vol. 99, no. 4, 1 April 2011 (2011-04-01), pages 626 - 642, XP011363626, ISSN: 0018-9219, DOI: 10.1109/JPROC.2010.2098830 * |
SELMANOVIC E ET AL: "Backwards Compatible JPEG Stereoscopic High Dynamic Range Imaging", PREPRINTS OF CONFERENCE PROCEEDINGS. EG UK THEORY AND PRACTICE OF COMPUTER GRAPHICS, RUTHERFORD APPLETON LABORATORY, DIDCOT, UK, SEPTEMBER 13-14, 2012,, 13 September 2012 (2012-09-13), pages 1 - 8, XP008165726 * |
WARD GREG ET AL: "JPEG-HDR: A BACKWARDS-COMPATIBLE, HIGH DYNAMIC RANGE EXTENSION TO JPEG", PROCEEDINGS OF THE ANNUAL ACM SYMPOSIUM ON THE THEORY OFCOMPUTING, XX, XX, 1 January 2005 (2005-01-01), pages 283 - 290, XP008079630 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9275445B2 (en) | 2013-08-26 | 2016-03-01 | Disney Enterprises, Inc. | High dynamic range and tone mapping imaging techniques |
CN110770787A (en) * | 2017-06-16 | 2020-02-07 | 杜比实验室特许公司 | Efficient end-to-end single-layer reverse display management coding |
CN110770787B (en) * | 2017-06-16 | 2023-04-07 | 杜比实验室特许公司 | Efficient end-to-end single-layer reverse display management coding |
CN110910336A (en) * | 2019-10-30 | 2020-03-24 | 宁波大学 | Three-dimensional high dynamic range imaging method based on full convolution neural network |
CN110910336B (en) * | 2019-10-30 | 2022-08-30 | 宁波大学 | Three-dimensional high dynamic range imaging method based on full convolution neural network |
WO2022066353A1 (en) * | 2020-09-23 | 2022-03-31 | Qualcomm Incorporated | Image signal processing in multi-camera system |
WO2022228368A1 (en) * | 2021-04-30 | 2022-11-03 | 华为技术有限公司 | Image processing method, device and system |
WO2023215108A1 (en) * | 2022-05-05 | 2023-11-09 | Dolby Laboratories Licensing Corporation | Stereoscopic high dynamic range video |
Also Published As
Publication number | Publication date |
---|---|
GB201216263D0 (en) | 2012-10-24 |
GB201604165D0 (en) | 2016-04-27 |
GB2534061A (en) | 2016-07-13 |
GB2534061B (en) | 2018-04-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13766629 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WPC | Withdrawal of priority claims after completion of the technical preparations for international publication |
Ref document number: 1216263.2 Country of ref document: GB Date of ref document: 20150312 Free format text: WITHDRAWN AFTER TECHNICAL PREPARATION FINISHED |
|
ENP | Entry into the national phase |
Ref document number: 201604165 Country of ref document: GB Kind code of ref document: A Free format text: PCT FILING DATE = 20130912 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13766629 Country of ref document: EP Kind code of ref document: A1 |