WO2024096931A1 - High dynamic range image format with low dynamic range compatibility - Google Patents
- Publication number
- WO2024096931A1 (PCT/US2023/023998)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- dynamic range
- display
- recovery
- luminance
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20172—Image enhancement details
- G06T2207/20208—High dynamic range [HDR] image processing
Definitions
- High dynamic range (HDR) images can be captured by some cameras; HDR images offer a greater dynamic range and more true-to-life picture quality than low dynamic range (LDR) images captured by many other (e.g., older) cameras.
- HDR images are best viewed by a display device capable of displaying the full dynamic ranges of the HDR images.
- a computer-implemented method includes obtaining a first image depicting a particular scene, the first image having a first dynamic range; obtaining a second image depicting the particular scene, the second image having a second dynamic range that is different than the first dynamic range; and generating a recovery map based on the first image and the second image.
- the recovery map encodes differences in luminances between portions of the first image and corresponding portions of the second image, the differences scaled by a range scaling factor that includes a ratio of a maximum luminance of the second image to a maximum luminance of the first image.
- the first image and the recovery map are provided in an image container that is readable to display a derived image based on applying the recovery map to the first image, and the derived image has a dynamic range that is different than the first dynamic range.
- the second dynamic range is greater than the first dynamic range and the dynamic range of the derived image is greater than the first dynamic range.
- the image container is readable to display the first image by a first display device capable of displaying the first dynamic range, and to display the derived image by a second display device capable of displaying a dynamic range greater than the first dynamic range.
- obtaining the first image comprises performing range compression on the second image.
- generating the recovery map includes encoding luminance gains such that applying the luminance gains to luminances of individual pixels of the first image results in pixels corresponding to the second image.
- generating the recovery map can include encoding the luminance gains in a logarithmic space, and recovery map values are proportional to the difference of logarithms of the luminances divided by a logarithm of the range scaling factor.
- generating the recovery map includes encoding the recovery map into a bilateral grid, which in some examples includes determining a three-dimensional data structure of grid cells, each grid cell being a vector element that is mapped to multiple pixels of the first image.
- the method further includes encoding the recovery map into a recovery image that is provided in the image container.
- the range scaling factor is encoded into the recovery image as metadata.
- the recovery image has the same aspect ratio as the aspect ratio of the first image.
- the method further includes obtaining the image container; determining to display the second image by a display device; scaling a plurality of pixel luminances of the first image in the image container based on a particular luminance output of the display device and based on the recovery map to obtain the derived image; and after the scaling, causing the derived image to be displayed by the display device as an output image that has a greater dynamic range than the first image.
- the method includes determining a maximum luminance display capability of the display device, and the scaling of the pixel luminances includes increasing luminances of highlights in the first image to a luminance level that is less than or equal to the maximum luminance display capability.
- the luminances of the derived image are increased up to a maximum luminance of the second image.
- scaling the plurality of pixel luminances includes increasing a range of shadows in the first image to a level that the display device is capable of displaying.
- the second dynamic range is lower than the first dynamic range and the dynamic range of the derived image is lower than the first dynamic range.
- obtaining the second image comprises performing range compression on the first image.
- the image container is readable to display the first image by a first display device capable of displaying the first dynamic range, and to display the derived image by a second display device only capable of displaying a dynamic range lower than the first dynamic range.
- a computer-implemented method includes obtaining an image container that includes a first image having a first dynamic range and a recovery map.
- the recovery map encodes luminance gains of pixels of the first image that are scaled by a range scaling factor that includes a ratio of a maximum luminance of a second image to a maximum luminance of the first image.
- the second image corresponds to the first image in depicted subject matter and has a second dynamic range that is different than the first dynamic range.
- the method determines whether to display one of: the first image or a derived image having a dynamic range greater than the first dynamic range.
- the first image is caused to be displayed by a display device.
- the luminance gains of the recovery map are applied to luminances of pixels of the first image to determine respective corresponding pixel values of the derived image, and the derived image is caused to be displayed by the display device.
- the second dynamic range is greater than the first dynamic range and the dynamic range of the derived image is greater than the first dynamic range.
- the method includes determining a maximum luminance display capability of the display device. In some implementations, determining to display the first image is in response to determining that the display device is capable of displaying only a display dynamic range that is equal to or less than the first dynamic range. In some implementations, determining to display the derived image is in response to determining that the display device is capable of displaying a display dynamic range that is greater than the first dynamic range. In some implementations, the display device is capable of displaying a display dynamic range that is greater than the first dynamic range, and applying the gains of the recovery map includes adapting the luminances of the pixel values of the derived image to the display dynamic range of the display device.
- the display device is capable of displaying a display dynamic range that is greater than the first dynamic range, and applying the gains of the recovery map includes adapting the luminances of the pixel values of the derived image to the display dynamic range of the display device.
- the dynamic range of the derived image has a maximum that is the lesser of: a maximum luminance of the display device, and a maximum luminance of a dynamic range of the second image.
- applying the gains of the recovery map includes scaling the luminances of the first image based on a particular luminance output of the display device and based on the recovery map.
- the method further includes, in response to determining to display the derived image, decoding the range scaling factor from the image container. In some implementations, the method further includes decoding the recovery map from a recovery image included in the image container. In some implementations, the method further includes extracting the recovery map from a bilateral grid stored in the image container.
- causing the derived image to be displayed by the display device includes reducing a maximum luminance level of the displayed derived image based on a system setting of a device that causes the derived image to be displayed, wherein the system setting is selected by a user.
- causing the derived image to be displayed by the display device includes: causing the derived image to be displayed with a first maximum luminance level that is below a second maximum luminance level determined for the pixel values of the derived image by applying the luminance gains, and gradually increasing the first maximum luminance level of the derived image over a particular period of time up to the second maximum luminance level.
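- As an illustrative sketch of such a gradual transition (Python; render_frame is a hypothetical callback that re-applies the recovery map at a given maximum luminance ceiling and displays the result — the patent does not prescribe an API):

```python
import time

def ramp_derived_image_brightness(render_frame, start_nits, target_nits,
                                  duration_s=1.0, steps=30):
    """Gradually raise the maximum luminance of the displayed derived image.

    The luminance ceiling ramps from start_nits up to target_nits (the level
    determined by applying the luminance gains of the recovery map) over
    duration_s seconds; render_frame(max_nits) re-renders at each ceiling.
    """
    for i in range(steps + 1):
        t = i / steps                                   # progress 0.0 .. 1.0
        max_nits = start_nits + t * (target_nits - start_nits)
        render_frame(max_nits)                          # render at this ceiling
        time.sleep(duration_s / steps)
```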
- a system includes a processor; and a memory coupled to the processor, with instructions stored thereon that, when executed by the processor, cause the processor to perform operations including obtaining an image container that includes a first image having a first dynamic range and a recovery map that encodes luminance gains of pixels of the first image.
- the method determines to display a derived image on a display device, wherein the derived image corresponds to the first image in depicted subject matter and has a dynamic range different than the first dynamic range.
- the gains of the recovery map are applied to luminances of pixels of the first image to determine respective corresponding pixel values of the derived image, and the luminances of the first image are scaled based on a particular luminance output of the display device and based on the recovery map.
- the derived image is caused to be displayed by the display device.
- the dynamic range of the derived image is greater than the first dynamic range
- the instructions cause the processor to perform operations further including determining a maximum luminance display capability of the display device, wherein determining to display the derived image is in response to determining that the display device is capable of displaying a display dynamic range that is greater than the first dynamic range.
- the dynamic range of the derived image has a maximum that is the lesser of: a maximum luminance of the display device, and a maximum luminance of a dynamic range of an original image used in generation of the recovery map.
- the recovery map encodes luminance gains of pixels of the first image that are scaled by a range scaling factor that includes a ratio of a maximum luminance of a second image to a maximum luminance of the first image, wherein the second image corresponds to the first image in depicted subject matter and has a second dynamic range that is greater than the first dynamic range.
- the processor performs further operations including extracting the recovery map from a bilateral grid stored in the image container.
- the system includes one or more of the operations and/or features of the methods described above.
- Some implementations may include a computing device that includes a processor and a memory coupled to the processor.
- the memory has instructions stored thereon that, when executed by the processor, cause the processor to perform operations that include one or more of the operations and/or features of the methods described above.
- Some implementations include a non-transitory computer-readable medium with instructions stored thereon that, when executed by a processor, cause the processor to perform operations that can be similar to operations and/or features of the methods, systems, and/or computing devices described above.
- Fig. 1 is a block diagram of an example network environment which may be used for one or more implementations described herein.
- Fig. 2 is a flow diagram illustrating an example method to encode an image in a backward-compatible high dynamic range image format, according to some implementations.
- Fig. 3 is a flow diagram illustrating an example method to generate a recovery map based on an HDR image and an LDR image, according to some implementations.
- Fig. 4 is a flow diagram illustrating an example method to encode a recovery map into a bilateral grid, according to some implementations.
- Fig. 5 is a flow diagram illustrating an example method to decode an image of a backward-compatible high dynamic range image format and display an HDR image, according to some implementations.
- FIGs. 6-9 are illustrations of example images representing a high dynamic range (Fig. 6) and a low dynamic range (Figs. 7-9), according to some implementations.
- Fig. 10 is a block diagram of an example computing device which may be used to implement one or more features described herein.
- This disclosure relates to backward-compatible HDR image formats.
- the image formats provide a container from which both low dynamic range (LDR) and high dynamic range (HDR) images can be obtained and displayed.
- the image formats can be used to display LDR versions of images by LDR display devices, and can be used to display HDR versions of the images by HDR display devices.
- an image container is created that implements an image format.
- An image of lower dynamic range (e.g., an LDR image) and a corresponding image of greater (e.g., higher) dynamic range (e.g., an HDR image) are obtained.
- in some examples, the HDR image is captured by a camera and the LDR image is created from the HDR image, e.g., using tone mapping or another process.
- the LDR image can be in a standard image format, such as JPEG or other format.
- a recovery map is generated based on the HDR and LDR images that encodes differences (e.g., gains) in luminances between portions (e.g., pixels or pixel values) of the LDR image and corresponding portions (e.g., pixels or pixel values) of the HDR image, where the differences are scaled by a range scaling factor that includes a ratio of a maximum luminance of the HDR image to a maximum luminance of the LDR image.
- the recovery map can be a scalar function, and encodes the luminance gains in a logarithmic space, where recovery map values are proportional to the difference of logarithms of the luminances divided by a logarithm of the range scaling factor.
- the recovery map can be encoded into a data structure or image representation that can reduce the storage space required to store the recovery map.
- the data structure can be a bilateral grid, in which a three- dimensional data structure of grid cells is used.
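- As a minimal sketch (Python/NumPy; the cell counts, guide choice, and simple averaging scheme are illustrative, not a normative encoding), per-pixel recovery values can be accumulated into a three-dimensional grid whose third axis bins the LDR luminance used as a guide:

```python
import numpy as np

def build_bilateral_grid(recovery_map, guide_luma, grid_w=32, grid_h=32, luma_bins=8):
    """Average per-pixel recovery values into a 3-D grid of cells.

    recovery_map: HxW array of recovery values (e.g., roughly -1..1).
    guide_luma:   HxW array of LDR luminance in 0..1, used as the third axis,
                  so pixels of different brightness that share a spatial cell
                  keep separate recovery values (helping preserve edges).
    Returns grid_sum / grid_count, with NaN where a cell received no pixels.
    """
    h, w = recovery_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gx = np.clip((xs * grid_w) // w, 0, grid_w - 1)          # spatial cell x
    gy = np.clip((ys * grid_h) // h, 0, grid_h - 1)          # spatial cell y
    gz = np.clip((guide_luma * luma_bins).astype(int), 0, luma_bins - 1)

    grid_sum = np.zeros((grid_h, grid_w, luma_bins))
    grid_cnt = np.zeros((grid_h, grid_w, luma_bins))
    np.add.at(grid_sum, (gy, gx, gz), recovery_map)
    np.add.at(grid_cnt, (gy, gx, gz), 1.0)
    with np.errstate(invalid="ignore"):
        return grid_sum / grid_cnt
```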
- the recovery image and the LDR image are provided in an image container that can be stored, for example, as a new HDR image format.
- the image container can be read and processed by a device to display an output image that is the LDR image or is an output HDR image (e.g., a derived image).
- a device that is to read and display the image stored in the described image format may use an LDR display device (e.g., display screen) that can display LDR images, e.g., it has a display dynamic range that can only display images up to the low dynamic range of LDR images, and cannot display greater dynamic ranges of HDR images.
- the device accesses only the LDR image in the image container and ignores the recovery image. Since the LDR image is in a standard format, the device can readily display the image.
- the LDR image may still be displayed by a device that does not implement code to detect or apply the recovery image.
- if an accessing device can use a display device that is capable of displaying HDR images (e.g., can display a greater luminance range in its output than the maximum luminance of LDR images), the device accesses the LDR image and recovery map in the image container, and applies the luminance gains encoded in the recovery map to pixels of the LDR image to determine respective corresponding pixel values of an output HDR image that is to be displayed.
- the device scales the pixel luminances of the LDR image based on a luminance output of the HDR display device (e.g., a maximum luminance output of the display device) as well as on the luminance gains stored in the recovery map.
- the output HDR image is displayed on the HDR display device with a high dynamic range.
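- A minimal decode-side sketch (Python/NumPy), assuming a linear LDR luminance in 0..1, a full-resolution single-channel recovery map, and a display boost expressed relative to LDR white; the names and the exact weighting are illustrative, not prescribed by the format:

```python
import numpy as np

def apply_recovery_map(ldr_linear_luma, recovery_map, range_scaling_factor,
                       display_boost):
    """Scale LDR pixel luminances into output HDR luminances.

    ldr_linear_luma:      HxW linear LDR luminance in 0..1.
    recovery_map:         HxW log-space luminance gains (roughly -1..1).
    range_scaling_factor: max HDR luminance / max LDR luminance (from metadata).
    display_boost:        maximum luminance the display can show, relative to
                          LDR white.  The applied boost is capped so the output
                          never exceeds the lesser of the display capability
                          and the original HDR range.
    """
    boost = min(display_boost, range_scaling_factor)
    per_pixel_gain = np.exp(recovery_map * np.log(boost))   # linear gain per pixel
    return ldr_linear_luma * per_pixel_gain
```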
- an HDR image and a recovery map can be stored in the image container and the recovery map can be applied to the HDR image to obtain an LDR image having a lower dynamic range, which has a high visual quality due to being locally tone mapped via the recovery map.
- Described features provide several technical advantages, including enabling efficient storage and high quality display of images with high dynamic range or images with lower (e.g., standard) dynamic range by a wide range of devices and using a single image format.
- an image format is provided that allows any device to display an image having a dynamic range appropriate to its display capability.
- the images displayed from these formats have no loss of visual quality (especially in their local contrast, e.g., detail) from converting from one dynamic range to another dynamic range.
- the output HDR image generated from the image container can be a lossless, exact version of the original HDR image that was used to create the image container, or can be a visually very similar version of the original HDR image, due to the recovery map storing the luminance information.
- Described features can include using a recovery map to provide an output HDR image derived from an LDR image.
- the recovery map can include values based on a luminance gain ratio that is scaled by a range scaling factor, where the factor is a ratio of the maximum luminance of the original HDR image to the maximum luminance of the LDR image.
- the recovery map indicates a relative amount to scale each pixel
- the range scaling factor indicates a specific amount of scaling to be performed, which provides context to the relative values in the recovery map (e.g., extending the range of the amount that pixels may be scaled for the particular image).
- the range scaling factor is advantageous in that it enables efficient, precise, and compact specification of recovery map values in a normalized range, which can then be efficiently adjusted for a particular output dynamic range using the range scaling factor.
- the use of the range scaling factor makes efficient use of all the bits used to store the recovery map, thus allowing encoding of a larger variety of images accurately.
- the pixel luminances of the LDR image are also scaled based on a display factor.
- the display factor can be based on a luminance output of the HDR display device that is to display the output image. This allows an HDR image having any arbitrary dynamic range above LDR to be generated from the container based on the display capability of the display device.
- the dynamic range of the output HDR image is not constrained to any standard high dynamic range nor to the dynamic range of the original HDR image. Because HDR display devices may vary considerably as to how bright they can display images, a benefit of this feature is that an image can be displayed with a quality rendition on display devices of any dynamic range.
- described features allow changes in dynamic range (e.g., above LDR) in the output HDR image in real time based on user input or other conditions, e.g., to reduce viewing strain or fatigue in a user or for other applications.
- the display scaling allows an output HDR image to have an arbitrary dynamic range above the range of the LDR image.
- prior techniques may encode an HDR image for a display with a specific and fixed dynamic range or maximum brightness. Such a technique cannot take advantage of an HDR display device that has a greater dynamic range or brightness than the encoded dynamic range.
- prior techniques may produce a lower quality image if the HDR display device is not capable of displaying as high a dynamic range as encoded in the original HDR image; for example, techniques such as rolloff curves may be required to reduce the dynamic range of the HDR image to display on the HDR display device, which, for example, often reduces local contrast in the image in undesirable ways, reducing the visual quality of the image.
- Described features provide shareable, backwards-compatible mechanisms to create and render images that contain high dynamic range (HDR) content, beyond current formats.
- Described image formats can enable the next generation of consumer HDR content, and also allow supporting professional image workflows, in some implementations without the need for specialized hardware codecs for decoding/encoding.
- one or more described implementations of HDR formats address issues with current HDR transfer functions that assume a global tone-mapping.
- described features enable local tone-mapping, which is better at preserving details in scenes with varying levels of brightness.
- the described HDR formats can produce higher fidelity and quality HDR rendered content compared with traditional HDR formats.
- the image format allows a single LDR image to be stored with a recovery map that has low storage requirements, thus saving considerable storage space over a format that stores multiple full images.
- some implementations include compressing the recovery map, and in some of these examples, the recovery map can be encoded into a bilateral grid. Such a grid can store the recovery map with a significant reduction in required storage space, and allows an output HDR image to be provided from the recovery map with almost no loss in visual quality compared to the original HDR image.
- a technical effect of one or more described implementations is that devices can display images having a dynamic range that better corresponds to the dynamic range of a particular output display device that is being used to display the images, as compared to prior systems. For example, such a prior system may provide an image that does not have a dynamic range that corresponds to the dynamic range of a display device, resulting in poorer image display quality.
- Features described herein can reduce such disadvantages by, e.g., providing a recovery map in the image format, and/or scaling image output, to set the dynamic range of an image to better suit a particular display device.
- a technical effect of one or more described implementations is that devices expend fewer computational resources to obtain results.
- a technical effect of described techniques is a reduction in the consumption of system processing resources and/or storage resources as compared to prior systems that do not provide one or more of the described techniques or features.
- a prior system may require a full LDR image, and a full HDR image to be stored and/or provided in order to display an image that has an appropriate dynamic range for a particular display device, which requires additional storage and communication bandwidth resources in comparison to described techniques.
- a prior system may only store an HDR image, and then rely on tone-mapping the HDR image to a lower dynamic range for LDR display; such tone mapping may reduce visual quality in representing the HDR image on a LDR display for a large variety of HDR images.
- dynamic range relates to a ratio between the brightest and darkest parts of a scene.
- the dynamic range of LDR formats typically does not exceed a particular low range, such as a standard dynamic range (SDR) file in a low range color space/color profile.
- the displayed dynamic range output of LDR display devices is low.
- an image with a dynamic range greater (or higher) than that of an LDR image (e.g., JPEG) is considered an HDR image, and a display device capable of displaying that greater dynamic range is considered an HDR display device.
- An HDR image can store pixel values that span a greater tonal range than LDR.
- an HDR image can more accurately represent the dynamic range of a real-world scene, and/or may have a dynamic range that is lower than that of the scene but still greater than the dynamic range of an LDR image (e.g., HDR images may commonly have greater than 8 bits per color channel).
- as used herein, "log" refers to a logarithm of a particular base, which can be any base, e.g., all logs herein can be base 2, base 10, etc.
- a user may be provided with controls allowing the user to make an election as to both if and when systems, programs, or features described herein may enable collection of user information (e.g., images from a user’s library, social network, social actions, or activities, profession, a user’s preferences, a user’s current location, a user’s messages, or characteristics of a user’s device), and if the user is sent content or communications from a server.
- certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed.
- a user’s identity may be treated so that no personally identifiable information can be determined for the user, or a user’s geographic location may be generalized where location information is obtained (such as to a city, ZIP code, or state level), so that a particular location of a user cannot be determined.
- the user may have control over what information is collected about the user, how that information is used, and what information is provided to the user.
- Fig. 1 illustrates a block diagram of an example network environment 100, which may be used in some implementations described herein.
- network environment 100 includes one or more server systems, e.g., server system 102 in the example of Fig. 1, and a plurality of client devices, e.g., client devices 120-126, each associated with a respective user of users U1-U4.
- server system 102 and client devices 120-126 may be configured to communicate with a network 130.
- Server system 102 can include a server device 104 and an image database 110.
- server device 104 may provide image application 106a.
- a letter after a reference number, e.g., "106a," represents a reference to the element having that particular reference number, while a reference number in the text without a following letter, e.g., "106," represents a general reference to implementations of the element bearing that reference number.
- Image database 110 may be stored on a storage device that is part of server system 102.
- image database 110 may be implemented using a relational database, a key-value structure, or other type of database structure.
- image database 110 may include a plurality of partitions, each corresponding to a respective image library for each of users 1-4.
- image database 110 may include a first image library (image library 1, 108a) for user 1, and other image libraries (image library 2, ..., image library n) for various other users.
- While Fig. 1 shows a single image database 110, it may be understood that image database 110 may be implemented as a distributed database, e.g., over a plurality of database servers.
- While Fig. 1 shows a plurality of partitions, one for each user, in some implementations, each image library may be implemented as a separate database.
- Image library 108a may store a plurality of images (including videos) associated with user 1, metadata associated with the plurality of images, and one or more other database fields, stored in association with the plurality of images. Access permissions for image library 108a may be restricted such that user 1 can control how images and other data in image library 108a may be accessed, e.g., by image application 106, by other applications, and/or by one or more other users. Server system 102 may be configured to implement the access permissions, such that image data of a particular user is accessible only as permitted by the user.
- An image as referred to herein can include a digital image having pixels with one or more pixel values (e.g., color values, brightness values, etc.).
- An image can be a still image (e.g., still photos, images with a single frame, etc.), a dynamic image (e.g., animations, animated GIFs, cinemagraphs where a portion of the image includes motion while other portions are static, etc.), or a video (e.g., a sequence of images or image frames that may optionally include audio).
- An image as used herein may be understood as any of the above. For example, implementations described herein can be used with still images (e.g., a photograph, or other image), videos, or dynamic images.
- Network environment 100 can include one or more client devices, e.g., client devices 120, 122, 124, and 126, which may communicate with each other and/or with server system 102 via network 130.
- Network 130 can be any type of communication network, including one or more of the Internet, local area networks (LAN), wireless networks, switch or hub connections, etc.
- network 130 can include peer-to-peer communication between devices, e.g., using peer-to-peer wireless protocols (e.g., Bluetooth®, Wi-Fi Direct, etc.), etc.
- peer-to-peer communication between two client devices 120 and 122 is shown by arrow 132.
- users 1, 2, 3, and 4 may communicate with server system 102 and/or each other using respective client devices 120, 122, 124, and 126.
- users 1, 2, 3, and 4 may interact with each other via applications running on respective client devices and/or server system 102 and/or via a network service, e.g., a social network service or other type of network service, implemented on server system 102.
- respective client devices 120, 122, 124, and 126 may communicate data to and from one or more server systems, e.g., server system 102.
- the server system 102 may provide appropriate data to the client devices such that each client device can receive communicated content or shared content uploaded to the server system 102 and/or a network service.
- users 1-4 can interact via image sharing, audio or video conferencing, audio, video, or text chat, or other communication modes or applications.
- a network service implemented by server system 102 can include a system allowing users to perform a variety of communications, form links and associations, upload and post shared content such as images, text, audio, and other types of content, and/or perform other functions.
- a client device can display received data such as content posts sent or streamed to the client device and originating from a different client device via a server and/or network service (or from the different client device directly), or originating from a server system and/or network service.
- client devices can communicate directly with each other, e.g., using peer-to-peer communications between client devices as described above.
- a “user” can include one or more programs or virtual entities, as well as persons that interface with the system or network.
- any of client devices 120, 122, 124, and/or 126 can provide one or more applications.
- client device 120 may provide image application 106b.
- Client devices 122-126 may also provide similar applications.
- Image application 106b may be implemented using hardware and/or software of client device 120.
- image application 106b may be a standalone client application, e.g., executed on any of client devices 120-124, or may work in conjunction with image application 106a provided on server system 102.
- Image application 106 may provide various features, implemented with user permission, that are related to images (including videos). For example, such features may include one or more of capturing images using a camera, modifying the images, determining image quality (e.g., based on factors such as face size, blurriness, number of faces, image composition, lighting, exposure, etc.), storing images in an image library 108, encoding and decoding images and videos into any of various image and video formats (including formats described herein), providing user interfaces to view displayed images or image-based creations or compilations, etc.
- Client device 120 may include an image library 108b of user 1, which may be a standalone image library.
- image library 108b may be usable in combination with image library 108a on server system 102.
- image library 108a and image library 108b may be synchronized via network 130.
- image library 108 may include a plurality of images associated with user 1, e.g., images captured by the user (e.g., using a camera of client device 120, or other device), images shared with the user 1 (e.g., from respective image libraries of other users 2-4), images downloaded by the user 1 (e.g., from websites, from messaging applications, etc.), screenshots, and other images.
- image library 108b on client device 120 may include a subset of images in image library 108a on server system 102. For example, such implementations may be advantageous when a limited amount of storage space is available on client device 120.
- client device 120 and/or server system 102 may include other applications (not shown) that may be applications that provide various types of functionality.
- a user interface on a client device 120, 122, 124, and/or 126 can enable the display of user content and other content, including images, image-based creations, data, and other content as well as communications, privacy settings, notifications, and other data.
- Such a user interface can be displayed using software on the client device, software on the server device, and/or a combination of client software and server software executing on server device 104, e.g., application software or client software in communication with server system 102.
- the user interface can be displayed by a display device of a client device or server device, e.g., a touchscreen or other display screen, projector, etc.
- application programs running on a server system can communicate with a client device to receive user input at the client device and to output data such as visual data, audio data, etc. at the client device.
- Fig. 1 shows one block for server system 102, server device 104, image database 110, and shows four blocks for client devices 120, 122, 124, and 126.
- Server blocks 102, 104, and 110 may represent multiple systems, server devices, and network databases, and the blocks can be provided in different configurations than shown.
- server system 102 can represent multiple server systems that can communicate with other server systems via the network 130.
- server system 102 can include cloud hosting servers, for example.
- image database 110 may be stored on storage devices provided in server system block(s) that are separate from server device 104 and can communicate with server device 104 and other server systems via network 130.
- Each client device can be any type of electronic device, e.g., desktop computer, laptop computer, portable or mobile device, cell phone, smartphone, tablet computer, television, TV set top box or entertainment device, wearable devices (e.g., display glasses or goggles, wristwatch, headset, armband, jewelry, etc.), personal digital assistant (PDA), media player, game device, etc.
- network environment 100 may not have all of the components shown and/or may have other elements including other types of elements instead of, or in addition to, those described herein.
- implementations of features described herein can use any type of system and/or service. For example, other networked services (e.g., services connected to the Internet) can be used instead of or in addition to a social networking service.
- Any type of electronic device can make use of features described herein.
- Some implementations can provide one or more features described herein on one or more client or server devices disconnected from or intermittently connected to computer networks.
- a client device including or connected to a display device can display content posts stored on storage devices local to the client device, e.g., received previously over communication networks.
- Fig. 2 is a flow diagram illustrating an example method 200 to encode an image in a backward-compatible high dynamic range image format, e.g., an HDR image format that has low dynamic range compatibility, according to some implementations.
- method 200 can be performed, for example, on a server system 102 as shown in Fig. 1.
- some or all of the method 200 can be implemented on one or more client devices such as client devices 120, 122, 124, or 126 of Fig. 1, one or more server devices such as server device 104 of Fig. 1, and/or on both server device(s) and client device(s).
- the implementing system includes one or more digital processors or processing circuitry ("processors"), and one or more storage devices (e.g., a database or other storage).
- different components of one or more servers and/or clients can perform different blocks or other parts of the method 200.
- a device is described as performing blocks of method 200.
- Some implementations can have one or more blocks of method 200 performed by one or more other devices (e.g., other client devices or server devices) that can send results or data to the first device.
- the method 200 can be initiated automatically by a system.
- the method (or portions thereof) can be performed periodically, or can be performed based on one or more particular events or conditions, e.g., a client device launching image application 106, capture of new images by an image capture device of a client device, reception of images over a network by a device, upload of new images to a server system 102, a predetermined time period having expired since the last performance of method 200, and/or one or more other conditions occurring which can be specified in settings read by the method.
- User permissions can be obtained to use user data in method 200 (blocks 210- 220).
- user data for which permission is obtained can include images stored on a client device (e.g., any of client devices 120-126) and/or a server device, image metadata, user data related to the use of an image application, other image-based creations, etc.
- the user is provided with options to selectively provide permission to access all, any subset, or none of the user data. If user permission is insufficient for particular user data, method 200 can be performed without use of that user data, e.g., using other data (e.g., images not having an association with the user).
- Method 200 may begin at block 210.
- a high dynamic range (HDR) image is obtained by the device (an “original HDR image”), the HDR image depicting a particular scene.
- the HDR image is provided in any of multiple standard formats for high dynamic range images.
- the HDR image can be captured by an image capture device, e.g., a camera of a client device or other device.
- the HDR image is received over a network from a different device, or obtained from storage accessible by the device.
- the HDR image can be generated based on any of multiple image generation techniques, e.g., ray tracing, etc.
- a low dynamic range (LDR) image is obtained by the device (e.g., an “original LDR image”), the LDR image depicting the same scene and/or subject matter as the HDR image, e.g., corresponds to the HDR image in depicted subject matter.
- the LDR image has a lower dynamic range than the HDR image.
- the LDR image can have a dynamic range of a standard LDR image format, or can have any lower dynamic range than the HDR image.
- the LDR image can be provided in a standard image format, e.g., JPEG, AV1 Image File Format (AVIF), TIFF, etc.
- the LDR image has a dynamic range that is provided in 8 bits per pixel per channel
- the HDR image has a dynamic range above the dynamic range of the LDR image (e.g., commonly a bit depth that is 10 bits or greater, a wide-gamut color space, and HDR-oriented transfer functions).
- the HDR image of block 210 can be obtained before or can be obtained after the LDR image is obtained.
- an HDR image is obtained, and the LDR image is derived from the HDR image.
- the LDR image can be generated based on the HDR image using local tone mapping or another process. Local tone mapping, for example, changes each pixel according to the local features of the image around that pixel (rather than according to a global tone mapping function that changes every pixel in the same way). Any of a variety of local tone mapping techniques can be used to reduce tonal values within the HDR image so that they are appropriate for the LDR image having a lower dynamic range; a simple illustrative sketch is shown after this list.
- local Laplacian filtering can be used in the tone mapping process, and/or other techniques can be used.
- a different global tone-curve (or digital gain) can be used in different areas of the image.
- different tone mapping techniques can be used for different types of image data, e.g., different techniques for still images and for video images.
- exposure fusion techniques can be used to produce the LDR image from the HDR image and/or from multiple captured or generated images, e.g., combining portions of multiple images captured or synthetically generated at different exposure levels, into a combined LDR image that includes these portions.
- a combination of tone mapping and exposure fusion techniques can be used to produce the LDR image (e.g., use Laplacian tone mapping pyramids from different images having different exposure levels, provide a weighted blend of the pyramids, and collapse the blended pyramid).
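- For illustration only, the sketch below (Python/NumPy/SciPy) derives an LDR luminance from an HDR luminance with a very simple local operator that compresses a blurred log-luminance base layer while preserving local detail; the patent does not mandate this operator, and the local Laplacian filtering or exposure fusion approaches mentioned above may be used instead:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simple_local_tone_map(hdr_luma, target_range=50.0, sigma=16, eps=1e-6):
    """Compress HDR luminance into an LDR range while keeping local contrast.

    hdr_luma:     HxW linear HDR luminance; values may greatly exceed 1.0.
    target_range: desired ratio between brightest and darkest output values.
    The low-frequency (base) layer is compressed in log space; the
    high-frequency (detail) layer is added back unchanged.
    """
    log_l = np.log(hdr_luma + eps)
    base = gaussian_filter(log_l, sigma)                  # low-frequency brightness
    detail = log_l - base                                 # local contrast to preserve
    base_range = max(base.max() - base.min(), eps)
    scale = min(np.log(target_range) / base_range, 1.0)   # never expand the range
    out = np.exp(base * scale + detail)
    return np.clip(out / out.max(), 0.0, 1.0)             # normalize to 0..1 LDR
```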
- the LDR image can be obtained from other sources.
- the HDR image can be captured by a camera device and the LDR image can be captured by the same camera device, e.g., immediately before or after, or at least partially simultaneously with, the capture of the HDR image.
- the camera device can capture the HDR image and LDR image based on camera settings, e.g., with dynamic ranges indicated in the camera settings or user-selected settings.
- Block 212 may be followed by block 214.
- a recovery map is generated based on the HDR image and the LDR image.
- the recovery map encodes differences in luminances between portions (e.g. pixels) of the HDR image and corresponding portions (e.g., pixels) of the LDR image.
- the recovery map is to be used to convert luminances of the LDR image into luminances of the HDR image.
- luminance gains and a range scaling factor can be encoded in the recovery map, so that applying the luminance gains to luminance values of individual pixels of the LDR image results in corresponding pixels of the HDR image. Examples of generating a recovery map are described in greater detail below with reference to Fig. 3.
- Block 214 may be followed by block 216.
- the recovery map can be encoded into a recovery element.
- the recovery element can be an image, data structure, algorithm, neural network, etc. that includes or provides the recovery map information.
- the recovery element can be a recovery image that is compressed into a standard format and has the same aspect ratio as an aspect ratio of the LDR image.
- the recovery image can have any resolution, e.g., the same or a different resolution than the LDR image.
- the recovery image can have a smaller resolution than the LDR image, e.g., one-quarter the resolution of the LDR image in each dimension (e.g., a 480x270 recovery image for a 1920x1080 LDR image).
- the recovery image can be encoded as single channel 8-bit unsigned integer values, where each value represents a recovery value and is stored in one pixel of the recovery image.
- the encoding can result in a continuous representation from -2.0 to 2.0 that can be compressed, e.g., JPEG compressed.
- the encoding can include 1 bit to indicate if the map value is positive or negative, and the rest of the bits are interpreted as the magnitude of the value in the range of recovery map values (e.g., -1 to +1).
- the magnitude representation can exceed 1.0 and -1.0, since some pixels may require greater attenuation or gain than represented by the range scaling factor to accurately recover or represent the original HDR image luminance range.
- the channel can specify an adjustment to make to a target image based on luminance (brightness).
- chromaticity or hue of the target image would not be changed during the application of the recovery map to that image.
- Single channel encoding can preserve existing chromaticity or hue of the target image (e.g., preserve RGB ratios), and typically requires less storage space than multi-channel encoding.
- the recovery image can be encoded as multi-channel values. Multi-channel encoding can allow adjustment of the colors (e.g., chromaticity or hue) of the target image during the application of the recovery map to the target image.
- Multi-channel encoding can compensate for loss of colors that may have occurred in the LDR image, e.g., rolloff of color at high brightness. This can allow recovery of HDR luminance as well as correct color (for example, instead of only increasing the brightness that causes the sky to look white, also resaturating the sky to a blue hue).
- a multi-channel encoded recovery map can be used to recover a wider color gamut (or compensate for a loss of color gamut) in the target image.
- an ITU-R Recommendation BT.2020 gamut can be recovered from a standard RGB (sRGB)/BT.709 gamut LDR image.
- in some implementations, a different bit-depth (e.g., a bit-depth greater than 8-bit) can be used.
- 8-bit depth may be a minimum bit-depth to allow for HDR representation in combination with an 8-bit single channel gain map, where a lower bit depth may not provide enough information to provide HDR image content without artifacts such as banding.
- the recovery image can be encoded as floating point values instead of integers.
- the recovery element can be a data structure, algorithm, neural network, etc. that encodes the values of the recovery map.
- the recovery map can be encoded in weights of a small neural network that is used as the recovery element.
- Block 216 may be followed by block 218.
- the LDR image and the recovery element are provided in an image container that can be stored as a new HDR image format.
- the LDR image can be considered a base image in the image container.
- the image container can be read by a device to display the LDR image or HDR image.
- the image container can be a standard image container, e.g., an AVIF (AV1 Image File Format) image container.
- the recovery map can be quantized into a precision of the image container into which the recovery map is to be placed.
- the values of the recovery map can be quantized into an 8-bit format for storage.
- each value can represent a recovery value and is stored in one pixel of a recovery image.
- This encoding results in a representation in a range from -2.0 to 2.0, with each value quantized to one of the 256 possible encoded values, for example.
- Other ranges and quantizations can be used in other implementations, e.g., -1.0 to 1.0 or other ranges.
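- A minimal sketch of this quantization (Python/NumPy), assuming a simple linear mapping of the continuous -2.0..2.0 range onto the 256 available 8-bit codes; actual implementations may use the sign/magnitude layout described earlier or other mappings, and the function names are illustrative:

```python
import numpy as np

def quantize_recovery(values, lo=-2.0, hi=2.0):
    """Map continuous recovery values in [lo, hi] to 8-bit codes 0..255."""
    clipped = np.clip(values, lo, hi)
    return np.round((clipped - lo) / (hi - lo) * 255.0).astype(np.uint8)

def dequantize_recovery(codes, lo=-2.0, hi=2.0):
    """Recover approximate continuous values from the 8-bit codes."""
    return codes.astype(np.float32) / 255.0 * (hi - lo) + lo
```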
- the image container can include additional information that relates to the display of an output HDR image that is based on the contents of the image container.
- the additional information can include metadata that is provided in the image container, in the recovery element, and/or in the LDR image.
- Metadata can encode information about how to present the LDR image, and/or an HDR image derived therefrom, on a display device.
- Such metadata can include, for example, the version of recovery map format used in the container, the range scaling factor used in the recovery map, a storage method (e.g., whether the recovery map is encoded in a bilateral grid or other data structure, or is otherwise compressed by a specified compression technique), resolution of the recovery map and/or recovery image, guide weights for bilateral grid storage, and/or other recovery map properties or image properties.
- the LDR image can include a metadata data structure or directory that defines the order and properties of the items (e.g., files) in the image container, and each file in the container can have a corresponding media item in the data structure.
- the media item can describe the location of the associated file in the image container and basic properties of the associated file.
- a container element can be encoded into metadata of the LDR image, where the element defines the format version and a data structure of media items in the container.
- metadata is stored according to a data model that allows the LDR image to be read by devices or applications that do not support the metadata and cannot read it.
- the metadata can be stored in the recovery image and/or the LDR image according to the Extensible Metadata Platform (XMP) data model.
- Block 218 may be followed by block 220
- the image container can be stored, e.g., in storage for access by one or more devices.
- the image container can be sent to one or more server systems over a network to be made accessible to multiple client devices that can access the server systems.
- a server device can save the image container in cloud storage (or other storage) that includes the additional information such as metadata.
- the server can store the LDR image in any format in the container.
- the server device can determine which image data to serve to the requesting client device based on system settings, user preferences, characteristics of the client device such as type of client device (e.g., mobile client device, desktop computer client device), display device characteristics of the client device such as dynamic range, resolution, etc.
- the server device can serve the image container to the client device, and the client device can decode the image container to obtain an image to display (base image or derived image) in the appropriate dynamic range for the client’s display device.
- the server device can extract and decode an image from the image container and serve the image to the client device, where the served image has an appropriate dynamic range for the client’s display device as determined by the server device.
- the server device can determine the base image and derived image from the image container and serve both images to the client device, and the client device can select which of these images to display based on the dynamic range of its display device.
- in some implementations, multiple recovery maps (e.g., recovery elements) can be included in the image container.
- Each recovery map can include different values to provide one or more different characteristics to an output image that is based on the recovery map as described herein. For example, a particular one of the multiple recovery maps can be selected and applied to provide an output image having characteristics based on the selected recovery map and, e.g., thus may be more suited to a particular use or application than the other recovery map(s) in the image container.
- some or all of the multiple recovery maps can be associated with particular and different physical display device characteristics of a target output display device.
- a user may select to use different recovery maps to display LDR or HDR images on display devices with different dynamic ranges (e.g., different peak brightnesses) or different color gamuts, or when using different image decoders.
- a respective range scaling factor can be stored in the image container to use with each of these recovery maps.
- one or more of the recovery elements can be associated with stored indications of particular display device characteristics of a target display device, as described below with respect to Fig. 5, to enable selection of one of the recovery map tracks based on a particular target display device being used for display.
- a single range scaling factor can be used with multiple recovery maps.
- a universal range scaling factor can be used for multiple recovery maps that are respectively associated with multiple images stored in the image container.
- additional metadata can be stored in the image container that indicates an intended usage of the recovery map.
- metadata can include indications of intended usage for each of multiple recovery maps. For example, if recovery map 1 is intended for mapping to LDR images, then a map, table, or index with a key of map 1 can indicate a value that is a list of specified output formats associated with the recovery map.
- the metadata can specify an indication of an output range (e.g., “LDR” or “HDR,” or HDR can be specified as a multiplier of LDR range).
- the metadata can specify a bit depth, e.g., “output metadata: bit_depth X,” where X can be 8, 10, 12, etc.
- the specified bit depth can be a minimum bit depth, e.g., “LDR” can indicate 8-bit minimum for output via opto-electronic transfer function (OETF) and 12-bit minimum for linear/extended range output; and “HDR” can indicate 10-bit minimum for output via OETF and 16-bit minimum for linear/extended range output.
- This type of metadata can also be used for multi-channel recovery maps, where the color space can change and the metadata can include an indication of the intended output color space.
- the metadata can be specified as “output metadata: bit_depth X, color_space,” where color_space can be, e.g., Rec. 709, Rec. 2020, etc.
- Fig. 3 is a flow diagram illustrating an example method 300 to generate a recovery map based on an HDR image and an LDR image, according to some implementations.
- method 300 can be implemented for block 214 of method 200 of Fig. 2, or can be performed in other implementations.
- an LDR image and (original) HDR image have been obtained as described with reference to Fig. 2.
- Method 300 may begin at block 302, in which a linear luminance can be determined for the LDR image.
- the original LDR image (e.g., obtained in block 212 of Fig. 2) may be a non-linear image; a linear version of the LDR image can be generated, e.g., by transforming the primary image color space of the non-linear LDR image to a linear version.
- a color space with a standard RGB (sRGB) transfer function is transformed to a linear color space that preserves the sRGB color primaries.
- a linear luminance is determined for the linear LDR image.
- a luminance (Y) function can be defined as:
- Yldr(x,y) = primary_color_profile_to_luminance(LDR(x,y)), where Yldr is the low dynamic range image linear luminance defined on a range of 0.0 to 1.0, and primary_color_profile_to_luminance is a function that converts the primary colors of an image (the LDR image) to the linear luminance value Y for each pixel of the LDR image at coordinates (x,y).
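- A minimal sketch of such a primary_color_profile_to_luminance function (Python/NumPy) for a linear image with sRGB/BT.709 primaries; other color profiles would use different luminance weights:

```python
import numpy as np

def primary_color_profile_to_luminance(linear_rgb):
    """Compute linear luminance Y from linear RGB with BT.709/sRGB primaries.

    linear_rgb: HxWx3 array of linear (not gamma-encoded) R, G, B values.
    Returns an HxW array of linear luminance values.
    """
    bt709_weights = np.array([0.2126, 0.7152, 0.0722])   # BT.709 luminance weights
    return linear_rgb @ bt709_weights
```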
- Block 302 may be followed by block 304.
- a linear luminance is determined for the HDR image.
- the original HDR image (e.g., obtained in block 210 of Fig. 2) can be a non-linear and/or three-channel encoded image (e.g., Perceptual Quantizer (PQ) encoded or hybrid-log gamma (HLG) encoded), and a three-channel linear version of the HDR image can be generated, e.g., by transforming the non-linear HDR image to a linear version.
- any other color space and color profile can be used.
- a linear luminance is determined for the linear HDR image.
- a luminance (Y) function can be defined as:
- Yhdr(x,y) = primary_color_profile_to_luminance(HDR(x,y)), where Yhdr is the high dynamic range image linear luminance defined on a range of 0.0 to the range scaling factor, and primary_color_profile_to_luminance is a function that converts the primary colors of an image (the HDR image) to the linear luminance value Y for each pixel of the HDR image at coordinates (x,y).
- Block 304 may be followed by block 306.
- In block 306, a pixel gain function is determined based on the linear luminances determined in blocks 302 and 304, e.g., the per-pixel ratio of the HDR luminance to the LDR luminance, pixel_gain(x,y) = Yhdr(x,y) / Yldr(x,y).
- Yhdr or Yldr can be zero, leading to potential issues in the equation above or when determining a logarithm as described below.
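- One illustrative way to handle such zero values is to add a small offset to both luminances before forming the ratio; the offset value below is an assumption for illustration, not a prescribed constant:

```python
def pixel_gain(y_hdr, y_ldr, eps=1.0 / 64.0):
    """Per-pixel luminance gain between the HDR and LDR renditions.

    A small offset (illustrative value only) is added to both luminances so that
    zero-valued pixels neither divide by zero here nor produce an undefined
    logarithm when the gain is later encoded in log space.
    """
    return (y_hdr + eps) / (y_ldr + eps)
```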
- Block 306 may be followed by block 308.
- In block 308, a range scaling factor is determined based on a ratio of a maximum luminance of the HDR image to a maximum luminance of the LDR image.
- the range scaling factor can be a ratio of a maximum luminance of the HDR image to a maximum luminance of the LDR image.
- the range scaling factor can also be referred to herein as a range compression factor, and/or a range expansion factor that is the multiplicative inverse of the range compression factor. For example, if the LDR image is determined from the HDR image, the range scaling factor can be derived from the amount that the total HDR range was compressed to produce the LDR image.
- this factor can indicate an amount that the highlights of the HDR image are lowered in luminance to map the HDR image to the LDR dynamic range, or an amount that the shadows of the HDR image are increased in luminance to map the HDR image to the LDR dynamic range.
- the range scaling factor can be a linear value that may be multiplied by the total LDR range to get the total HDR range in linear space. In some examples, if the range scaling factor is 3, then the shadows of the HDR image are to be boosted 3 times, or the highlights of the HDR image are to be decreased 3 times (or a blend of such shadow boosting and highlight decreasing are to be performed).
- the range scaling factor can be defined in other ways (e.g., by a camera application or other application, a user, or other content creator) to provide a particular visual effect that may change the appearance of an output image derived from the range scaling factor.
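- A minimal sketch of deriving the range scaling factor as the ratio of maximum luminances, assuming both luminance maps are already in the same linear space:

```python
import numpy as np

def compute_range_scaling_factor(y_hdr, y_ldr):
    """Ratio of the maximum HDR luminance to the maximum LDR luminance, in linear space."""
    return float(np.max(y_hdr)) / float(np.max(y_ldr))

# Example: with an LDR maximum of 1.0 and an HDR maximum of 3.0 the factor is 3.0,
# i.e., multiplying the total LDR range by 3 gives the total HDR range.
```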
- Block 308 may be followed by block 310.
- In block 310, a recovery map is determined that encodes the pixel gain function scaled by the range scaling factor.
- the recovery map is thus, via the pixel gain function, based on the two linear images containing the desired HDR image luminance (Yhdr), and the LDR image luminance (Yldr).
- the recovery map is a scalar function that encodes the normalized pixel gain in a logarithmic space, and is scaled by the range scaling factor (e.g., multiplied by the inverse of the range scaling factor).
- the recovery map values are proportional to the difference of logarithmic HDR and LDR luminances, divided by the log of the range scaling factor.
- recovery(x,y) can tend to be in a range of -1 to +1. Values below zero make pixels (in the LDR image) darker, for display on an HDR display device, while values above zero make the pixels brighter.
- these values can typically be in the range of approximately [0..1] (e.g., applying the recovery map holds shadows steady, and boosts highlights), so that the brightest area of an image can have values close to 1, and darker areas of the image can typically have values close to 0.
- these values can typically be in the range of approximately [-1..0] (e.g., applying the recovery map holds highlights steady, and re-darkens the shadows).
- these values can encompass the full [-1..1] range.
- the images are converted to, and processing is performed in, the logarithmic space for ease of operations.
- the processing can be performed in linear space via exponentiation and interpolating the exponents, which is mathematically equivalent.
- the processing can be performed in linear space without exponentiation, e.g., naively interpolate/extrapolate in linear space.
- the range scaling factor in general indicates an amount by which to increase pixels' brightness, and the map indicates that some pixels may be made brighter or darker.
- the recovery map indicates a relative amount to scale each pixel, and the range scaling factor indicates a specific amount of scaling to be performed.
- the range scaling factor provides normalization and context to the relative values in the recovery map. The range scaling factor enables making efficient use of all the bits used to store the recovery map, regardless of the amount of scaling to apply via the recovery map. This means a larger variety of images can be encoded accurately.
- every map may contain values in the range of, e.g., -1 to 1, and the values are given context by the range scaling factor, such that this range of values can represent maps that scale to a larger total range (e.g., 2 or 8) equally well.
- one image in the described format can have a range scaling factor of 2 and a second image can have a range scaling factor of 8. Without a range scaling factor, the maximum/minimum absolute gain values are decided in the map. If this value is, e.g., 2, then the second image with a range scale of 8 cannot be represented. If this value is, e.g., 8, then 2 bits of information are wasted for every pixel in the recovery map when storing the image with a range scale of 2.
- Such implementations without a range scaling factor may be more likely to have an (undesired) visual difference in the displayed output HDR image compared to the original HDR image used for encoding (e.g., a severe form of such a difference could be banding).
- the recovery function, when the pixel gain is 0.0, can be defined to be -2.0, which is the largest representable attenuation.
- the recovery function can be outside the range -1.0 to +1.0, since one or more areas or locations in the image may require greater scaling (attenuation or gain) than represented by the range scaling factor to recover the original HDR image.
- the range of -2.0 to +2.0 may be sufficient to provide such greater scaling (attenuation or gain).
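- Putting the above together, a sketch of the recovery map encoding in logarithmic space, with an illustrative small offset against zero luminances and an illustrative clamp to the -2.0 to +2.0 range discussed above:

```python
import numpy as np

def encode_recovery_map(y_hdr, y_ldr, range_scaling_factor, eps=1.0 / 64.0):
    """Normalized, log-space recovery map values.

    recovery(x, y) = (log(Yhdr + eps) - log(Yldr + eps)) / log(range_scaling_factor)

    Values near 0 leave a pixel's luminance unchanged, positive values brighten it
    toward the HDR rendition, and negative values darken it. Values are clamped to
    [-2.0, +2.0] so that areas needing more scaling than the range scaling factor
    remain representable (eps and the clamp limits are illustrative assumptions).
    """
    recovery = (np.log(y_hdr + eps) - np.log(y_ldr + eps)) / np.log(range_scaling_factor)
    return np.clip(recovery, -2.0, 2.0)
```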
- Block 310 may be followed by block 312.
- the recovery map can be encoded into a compressed form. This allows the recovery map to occupy reduced storage space.
- the compressed form can be any of a variety of different compressions, formats, etc.
- the recovery map is determined and encoded into a bilateral grid, as described in greater detail with respect to Fig. 4.
- the recovery map can be compressed or otherwise reduced in storage requirements based on one or more additional or alternative compression techniques.
- the recovery map can be compressed to a lower-resolution recovery image that has a lower resolution than the recovery map determined in block 310. This recovery image can occupy less storage space than the full-resolution recovery map.
- JPEG compression can be used in implementations that provide an LDR image in JPEG format, or other compression types can be used (e.g., Run Length Encoding (RLE)) or other image format compressions (e.g., HEIC, AVIF, PNG, etc.).
- Fig. 4 is a flow diagram illustrating an example method 400 to encode a recovery map into a bilateral grid, according to some implementations.
- method 400 can be implemented as blocks 310 and 312 of method 300 of Fig. 3, e.g., where a recovery map is determined based on luminances of an HDR image and an LDR image and is stored in compressed form, which in described implementations of Fig. 4 is a bilateral grid.
- the bilateral grid is a three-dimensional data structure of grid cells mapped to pixels of the LDR image based on guide weights, as described below.
- Method 400 may begin at block 402, in which a three-dimensional data structure of grid cells is defined.
- the three-dimensional data structure has a width, height, and depth in a number of cells, indicated as a size WxHxD, where the width and height correspond to width and height of the LDR image and the depth corresponds to a number of layers of cells of WxH.
- a cell in the grid is defined as the vector element at (width, height) and of length D, and is mapped to multiple LDR image pixels.
- the number of cells in the grid can be much less than the number of pixels in the LDR image.
- the width and height can be approximately 3% of the LDR image size and the depth can be 16.
- a 1920x1080 LDR image can have a grid of a size on the order of 64x36x16.
- Block 402 may be followed by block 404.
- a set of guide weights is defined.
- the set of guide weights can be defined as a multiple-element vector of floating point values, which can be used for generating and decoding a bilateral grid.
- the guide weights represent the weight of the respective elements of an input pixel of the LDR image, LDR(x,y), for determining the corresponding depth D for the cell that will map to that input pixel.
- For example, the guide weights can be defined as a vector guide_weights = {weight_r, weight_g, weight_b, weight_min, weight_max}, where weight_r, weight_g, and weight_b refer to the weights of the respective input pixel's color components (red, green, or blue), and weight_min and weight_max respectively refer to the weight of the minimum component value of the r, g, and b values and the weight of the maximum component value of the r, g, and b values (e.g., min(r,g,b) and max(r,g,b)).
- In some examples, e.g., where the LDR image is in the Rec. 709 color space, the guide weight values can weight the grid depth lookup by 50% for the pixel luminance (r, g, b values), by 12.5% for the minimum component value, and by 37.5% for the maximum component value.
- In some implementations, the guide weight values add up to 1.0, e.g., so that every grid cell can always be looked up (unlike when the values add up to less than 1.0), and the depth of a pixel cannot overflow beyond the final depth level in the grid (unlike when the values add up to more than 1.0).
- Guide weight values can be determined, for example, heuristically by having persons visually evaluate results. In some implementations, a different number of guide weights can be used, e.g., only three guide weights for the pixel luminance (r, g, b values).
- In block 406, the grid cells are mapped to pixels of the LDR image based on the guide weights.
- the guide weights are used to determine the corresponding depth D for the cell that will map to that input pixel.
- the width (x) and height (y) of grid cells are mapped to the pixels (x,y) of the LDR image.
- For example: grid_x = ldr_x / ldr_W * (grid_W - 1), and grid_y = ldr_y / ldr_H * (grid_H - 1)
- where grid_x and grid_y are the x and y locations of a grid cell, ldr_x and ldr_y are the x and y coordinates of a pixel of the LDR image being mapped to the grid cell, ldr_W and ldr_H are the total pixel width and height of the LDR image, and grid_W and grid_H are the total width and height cell dimensions of the bilateral grid, respectively.
- z_idx(x,y) = (D - 1) * (guide_weights · {r, g, b, min(r,g,b), max(r,g,b)}), where D is the total cell depth dimension of the grid and z_idx(x,y) is the depth in the bilateral grid for the cell mapped to the LDR image pixel (x,y).
- This equation determines the dot-product of the vector of guide weights and the vector of corresponding values for a particular pixel of the LDR image, and then scales the result by the size of the depth dimension in the grid to indicate the depth of the corresponding cell for that particular pixel.
- the guide weight multiplication result can be a value between 0.0 and 1.0 (inclusive), and this result is scaled to the various depth levels (buckets) of the grid. For example, if D is 16, a multiplication value of 0.0 results in mapping into bucket 0, a value of 0.5 results in mapping into bucket 7, and a value of 1.0 results in mapping into bucket 15 (which is the last bucket).
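- A sketch of the grid coordinate and depth-index mapping described above; the guide weight values shown are an illustrative split of the 50%/12.5%/37.5% weighting (with the 50% luminance weight divided equally across r, g, and b), not required values:

```python
import numpy as np

# Illustrative guide weights for {r, g, b, min(r,g,b), max(r,g,b)}: 50% spread equally
# over r, g, b, 12.5% for the minimum, 37.5% for the maximum. They sum to 1.0 so a
# depth lookup can never overflow the last depth level.
GUIDE_WEIGHTS = np.array([1.0 / 6, 1.0 / 6, 1.0 / 6, 0.125, 0.375])

def grid_xy(ldr_x, ldr_y, ldr_w, ldr_h, grid_w, grid_h):
    """Map an LDR pixel position (x, y) to the (x, y) location of its grid cell."""
    return ldr_x / ldr_w * (grid_w - 1), ldr_y / ldr_h * (grid_h - 1)

def z_idx(r, g, b, depth, weights=GUIDE_WEIGHTS):
    """Depth index of the grid cell mapped to an LDR pixel with components (r, g, b) in [0, 1]."""
    guide = np.array([r, g, b, min(r, g, b), max(r, g, b)])
    return (depth - 1) * float(np.dot(weights, guide))
```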
- Block 406 may be followed by block 408.
- In block 408, recovery map values are determined using solutions to a set of linear equations.
- Bilateral_grid_linear is the bilateral grid contents representing pixel gains in linear space.
- the coordinates x, y, and z_idx are the coordinates of the bilateral grid, which are different from the x,y coordinates of the LDR image.
- the guide weights determine the depth at which the pixel_gain is located in the bilateral grid for each pixel of the LDR image, and this informs how to set up the linear set of equations to solve in block 410.
- Block 408 may be followed by block 410.
- In block 410, the encoding is defined as the solution to the set of linear equations that minimizes, for each grid cell, an error based on the following values:
- the Yldr and Yhdr values are luminance values of the LDR image and the HDR image, respectively, at the (x,y) pixel position of the LDR image.
- the pixel_gain(x,y) values that minimize the error above are solved for, based on where each pixel is looked up into the depth vectors as indicated in the previous equation above; this is predetermined based on the guide weights.
- This definition specifies that the bilateral grid is defined for x from 0 to the total grid width, for y from 0 to the total grid height, and for z from 0 to the total grid depth. For example, if the bilateral grid is 64x36x16, then grid values are defined for x over 0 to 63 inclusive, for y over 0 to 35 inclusive, and for z over 0 to 15 inclusive. Block 410 may be followed by block 412.
- the solved values are transformed and stored in the bilateral grid.
- Bilateral_grid is the bilateral grid contents representing pixel gains in log space. These values are stored as an encoded recovery map. As indicated in these equations, the normalized pixel gain between the LDR image and the HDR image is encoded in a logarithmic space and scaled by the range scaling factor, similarly as described above with respect to block 310 of Fig. 3.
- each recovery map value is extracted from the bilateral grid.
- r, g, and b are the respective component values for a pixel LDR(x,y) of the LDR image.
- z_idx, as determined in the first equation, can be rounded to the nearest whole number before being input to bilateral_grid(x,y,z) in the second equation.
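- A sketch of a per-pixel lookup from the encoded bilateral grid, using nearest-cell indexing for simplicity (an implementation could instead interpolate between neighboring cells); the guide weights must match those used when the grid was built:

```python
import numpy as np

# Illustrative guide weights; must match those used to generate the bilateral grid.
GUIDE_WEIGHTS = np.array([1.0 / 6, 1.0 / 6, 1.0 / 6, 0.125, 0.375])

def recovery_from_grid(bilateral_grid, ldr_pixel, ldr_xy, ldr_size, weights=GUIDE_WEIGHTS):
    """Nearest-cell lookup of the recovery value for one LDR pixel.

    bilateral_grid: array of shape (grid_W, grid_H, D) holding log-space gains.
    ldr_pixel:      (r, g, b) components of LDR(x, y), each in [0, 1].
    ldr_xy:         (x, y) pixel coordinates; ldr_size: (width, height) of the LDR image.
    """
    grid_w, grid_h, depth = bilateral_grid.shape
    gx = int(round(ldr_xy[0] / ldr_size[0] * (grid_w - 1)))
    gy = int(round(ldr_xy[1] / ldr_size[1] * (grid_h - 1)))
    r, g, b = ldr_pixel
    guide = np.array([r, g, b, min(r, g, b), max(r, g, b)])
    z = int(round((depth - 1) * float(np.dot(weights, guide))))  # round z_idx to nearest level
    return bilateral_grid[gx, gy, z]
```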
- Fig. 5 is a flow diagram illustrating an example method 500 to decode and display an image of a backward-compatible high dynamic range image format, according to some implementations.
- method 500 can be performed, for example, on a server system 102 as shown in Fig. 1.
- some or all of the method 500 can be implemented on one or more client devices such as client devices 120, 122, 124, or 126 of Fig. 1, one or more server devices such as server device 104 of Fig. 1, and/or on both server device(s) and client device(s).
- the implementing system includes one or more digital processors or processing circuitry ("processors"), and one or more storage devices (e.g., a database or other storage).
- different components of one or more servers and/or clients can perform different blocks or other parts of the method 500.
- In some examples below, a first device is described as performing blocks of method 500.
- Some implementations can have one or more blocks of method 500 performed by one or more other devices (e.g., other client devices or server devices) that can send results or data to the first device.
- the method 500 can be initiated automatically by a system.
- the method (or portions thereof) can be performed periodically, or can be performed based on one or more particular events or conditions, e.g., a client device launching image application 106, capture or reception of new images by a client device, upload of new images to a server system 102, a predetermined time period having expired since the last performance of method 500, and/or one or more other conditions occurring which can be specified in settings read or received by the system performing the method.
- Method 500 can be performed by a different device than the device that created the image container.
- a first device can create the image container and store it in a storage location (e.g., a server device) that is accessible by a second device that is different than the first device.
- the second device can access the image container to display an image therefrom according to method 500.
- the application of recovery map to base image can be performed by a hardware decoder, and/or by a CPU and/or GPU of the second device.
- a single device can create the image container as described in method 200 and can decode and display the image container as described in method 500.
- a first device can decode an image from the image container based on method 500, and can provide the decoded image to a second device to display the image.
- For example, a first device (e.g., a server device in the cloud) can decode two images from the image container (e.g., a base image and a derived image based on the base image and recovery map) and send the two images to a second device, which selects and displays one of the images, e.g., based on the display device characteristics of the second device.
- User permissions can be obtained to use user data in method 500 (blocks 510- 534).
- user data for which permission is obtained can include images stored on a client device (e.g., any of client devices 120-126) and/or a server device, image metadata, user data related to the use of an image application, other image-based creations, etc.
- the user is provided with options to selectively provide permission to access all, any subset, or none of user data. If user permission is insufficient for particular user data, method 500 can be performed without use of that user data, e.g., using other data (e.g., images not having an association with the user).
- Method 500 may begin at block 510, in which an image container is obtained that includes an LDR image and a recovery map.
- the image container can be a container created by method 200 of Fig. 2, that generates a recovery map from an HDR image and an LDR image, where the HDR image has a greater dynamic range than the LDR image.
- the recovery map can be provided in the container as a recovery element, such as a recovery image.
- the image container can have a standardized format that is supported and recognized by the device performing method 500, e.g., a container for JPEG, custom JPEG markers (e.g., APP1 segment to store metadata in a JPEG/Exif file, or markers to store chunks of data), International Organization for Standardization Base Media File Format (ISOBMFF), a container for AV1 Image File Format (AVIF), HEIF (High Efficiency Image File Format), Multi-Picture Format (MPF) container, Extensible Metadata Platform (XMP) encoded data, etc.
- the recovery map can be stored as a secondary image according to a standard (e.g., AVIF, HEIF), or the recovery map can be stored as a new type of auxiliary image (e.g., define an AV1 Gain Map Image Item that signals to the reading device that the image item is a gain map and is used accordingly; and, e.g., defining the range scaling factor in a header).
- Block 510 may be followed by block 512.
- In block 512, the LDR image in the image container is extracted.
- this extraction can include decoding the LDR image from its standard file format (e.g., JPEG) to a raw image format or other format for use in display and/or processing.
- Block 512 may be followed by block 514.
- In block 514, it is determined whether to display an output HDR image. The output image is to be displayed on at least one target display device associated with or in communication with the device that is performing method 500 and accessing the image container.
- the target display device can be any suitable output display device, e.g., display screen, touchscreen, projector, display goggles or glasses, etc.
- An output HDR image to be displayed can have any dynamic range above the LDR image in the container, up to a dynamic range of the original HDR image used in creating the image container (e.g., in block 210 of Fig. 2).
- the determination of whether to display an output HDR image can be based on one or more characteristics of the target display device that is to display the output image. For example, the dynamic range of the target display device can determine whether to display an HDR image or the LDR image as the output image. In some examples, if the target display device is capable of displaying a greater dynamic range than the dynamic range of the LDR image in the image container, then it is determined to display an HDR image. If the target display device is capable only of displaying the low dynamic range of the LDR image in the container, then an HDR image is not to be displayed.
- user settings or preferences may indicate to display an image having a lower dynamic range, or the displayed image is to be displayed as an LDR image to be visually compared to other LDR images, etc.
- If an HDR image is determined not to be displayed in block 514, then the method proceeds to block 516, in which the LDR image extracted from the image container is caused to be displayed by the target display device.
- the LDR image is displayed by an LDR-compatible target output device that is not capable of displaying a greater dynamic range than the range of the LDR image.
- the LDR image can be output directly by the target display device, and the recovery map and any metadata pertaining to the HDR image format in the image container are ignored.
- the LDR image is provided in a standard format, and thus it is readable and displayable by any device that can read the standard format.
- If an HDR image is determined in block 514 to be displayed as the output image, then the method proceeds to block 518, in which the recovery element in the image container is extracted.
- Metadata related to display of an output HDR image based on the recovery element can also be obtained from the image container.
- the recovery element can be a recovery image as described above, and this recovery image can include and/or be accompanied by metadata as described above. Metadata can also or alternatively be included in the container and/or in the LDR image from the image container.
- multiple recovery maps can be included in the image container as described above.
- one of the multiple recovery elements can be selected in block 518 for processing below.
- the recovery element that is selected can be specified by user input or user settings, or in some implementations can be automatically selected (e.g., without current user input) by a device performing the processing, e.g., based on one or more characteristics of a target output display device or other output display component, where such characteristics can be obtained via an operating system call or other available source.
- one or more of the recovery elements can be associated with indications of particular display device characteristics that are stored in the image container (or are otherwise accessible to the device performing block 518). If the target display device is determined to have one or more particular characteristics (e.g., dynamic range, peak brightness, color gamut, etc.), the particular recovery element that is associated with those characteristics can be selected in block 518. If the multiple recovery elements are associated with different range scaling factors, the associated range scaling factor is also selected and retrieved. Block 518 may be followed by block 520.
- In block 520, the recovery map is optionally decoded from the recovery element.
- the recovery map may be in a compressed or other encoded form in the recovery element, and can be uncompressed or decoded.
- the recovery map can be encoded in a bilateral grid as described above, and the recovery map is decoded from the bilateral grid.
- Block 520 may be followed by block 522.
- In block 524, it is determined whether to scale the luminance of the extracted LDR image for the output HDR image. In some implementations, this scaling is based on the display output capability of the target display device. For example, in some implementations or cases, the generation of the output HDR image is based in part on a particular luminance that the target display device is capable of displaying. In some implementations or cases, this particular luminance can be the maximum luminance that the target display device is capable of displaying, such that the display device maximum luminance caps the luminance of the output HDR image. In some implementations or cases, in block 524 it is determined not to scale the luminance of the LDR image, e.g., the generation of the output HDR image does not take the luminance capability of the target display device into account.
- If it is determined in block 524 not to scale the luminance based on the target display device, the method proceeds to block 526, in which the luminance gains of the recovery map are applied to the pixels of the LDR image and the pixels are scaled based on the range scaling factor to determine corresponding pixels of the output HDR image.
- the luminance gains and scaling are applied to luminances of pixels of the LDR image to determine respective corresponding pixel values of the output HDR image that is to be displayed. This causes the output HDR image to have the same dynamic range as the original HDR image used to create the image container.
- the LDR image and range scaling factor are converted to a logarithmic space, and the following equation can be used:
- HDR*(x,y) = LDR*(x,y) + log(range_scaling_factor) * recovery(x,y)
- HDR*(x,y) is pixel (x,y) of the recovered HDR image (output image) in the logarithmic space
- LDR*(x,y) is the pixel (x,y) of the logarithmic space version of the extracted LDR image (e.g., log(LDR(x,y)))
- log(range_scaling_factor) is the range scaling factor in the logarithmic space
- recovery(x,y) is the recovery map value at pixel (x,y).
- recovery(x,y) is the normalized pixel gain in the logarithmic space.
- In effect, interpolation is provided between the LDR and HDR versions of the image when the per-pixel gain is applied, e.g., recovery(x,y) * range_scaling_factor. This is essentially interpolating between the original version and the range-compressed version of an image, and the interpolation amount is dynamic, both in the sense of being pixel by pixel, and also globally, because it is dependent on the range scaling factor. In some examples, this can be considered extrapolation if the recovery map value is less than 0 or greater than 1.
- In some implementations, the LDR image from the image container is nonlinearly encoded, e.g., gamma encoded, and the primary image color space of the non-linear LDR image is converted to a linear version before the LDR image is converted to LDR*(x,y) in the logarithmic space for the equation above.
- a color space with a standard RGB (sRGB) transfer function can be transformed to a linear color space that preserves the sRGB color primaries, similarly as described above for block 302 of Fig. 3.
- the LDR image in the container can be linearly encoded, and such a transformation from a non-linear color space is not used.
- the output HDR image can have the same pixel values as the original HDR image (e.g., if the recovery map has the same resolution as the LDR image). Otherwise, the output HDR image resulting from block 526 can be a very close approximation to the original HDR image, where the differences are usually not visually noticeable to users in a display of the output image.
- the output image can be transformed from the logarithmic space to the output color space for display.
- the output image color space can be different from the color space of the LDR image extracted from the image container. Block 526 may be followed by block 534, described below.
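- A sketch of the application step above, assuming a linear LDR image and a full-resolution recovery map with compatible array shapes; the same function applies in block 532 with the display factor substituted for the range scaling factor:

```python
import numpy as np

def apply_recovery_map(ldr_linear, recovery, scale, eps=1e-6):
    """Derive an HDR rendition from a linear LDR image and a recovery map.

    Implements HDR*(x,y) = LDR*(x,y) + log(scale) * recovery(x,y) in log space and
    returns the result in linear space. `scale` is the range scaling factor here
    (block 526) or the display factor when output is capped to the target display
    (block 532). `eps` is only a small guard against log(0); ldr_linear and
    recovery are assumed to broadcast against each other (e.g., recovery expanded
    to match the channel axis).
    """
    ldr_log = np.log(np.maximum(ldr_linear, eps))
    hdr_log = ldr_log + np.log(scale) * recovery
    return np.exp(hdr_log)
```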
- If it is determined in block 524 to scale the luminance based on the target display device, the method proceeds to block 528, in which the maximum luminance output of the target display device is determined.
- the target display device can be an HDR display device, and these types of devices are capable of displaying maximum luminance values that may vary based on model, manufacturer, etc. of the display device.
- the maximum luminance output of the target display device can be obtained via an operating system call or other source.
- Block 528 may be followed by block 530, described below.
- In block 530, a display factor is determined.
- the display factor is used to scale the dynamic range of the output HDR image that is to be displayed, as described below.
- In some implementations, the display factor is determined to be equal to or less than the minimum of the maximum luminance output of the target display device and the range scaling factor. For example, this can be stated as: display_factor <= min(maximum_display_luminance, range_scaling_factor)
- This allows the luminance scaling of the output image to be capped at the maximum display luminance of the target display device. This prevents scaling the output image to a luminance range that is present in the original HDR image (as indicated by the range scaling factor) but that is greater than the maximum luminance output range of the target display device.
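- A minimal sketch of the capping above, assuming the maximum display luminance is expressed in the same units as the range scaling factor (i.e., as a multiple of the SDR white level):

```python
def compute_display_factor(max_display_luminance, range_scaling_factor):
    """Scaling actually applied to the output image, capped by what the display can show.

    Assumes max_display_luminance is expressed as a multiple of SDR white, i.e., in
    the same units as the range scaling factor.
    """
    return min(max_display_luminance, range_scaling_factor)
```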
- the maximum display luminance may be a dynamic characteristic of the target display device, e.g., may be adjustable by user settings, application program, etc.
- the display factor can be set to a luminance that is greater than the range scaling factor and is less than or equal to the maximum display luminance. For example, this can be performed if the target display device can output a greater dynamic range than the dynamic range present in the original HDR image (which can be indicated in the range scaling factor).
- a user through user settings or selection input may indicate that the output image be brightened.
- the display factor can be set to a luminance that is below the maximum display luminance of the target display device and the maximum luminance of the original HDR image.
- the display factor can be set based on a particular display luminance of the display device indicated by one or more display conditions, by user input, user preferences or settings, etc., e.g., to cause output of a lower luminance level that may be desired in some cases or applications as described below.
- Block 530 may be followed by block 532, described below.
- In block 532, the luminance gains of the recovery map are applied to the LDR image and the luminance of the LDR image is scaled based on the display factor to determine corresponding pixels of the output HDR image.
- the luminance gains encoded in the recovery map are applied to pixel luminances of the LDR image to determine respective corresponding pixel values of the output image that is to be displayed.
- These pixel values are scaled based on the display factor determined in block 530, e.g., based on a particular luminance output of the target display device (which can be the maximum luminance output of the target display device).
- this allows the highlight luminances in the LDR image to be increased to a level that the display device is capable of displaying, up to a maximum level based on the dynamic range of the original HDR image; or allows the shadow luminances in the LDR image to be decreased to a level that the display device is capable of displaying, down to a lower limit based on the dynamic range of the original HDR image.
- the LDR image and display factor are converted to a logarithmic space, and the following equation can be used:
- HDR*(x,y) = LDR*(x,y) + log(display_factor) * recovery(x,y)
- HDR*(x,y) is pixel (x,y) of the recovered HDR image (output image) in the logarithmic space
- LDR*(x,y) is the pixel (x,y) of the logarithmic space version of the extracted LDR image (e.g., log(LDR(x,y)))
- log(display factor) is the display factor in the logarithmic space
- recovery(x,y) is the recovery map value at pixel (x,y).
- recovery(x,y) is the normalized pixel gain in the logarithmic space.
- interpolation can be considered to be provided between the LDR and HDR versions of the image when the per-pixel gain is applied, e.g., recovery_map(x,y) * display_factor.
- This can be considered essentially interpolating between the original version and the range-compressed version of an image, and the interpolation amount is dynamic, both in the sense of being pixel by pixel, and also globally, because it is dependent on the display factor. In some examples, this can be considered extrapolation if the recovery map value is less than 0 or greater than 1.
- the LDR image from the image container is nonlinearly encoded, e.g., gamma encoded, and the primary image color space of the non-linear LDR image is converted to a linear version before the LDR image is converted to LDR*(x,y) in the logarithmic space for the equation above.
- a color space with a standard RGB (sRGB) transfer function can be transformed to a linear color space that preserves the sRGB color primaries, similarly as described above for block 302 of Fig. 3.
- the LDR image in the container can be linearly encoded, and such a transformation from a non-linear color space is not used.
- the output HDR image can have the same pixel values as the original HDR image (e.g., if the recovery map has the same resolution as the LDR image). Otherwise, the output HDR image resulting from block 532 can be a very close approximation to the original HDR image, where the differences are usually not visually noticeable to users in a display of the output image.
- the output image can be transformed from the logarithmic space to the output color space for display.
- the output image color space can be different from the color space of the LDR image extracted from the image container. Block 532 may be followed by block 534.
- In block 534, the output HDR image determined in block 526 or in block 532 is caused to be displayed by the target display device.
- the output HDR image is displayed by an HDR-compatible display device that is capable of displaying a greater dynamic range than the range of the LDR image.
- the output HDR image generated from the image container can be the same as the original HDR image that was used to create the image container, due to the recovery map storing the luminance information, or may be approximately the same and typically indistinguishable visually from the original HDR image.
- the output HDR image (from block 526 or block 532) may be suitable for display by the target display device, e.g., as a linear image or other output HDR image format.
- the output HDR image may have been processed via the display factor of block 532, and may not need to be modified further for display, such that the output HDR image resulting from application of the recovery map is directly rendered for display.
- the output HDR image can be converted to a different format for display by the target display device, e.g., to an image format that may be more suitable for display by the target display device than the output HDR image.
- the output HDR image from block 526 (or from block 532 in some implementations) can be converted to a luminance range that is suited to the capabilities of the target display (which, in some implementations, can be in addition to the use of a display factor as in block 532).
- the output HDR image can be converted to a standard image having a standard image format, e.g., HLG/PQ or other standard HDR format.
- a generic tone mapping technique can be applied to such a standard image for display by the target display device.
- an output HDR image that has been processed for the target display device via the display factor of block 532, without being further converted to a standard format and processed for display using a generic tone mapping technique, may provide a higher quality image display on the target device than an output HDR image that is so converted and processed via a generic tone mapping technique.
- local tone mapping provided via the recovery map can provide higher quality images than generic or global tone mapping, e.g., greater detail.
- a technique can be used to output the image to the display device in a linear format, e.g., an extended range format where 0 represents black, 1 represents SDR white, and values above 1 represent HDR brightnesses.
- such a technique can avoid issues that arise when global tone mapping techniques are used to handle the display capabilities of the target display device.
- One or more techniques described herein can handle display device display capabilities via application of the recovery map via a display factor as described above, so that extended range values do not exceed display capabilities of the display device.
- a device can scale the pixel luminances of an LDR image based on a maximum luminance output of an HDR display device and based on the luminance gains stored in the recovery map.
- the scaling of the LDR image can be used to produce an HDR output image that reproduces the full dynamic range of the original HDR image used to create the image container (if the target display device is capable of displaying that full range).
- the scaling of the LDR image can set, for the output HDR image, an arbitrary dynamic range above the range of the LDR image.
- the dynamic range can also be below the range of the original HDR image.
- the output image luminances can be scaled based on the capability of the target display device and/or other criteria, or can be scaled lower than the original HDR image and/or lower than the maximum capability of the target display device so as to reduce the displayed brightness in the image that, e.g., may be visually uncomfortable or fatiguing for viewers.
- a fatigue-reducing algorithm and one or more light sensors in communication with a device implementing the algorithm can be used to detect one or more of various factors such as ambient light around the display device, time of day, etc. to determine a reduced dynamic range or reduced maximum luminance for the output image.
- the reduced luminance level of the output image can be obtained by scaling the recovery map values and/or display factor that are used in determining an HDR image as described for some implementations herein, e.g., scale the range_scaling_factor and/or display factor in the example equations indicated above.
- a device may determine that its battery power level is below a threshold level and that battery power is to be conserved (e.g., by avoiding high display brightness), in which case the device causes display of HDR images at less than the brightness of the original HDR image.
- the luminances of the LDR image can be scaled to a range greater (e.g., higher) than the dynamic range of the original HDR image, e.g., if the target display device has a greater dynamic range than the original HDR image.
- the dynamic range of the output HDR image is not constrained to the dynamic range of the original HDR image, nor to any particular high dynamic range.
- the maximum of the dynamic range of the output HDR image can be lower than the maximum dynamic range of the original HDR image and/or lower than the maximum dynamic range of the target output device.
- display of the output HDR image can be gradually scaled from a lower luminance level to the full (maximum) luminance level of the output HDR image (the maximum luminance level determined as described above).
- the gradual scaling can be performed over a particular period of time, such as 2-4 seconds (which can be user configurable in some implementations).
- Gradual scaling can avoid user discomfort that may result from a large and sudden increase of brightness when displaying a higher-luminance HDR image after displaying a lower luminance (e.g., display of an LDR image).
- Gradual scaling from the LDR image to the HDR version of that image can be performed, for example, by scaling the recovery map values and/or display factor used in determining an HDR image as described herein, e.g., in multiple steps to gradually brighten the LDR image to the HDR version of the image.
- the range scaling factor and/or display_factor can be scaled in the equations indicated above.
- multiple scaling operations can be performed in succession to obtain intermediate images of progressively higher dynamic range over the period of time.
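- A sketch of generating intermediate display factors for such gradual scaling, assuming a log-space ramp from the LDR rendition (factor 1.0) up to the target factor; the duration and step count are illustrative:

```python
import numpy as np

def gradual_display_factors(target_factor, duration_s=3.0, steps=30):
    """Intermediate display factors for gradually brightening from the LDR rendition.

    Ramps in log space from 1.0 (no boost, i.e., the base LDR appearance) up to the
    target display factor; each returned factor is applied with the recovery map in
    turn to render a progressively brighter intermediate image.
    """
    t = np.linspace(0.0, 1.0, steps)
    factors = np.exp(t * np.log(target_factor))
    return factors, duration_s / steps  # factors and the per-step display interval
```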
- Other HDR image formats that scale LDR images to HDR images using global tonemapping techniques as described above instead of local tonemapping as described for the HDR format herein may have poorer display quality when performing such gradual scaling or may not support gradual scaling.
- this gradual scaling of HDR image luminance can be performed or triggered for particular display conditions.
- Such conditions can include, for example, when the output HDR image is displayed immediately after display of an LDR image by the display device.
- a grid of LDR images can be displayed, and the user selects one of these images to fill the screen with a display of the HDR version of the selected image (where the HDR image can be determined based on the recovery element from an associated image container as described above).
- the HDR image can be initially displayed at the lower luminance level of the LDR image and gradually updated (e.g., by progressively displaying the image content with higher dynamic range, increasing in increments up to its maximum luminance level), e.g., using fade-in and fade-out animations.
- the display conditions to trigger gradual scaling can include an overall screen brightness that is set at a low level as described herein (e.g., based on an ambient low light level around the device).
- the overall screen brightness can have a luminance level below a particular threshold level to trigger the gradual scaling of HDR luminance.
- an output HDR image can be immediately displayed at its maximum luminance without the gradual scaling, e.g., when one or more HDR images are being displayed and are to be replaced by the output HDR image.
- In such a case, the output (second) HDR image can be immediately displayed at its maximum luminance, since the user is already accustomed to viewing the increased luminance of the previously displayed (first) HDR image.
- the device that provides the display of the output image can receive user input from a user (or from another source, e.g., another device) that instructs to modify the display of the output image, e.g., to increase or reduce the dynamic range of the display, up to the maximum dynamic range of the display device and/or the original HDR image.
- In response to receiving user input that modifies the display, the device can modify the display factor (described above) in accordance with the user input, which changes the dynamic range of the displayed output image.
- For example, if the user input indicates a reduced dynamic range, the display factor can be reduced by a corresponding amount and the output image displayed with a resulting lower dynamic range. This change of display can be provided in a real-time manner in response to user input.
- the device providing the display of the output image can receive input that instructs to modify or edit the output image, e.g., input from a user or other source, e.g., other device.
- In some implementations, the recovery map can be discarded after such a modification, e.g., since it may no longer properly apply to the modified LDR image.
- In other implementations, the recovery map can be preserved after such modification.
- the recovery map can be updated based on the modification to the output image to reflect the modified version of the LDR image.
- If the modification is performed by a user on an HDR display and/or via an HDR-aware editing application program, then characteristics of the modification are known in the greater dynamic range and the recovery map can be determined based on the edited image.
- the recovery map can be decoded to a full-resolution image and presented in the editor as an alpha channel, other channel, or data associated with the output image.
- the recovery map can be displayed in the editor as an extra channel in a list of channels for the image.
- the LDR image and the HDR output image can both be displayed in the editor, to provide visualization of modifications to the image in both versions.
- corresponding edits are made to the recovery map and to the LDR image (base image).
- the user can decide whether to edit RGB, or RGBA (red green blue alpha), for the image.
- the user can edit the recovery map (e.g., alpha channel) manually, and view the displayed HDR image change while the displayed LDR image stays the same.
- an overall dynamic range control can be provided in the image editing interface, which can adjust the range scaling factor higher or lower. Edit operations to images such as cropping, resampling, etc. can cause the alpha channel to be updated in the same way as the RGB channels.
- an LDR image can be saved back into the image container.
- the saved LDR image may have been edited in the interface, or may be a new LDR image generated via local tone mapping from an edited HDR image or generated based on the edited HDR image scaled by (e.g., divided by) the recovery map at each pixel.
- An edited recovery map can be saved back into the image container.
- the recovery map can be processed back into a bilateral grid, if such encoding is used; or the image and recovery map can be saved losslessly, with greater storage requirements, e.g., if additional editing to the image is to be performed.
- an editor or any other program can convert an LDR image and recovery map of the described image format into a standard HDR image, e.g., a 10-bit HDR AVIF image or an image of a different HDR format.
- various blocks of methods 200, 300, 400, and/or 500 may be combined, split into multiple blocks, performed in parallel, or performed asynchronously. In some implementations, one or more blocks of these methods may not be performed or may be performed in a different order than shown in these figures. For example, in various implementations, blocks 210 and 212 of Fig. 2, and/or blocks 302 and 304 of Fig. 3, can be performed in different orders or in parallel. Methods 200, 300, 400, and/or 500, or portions thereof, may be repeated any number of times using additional inputs. For example, in some implementations, method 200 may be performed when one or more new images are received by a device performing the method and/or are stored in a user’s image library.
- the HDR image and LDR image can be reversed in their respective roles described above.
- an original HDR image can be stored in the image container (e.g., as the base image, in block 218) instead of an original LDR image that has lower dynamic range than the HDR image.
- a recovery map can be determined and stored in the image container that indicates how to determine a derived output LDR image from the base HDR image for an LDR display device that can only display a lower dynamic range that is lower than the dynamic range of the HDR image.
- the recovery map can include gains and/or a scaling factor that are based on the base HDR image and the original LDR image, where the original LDR image, in some implementations, can be a tone-mapped image (or otherwise a range-converted version of the HDR image), similarly as described above.
- Some of the described implementations can enable applications that provide HDR images to obtain and output higher quality LDR images based on tone mappings of the HDR images, without the limitations of current techniques (e.g., HLG/PQ transfer functions) that provide global or generic tone mapping for an entire image.
- the local tone mapping provided by use of the recovery map techniques as described herein, without applying global or generic tone mapping, can enable higher fidelity conversions to LDR images from HDR images, e.g., for uses such as sharing images, or uploading images to services or applications that only support LDR images.
- a luminance control can be provided for display of HDR images such as the HDR images described herein.
- the HDR luminance control can be adjusted by the user of the device to designate a maximum luminance (e.g., brightness) at which HDR images are displayed by the display device.
- the maximum luminance of the HDR image can be determined as a maximum of the average luminance of the displayed pixels of the HDR image, since such an average may be indicative of perceived image brightness to users.
- the maximum luminance can be determined as a maximum of the brightest pixel of the HDR image, as a maximum of one or more particular pixels of the HDR image (or a maximum of an average of such particular pixels), and/or based on one or more other characteristics of the HDR image. If the luminance control is set to a value below the maximum displayable luminance of HDR images, the device reduces the maximum luminance of the output HDR image to a lower luminance level that is based on the luminance control.
- a reduced maximum luminance level for HDR images can reduce the expenditure of battery power by the device providing the display due to the lower brightness of the display.
- the HDR luminance control can be implemented as a global or system luminance control provided by the device (e.g., provided by an operating system executing on the device) that controls HDR image luminance for applications that execute on the device and which display HDR images.
- the luminance control can be applied to all such applications that execute on the device, all applications of a particular type that execute on the device (e.g., image viewing and image editing applications), and/or particular applications designated by the user. This allows the user, for example, to designate a single luminance adjustment for all (or many) HDR images displayed by a device, without having to adjust luminance for each individual displayed HDR image.
- Such a system luminance control can provide consistency of display by all applications that execute on the device and also can provide more control to the user as to the maximum luminance of displayed HDR images.
- the HDR luminance control can be implemented with other system controls on a device, e.g., as accessibility controls, user preferences, user settings, etc.
- the value set by the system luminance control can be used to adjust the output of an HDR image in addition to any other scaling factors being used to display that image, e.g., the display factor based on the maximum display device luminance as described above, individual luminance settings for an image, etc.
- the system luminance control can be implemented as a slider or a value that can be adjusted by the user, e.g., to indicate a maximum luminance value or a percentage of maximum displayable luminance for HDR images by the display device (e.g., where the maximum luminance can be determined as the maximum of average luminance of the pixels of the HDR image, the maximum of the brightest pixel or particular pixels of the HDR image, or a maximum based on one or more other image characteristics). For example, if the luminance control is set to 70% by the user, HDR images can be displayed at 70% of their maximum luminance (e.g., 70% of the maximum luminance of the display device if that is used to determine maximum luminance as described above).
- the system luminance control can allow a user to designate different maximum HDR image luminances for different screen luminances. For example, if the entire screen is displaying content at a higher screen luminance, the maximum HDR image luminance can be higher, e.g., 90% or 100%, since there is not much contrast between the overall screen luminance and the HDR image luminance. At low screen luminance, the maximum HDR image luminance can be lower, e.g., 50% or 60%, to reduce the contrast between the overall screen luminance and HDR image luminance.
- the system luminance control can be provided as a setting (e.g., via a slider or other interface control) that is relative to the maximum SDR image luminance.
- the setting could set the maximum HDR luminance value to a value that is N times the maximum SDR image luminance, where N has a maximum settable value that is based on the maximum luminance of the display device used to display the images.
- a setting value of 1 can indicate a maximum luminance of the SDR (base) image, and a setting value that is a maximum value or setting of the luminance control indicates to display at the maximum luminance of the display device.
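- A minimal sketch of such a setting, assuming the SDR white level and the display maximum are given in nits and that the function and parameter names are illustrative:

```python
def hdr_max_output_luminance(setting_n, sdr_white_nits, display_max_nits):
    """Maximum HDR output luminance implied by a system control set to N x SDR white.

    setting_n = 1.0 displays HDR images at the SDR (base image) luminance; larger
    settings allow brighter output, never exceeding the display's maximum luminance.
    """
    return min(setting_n * sdr_white_nits, display_max_nits)
```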
- variable settings for HDR image luminance can be input via user interface controls.
- the user can define a curve that has a variable maximum HDR luminance vs. current screen luminance.
- multiple predetermined HDR luminance configurations can be provided in a user interface, and the user can select one of these configurations for use on the device.
- each configuration can provide a different set of maximum HDR luminance values associated with particular ranges of screen luminance values.
- a toggle control can be provided that, if selected by the user, removes the HDR luminance such that HDR images are displayed at LDR luminance. For example, if the toggle is selected by the user, an associated recovery map or recovery element (as described herein) can be ignored and the base image is displayed.
- Fig. 6 is an illustration of an example approximation of an image 600 that has a high dynamic range (the full dynamic range of the HDR image is not presentable in this representation; e.g., it is an LDR depiction of an HDR scene).
- HDR image 600 may have been captured by a camera that is capable of capturing HDR images.
- In image 600, the image regions 602, 604, and 606 are shown in detail, and the greater dynamic range enables these regions to be portrayed as they naturally appear to the human observer.
- Fig. 7 shows an example of a representation of an LDR image 700 that has a lower dynamic range than the HDR image 600, such as the low dynamic range of a standard JPEG image.
- LDR image 700 depicts the same scene as in image 600.
- image 700 may be an LDR image captured by a camera.
- the sky image region 702 is exposed properly, but the foreground image regions 704 and 706 are shadows that are not fully exposed due to lack of range compression in the image, thus causing these regions to be too dark and to lose visual detail.
- Fig. 8 is another example of an image 800 that has a lower dynamic range than the HDR image 600 similarly to LDR image 700, and depicts the same scene as in images 600 and 700.
- image 800 may be an LDR image captured by a camera.
- the sky image region 802 is overexposed and thus its detail is washed out, while the foreground image regions 804 and 806 are exposed properly and include appropriate detail.
- Fig. 9 is an example of a range-compressed LDR image 900.
- image 900 can be locally tone mapped from HDR image 600.
- the local tone mapping can be performed using any of a variety of tone mapping techniques.
- the shadows of image 700 can be increased in luminance while preserving the contrast of the edges in the image.
- Although image 900 has a lower dynamic range than image 600 (which is not perceivable in these figures), it can preserve the detail over more of the image than the image 700 of Fig. 7 and image 800 of Fig. 8.
- a range-compressed image such as image 900 can be used as the LDR image in the image format described herein, and can be included in the described image container as an LDR version of a provided HDR image.
- Fig. 10 is a block diagram of an example device 1000 which may be used to implement one or more features described herein.
- device 1000 may be used to implement a client device, e.g., any of client devices 120-126 shown in Fig. 1.
- device 1000 can implement a server device, e.g., server device 104.
- device 1000 may be used to implement a client device, a server device, or both client and server devices.
- Device 1000 can be any suitable computer system, server, or other electronic or hardware device as described above.
- One or more methods described herein can operate in several environments and platforms, e.g., as a standalone computer program that can be executed on any type of computing device, as a web application having web pages, a program run on a web browser, a mobile application (“app”) run on a mobile computing device (e.g., cell phone, smart phone, tablet computer, wearable device (wristwatch, armband, jewelry, headwear, virtual reality goggles or glasses, augmented reality goggles or glasses, head mounted display, etc.), laptop computer, etc.).
- a client/server architecture can be used, e.g., a mobile computing device (as a client device) sends user input data to a server device and receives from the server the final output data for output (e.g., for display).
- all computations can be performed within the mobile app (and/or other apps) on the mobile computing device.
- computations can be split between the mobile computing device and one or more server devices.
- device 1000 includes a processor 1002, a memory 1004, and an input/output (I/O) interface 1006.
- Processor 1002 can be one or more processors and/or processing circuits to execute program code and control basic operations of the device 1000.
- a “processor” includes any suitable hardware system, mechanism or component that processes data, signals or other information.
- a processor may include a system with a general-purpose central processing unit (CPU) with one or more cores (e.g., in a single-core, dual-core, or multi-core configuration), multiple processing units (e.g., in a multiprocessor configuration), a graphics processing unit (GPU), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a complex programmable logic device (CPLD), dedicated circuitry for achieving functionality (e.g., one or more hardware image decoders and/or video decoders), a special-purpose processor to implement neural network model-based processing, neural circuits, processors optimized for matrix computations (e.g., matrix multiplication), or other systems.
- processor 1002 may include one or more co-processors that implement neural-network processing.
- processor 1002 may be a processor that processes data to produce probabilistic output, e.g., the output produced by processor 1002 may be imprecise or may be accurate within a range from an expected output. Processing need not be limited to a particular geographic location, or have temporal limitations. For example, a processor may perform its functions in “real-time,” “offline,” in a “batch mode,” etc. Portions of processing may be performed at different times and at different locations, by different (or the same) processing systems.
- a computer may be any processor in communication with a memory.
- Memory 1004 is typically provided in device 1000 for access by the processor 1002, and may be any suitable processor-readable storage medium, such as random access memory (RAM), read-only memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Flash memory, etc., suitable for storing instructions for execution by the processor, and located separate from processor 1002 and/or integrated therewith.
- Memory 1004 can store software operating on the server device 1000 by the processor 1002, including an operating system 1008, image application 1010 (e.g., which may be image application 106 of Fig. 1), other applications 1012, and application data 1014.
- Other applications 1012 may include applications such as a data display engine, web hosting engine, map applications, image display engine, notification engine, social networking engine, media display applications, communication applications, media sharing applications, etc.
- the image application 1010 can include instructions that enable processor 1002 to perform functions described herein, e.g., some or all of the methods of Figs. 2-5 and/or 10-12.
- images stored in the formats described herein can be stored as application data 1014 or other data in memory 1004, and/or on other storage devices of one or more other devices in communication with device 1000.
- image application 1010 can include an image encoding / video encoding and container creation module(s) (e.g., performing methods of Figs. 2-4) and/or an image decoding / video decoding module(s) (e.g., performing the methods of Fig. 5), or such modules can be integrated into fewer or a single module or application.
- any of the software in memory 1004 can alternatively be stored on any other suitable storage location or computer-readable medium.
- memory 1004 (and/or other connected storage device(s)) can store one or more messages, one or more taxonomies, electronic encyclopedia, dictionaries, digital maps, thesauruses, knowledge bases, message data, grammars, user preferences, and/or other instructions and data used in the features described herein.
- Memory 1004 and any other type of storage can be considered “storage” or “storage devices.”
- I/O interface 1006 can provide functions to enable interfacing the server device 1000 with other systems and devices. Interfaced devices can be included as part of the device 1000 or can be separate and communicate with the device 1000. For example, network communication devices, storage devices (e.g., memory and/or database), and input/output devices can communicate via I/O interface 1006. In some implementations, the I/O interface can connect to interface devices such as input devices (keyboard, pointing device, touchscreen, microphone, camera, scanner, sensors, etc.) and/or output devices (display devices, speaker devices, printers, motors, etc.).
- Some examples of interfaced devices that can connect to I/O interface 1006 can include one or more display devices 1020 that can be used to display content, e.g., images, video, and/or a user interface of an application as described herein.
- Display device 1020 can be connected to device 1000 via local connections (e.g., display bus) and/or via networked connections and can be any suitable display device.
- Display device 1020 can include any suitable display device such as an LCD, LED, or plasma display screen, CRT, television, monitor, touchscreen, 3-D display screen, or other visual display device.
- Display device 1020 may also act as an input device, e.g., a touchscreen input device.
- display device 1020 can be a flat display screen provided on a mobile device, multiple display screens provided in glasses or a headset device, or a monitor screen for a computer device.
- the I/O interface 1006 can interface to other input and output devices.
- Some examples include one or more cameras which can capture images and/or detect gestures.
- Some implementations can provide a microphone for capturing sound (e.g., as a part of captured images, voice commands, etc.), a radar or other sensors for detecting gestures, audio speaker devices for outputting sound, or other input and output devices.
- Fig. 10 shows one block for each of processor 1002, memory 1004, I/O interface 1006, and software blocks 1008-1014. These blocks may represent one or more processors or processing circuitries, operating systems, memories, I/O interfaces, applications, and/or software modules.
- device 1000 may not have all of the components shown and/or may have other elements including other types of elements instead of, or in addition to, those shown herein. While some components are described as performing blocks and operations as described in some implementations herein, any suitable component or combination of components of environment 100, device 1000, similar systems, or any suitable processor or processors associated with such a system, may perform the blocks and operations described.
- Methods described herein can be implemented by computer program instructions or code, which can be executed on a computer.
- the code can be implemented by one or more digital processors (e.g., microprocessors or other processing circuitry) and can be stored on a computer program product including a non-transitory computer-readable medium (e.g., storage medium), such as a magnetic, optical, electromagnetic, or semiconductor storage medium, including semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), flash memory, a rigid magnetic disk, an optical disk, a solid-state memory drive, etc.
- the program instructions can also be contained in, and provided as, an electronic signal, for example in the form of software as a service (SaaS) delivered from a server (e.g., a distributed system and/or a cloud computing system).
- one or more methods can be implemented in hardware (logic gates, etc.), or in a combination of hardware and software.
- Example hardware can be programmable processors (e.g., Field-Programmable Gate Arrays (FPGAs), Complex Programmable Logic Devices (CPLDs)), general-purpose processors, graphics processors, Application Specific Integrated Circuits (ASICs), and the like.
- One or more methods can be performed as part of, or as a component of, an application running on the system, or as an application or software running in conjunction with other applications and operating systems.
- a user may be provided with controls allowing the user to make an election as to both if and when systems, programs, or features described herein may enable collection of user information (e.g., information about a user’s social network, social actions, or activities, profession, a user’s preferences, or a user’s current location), and if the user is sent content or communications from a server.
- certain data may be treated in one or more ways before it is stored or used, so that personally identifiable information is removed.
- a user’s identity may be treated so that no personally identifiable information can be determined for the user, or a user’s geographic location may be generalized where location information is obtained (such as to a city, ZIP code, or state level), so that a particular location of a user cannot be determined.
- the user may have control over what information is collected about the user, how that information is used, and what information is provided to the user.
Priority Applications (9)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202380011737.0A CN118302789A (en) | 2022-10-31 | 2023-05-31 | High dynamic range image format with low dynamic range compatibility |
| JP2023576390A JP2024543288A (en) | 2022-10-31 | 2023-05-31 | High dynamic range image format compatible with low dynamic range |
| KR1020257011651A KR20250057064A (en) | 2022-10-31 | 2023-05-31 | High dynamic range image format with low dynamic range compatibility |
| EP23734822.2A EP4392929A1 (en) | 2022-10-31 | 2023-05-31 | High dynamic range image format with low dynamic range compatibility |
| KR1020237042406A KR102798092B1 (en) | 2022-10-31 | 2023-05-31 | High dynamic range image format with low dynamic range compatibility |
| EP23813946.3A EP4548297A1 (en) | 2022-10-31 | 2023-10-30 | High dynamic range video formats with low dynamic range compatibility |
| CN202380057462.4A CN119678180A (en) | 2022-10-31 | 2023-10-30 | High dynamic range video format with low dynamic range compatibility |
| KR1020257003229A KR20250029222A (en) | 2022-10-31 | 2023-10-30 | High dynamic range video format with low dynamic range compatibility |
| PCT/US2023/036292 WO2024097135A1 (en) | 2022-10-31 | 2023-10-30 | High dynamic range video formats with low dynamic range compatibility |
Applications Claiming Priority (4)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263421155P | 2022-10-31 | 2022-10-31 | |
| US63/421,155 | 2022-10-31 | | |
| US202363439271P | 2023-01-16 | 2023-01-16 | |
| US63/439,271 | 2023-01-16 | | |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| WO2024096931A1 (en) | 2024-05-10 |
Family
ID=87036106
Family Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2023/023998 WO2024096931A1 (en) | 2022-10-31 | 2023-05-31 | High dynamic range image format with low dynamic range compatibility |

Country Status (1)

| Country | Link |
|---|---|
| WO (1) | WO2024096931A1 (en) |
Cited By (1)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| WO2025027555A1 (en) * | 2023-08-01 | 2025-02-06 | Imax Corporation | Structural fidelity index for tone mapped videos |
Patent Citations (2)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| US20150237322A1 * | 2014-02-19 | 2015-08-20 | DDD IP Ventures, Ltd. | Systems and methods for backward compatible high dynamic range/wide color gamut video coding and rendering |
| US20220092749A1 * | 2020-09-23 | 2022-03-24 | Apple Inc. | Backwards-Compatible High Dynamic Range (HDR) Images |
Non-Patent Citations (2)

- IS&T ELECTRONIC IMAGING (EI) SYMPOSIUM: "EI 2023 Plenary: Embedded Gain Maps for Adaptive Display of High Dynamic Range Images", 17 January 2023 (2023-01-17), XP093086658, Retrieved from the Internet <URL:https://www.youtube.com/watch?v=HBVBLV9KZNI&ab_channel=IS&TElectronicImaging(EI)Symposium> [retrieved on 20230927]
- JARON SCHNEIDER: "You Don't Need an HDR Display to See Android 14's 'Ultra HDR' Photos | PetaPixel", 12 May 2023 (2023-05-12), XP093087526, Retrieved from the Internet <URL:https://petapixel.com/2023/05/12/you-dont-need-an-hdr-display-to-see-android-14s-ultra-hdr-photos/> [retrieved on 20230929]
Legal Events

| Code | Title | Description |
|---|---|---|
| WWE | WIPO information: entry into national phase | Ref document number: 202380011737.0; Country of ref document: CN |
| ENP | Entry into the national phase | Ref document number: 2023576390; Country of ref document: JP; Kind code of ref document: A |
| WWE | WIPO information: entry into national phase | Ref document number: 202347087211; Country of ref document: IN |
| ENP | Entry into the national phase | Ref document number: 2023734822; Country of ref document: EP; Effective date: 20231214 |
| WWE | WIPO information: entry into national phase | Ref document number: 18860510; Country of ref document: US |
| WWD | WIPO information: divisional of initial PCT application | Ref document number: 1020257011651; Country of ref document: KR |
| WWP | WIPO information: published in national office | Ref document number: 1020257011651; Country of ref document: KR |
| NENP | Non-entry into the national phase | Ref country code: DE |