US9892522B2 - Method, apparatus and computer program product for image-driven cost volume aggregation - Google Patents
- Publication number
- US9892522B2
- Authority
- US
- United States
- Prior art keywords
- cost volume
- sampled
- reference image
- level
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/14—Picture signal circuitry for video frequency region
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/97—Determining parameters from multiple pictures
-
- H04N13/0018—
-
- H04N13/0022—
-
- H04N13/0037—
-
- H04N13/0239—
-
- H04N13/0271—
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/122—Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/128—Adjusting depth or disparity
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/15—Processing image signals for colour aspects of image signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/239—Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/271—Image signal generators wherein the generated image signals comprise depth maps or disparity maps
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10012—Stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/133—Equalising the characteristics of different image components, e.g. their average brightness or colour balance
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/207—Image signal generators using stereoscopic image cameras using a single 2D image sensor
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
- H04N2013/0077—Colour aspects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
- H04N2013/0081—Depth or disparity estimation from stereoscopic image signals
Definitions
- Various implementations relate generally to method, apparatus, and computer program product for image-driven cost volume aggregation.
- Various electronic devices such as cameras, mobile phones, and other devices are used for capturing multimedia content such as two or more images of a scene. Captured images, for example stereoscopic images, may be used for object detection and post-processing applications. Some post-processing applications include disparity and/or depth estimation of the objects in multimedia content such as images, videos, and the like. Although electronic devices are capable of supporting applications that capture objects in stereoscopic images and/or videos, such capture and post-processing applications, for example cost aggregation for estimating depth, involve intensive computations.
- a method comprising: computing a cost volume associated with a reference image; performing down-sampling of the cost volume into at least one level to generate at least one down-sampled cost volume; performing down-sampling of the reference image into the at least one level to generate at least one down-sampled reference image associated with corresponding at least one down-sampled cost volume; performing backward up-sampling of the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled cost volume; performing backward up-sampling of the at least one down-sampled reference image associated with the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled reference image associated with corresponding backward up-sampled cost volume; computing at least one color weight map associated with at least one of the cost volume and the at least one down-sampled cost volume based on an associated reference image and the at least one backward up-sampled reference image
- an apparatus comprising at least one processor; and at least one memory comprising computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least: compute a cost volume associated with a reference image; perform down-sampling of the cost volume into at least one level to generate at least one down-sampled cost volume; perform down-sampling of the reference image into the at least one level to generate at least one down-sampled reference image associated with corresponding at least one down-sampled cost volume; perform backward up-sampling of the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled cost volume; perform backward up-sampling of the at least one down-sampled reference image associated with the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled reference image associated with corresponding backward up-sampled cost volume; compute at least one color weight map
- a computer program product comprising at least one computer-readable storage medium, the computer-readable storage medium comprising a set of instructions, which, when executed by one or more processors, cause an apparatus to perform at least: compute a cost volume associated with a reference image; perform down-sampling of the cost volume into at least one level to generate at least one down-sampled cost volume; perform down-sampling of the reference image into the at least one level to generate at least one down-sampled reference image associated with corresponding at least one down-sampled cost volume; perform backward up-sampling of the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled cost volume; perform backward up-sampling of the at least one down-sampled reference image associated with the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled reference image associated with corresponding backward up-sampled cost volume; compute at least one color weight
- an apparatus comprising: means for performing down-sampling of the cost volume into at least one level to generate at least one down-sampled cost volume; means for performing down-sampling of the reference image into the at least one level to generate at least one down-sampled reference image associated with corresponding at least one down-sampled cost volume; means for performing backward up-sampling of the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled cost volume; means for performing backward up-sampling of the at least one down-sampled reference image associated with the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled reference image associated with corresponding backward up-sampled cost volume; means for computing at least one color weight map associated with at least one of the cost volume and the at least one down-sampled cost volume based on an associated reference image and the at least one backward up-sampled reference image at the at least one
- a computer program comprising program instructions which when executed by an apparatus, cause the apparatus to: compute a cost volume associated with a reference image; perform down-sampling of the cost volume into at least one level to generate at least one down-sampled cost volume; perform down-sampling of the reference image into the at least one level to generate at least one down-sampled reference image associated with corresponding at least one down-sampled cost volume; perform backward up-sampling of the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled cost volume; perform backward up-sampling of the at least one down-sampled reference image associated with the at least one down-sampled cost volume into the at least one level to generate at least one backward up-sampled reference image associated with corresponding backward up-sampled cost volume; compute at least one color weight map associated with at least one of the cost volume and the at least one down-sampled cost volume based on an associated reference image
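- The sequence of steps recited in the aspects above can be illustrated with a minimal sketch for a single level. It is not the claimed implementation. The absolute-difference matching cost, the 2x2 averaging filter, the nearest-neighbour filter standing in for backward up-sampling, the exponential form of the color weight, and every function name and parameter (`compute_cost_volume`, `downsample2`, `upsample2`, `color_weight_map`, `sigma`) are assumptions chosen only to make the data flow concrete; the text here does not specify any of them.

```python
import math

def compute_cost_volume(ref, tgt, max_disp):
    """Absolute-difference cost for each disparity d in [0, max_disp] (assumed cost)."""
    h, w = len(ref), len(ref[0])
    volume = []
    for d in range(max_disp + 1):
        # The target pixel is read d columns to the left, clamped at the border.
        volume.append([[abs(ref[y][x] - tgt[y][max(x - d, 0)]) for x in range(w)]
                       for y in range(h)])
    return volume  # indexed as volume[d][y][x]

def downsample2(img):
    """Halve width and height by averaging each 2x2 block (assumed filter)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]

def upsample2(img):
    """Nearest-neighbour stand-in for backward up-sampling (assumption)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out

def color_weight_map(ref, ref_up, sigma=10.0):
    """Weight close to 1 where the up-sampled image agrees with the reference."""
    return [[math.exp(-abs(a - b) / sigma) for a, b in zip(r0, r1)]
            for r0, r1 in zip(ref, ref_up)]

# Tiny grayscale pair: the target shows the same edge one pixel further left.
ref = [[10, 10, 80, 80] for _ in range(4)]
tgt = [[10, 80, 80, 80] for _ in range(4)]

volume = compute_cost_volume(ref, tgt, max_disp=1)  # full-resolution cost volume
volume_down = [downsample2(s) for s in volume]      # down-sampled cost volume
ref_down = downsample2(ref)                         # down-sampled reference image
volume_up = [upsample2(s) for s in volume_down]     # back at the finer level
ref_up = upsample2(ref_down)                        # back at the finer level
weights = color_weight_map(ref, ref_up)             # per-pixel color weights
```

- In this toy input the down-sampled reference image reconstructs the original exactly, so every weight equals 1; on real images the weights would fall off near color edges that the coarser level cannot represent.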
- FIG. 1 illustrates a device, in accordance with an example embodiment
- FIG. 2 illustrates an example block diagram of an apparatus, in accordance with an example embodiment
- FIG. 3 illustrates an example representation of a cost volume associated with a reference image, in accordance with an example embodiment
- FIG. 4 illustrates an example representation of down-sampling and backward up-sampling of a cost volume, in accordance with an example embodiment
- FIG. 5 illustrates an example representation of backward up-sampling, in accordance with an example embodiment
- FIG. 6 is a flowchart depicting an example method, in accordance with an example embodiment.
- FIG. 7 is a flowchart depicting an example method for image-driven cost volume aggregation, in accordance with another example embodiment.
- Example embodiments and their potential effects are understood by referring to FIGS. 1 through 7 of the drawings.
- FIG. 1 illustrates a device 100 in accordance with an example embodiment. It should be understood, however, that the device 100 as illustrated and hereinafter described is merely illustrative of one type of device that may benefit from various embodiments and, therefore, should not be taken to limit the scope of the embodiments. As such, it should be appreciated that at least some of the components described below in connection with the device 100 may be optional, and thus an example embodiment may include more, fewer, or different components than those described in connection with the example embodiment of FIG. 1.
- the device 100 could be any of a number of types of electronic devices, for example, portable digital assistants (PDAs), pagers, mobile televisions, gaming devices, cellular phones, all types of computers (for example, laptops, mobile computers or desktops), cameras, audio/video players, radios, global positioning system (GPS) devices, media players, mobile digital assistants, or any combination of the aforementioned, and other types of communications devices.
- the device 100 may include an antenna 102 (or multiple antennas) in operable communication with a transmitter 104 and a receiver 106 .
- the device 100 may further include an apparatus, such as a controller 108 or other processing device that provides signals to and receives signals from the transmitter 104 and receiver 106 , respectively.
- the signals may include signaling information in accordance with the air interface standard of the applicable cellular system, and/or may also include data corresponding to user speech, received data and/or user generated data.
- the device 100 may be capable of operating with one or more air interface standards, communication protocols, modulation types, and access types.
- the device 100 may be capable of operating in accordance with any of a number of first, second, third and/or fourth-generation communication protocols or the like.
- the device 100 may be capable of operating in accordance with second-generation (2G) wireless communication protocols IS-136 (time division multiple access (TDMA)), GSM (global system for mobile communication), and IS-95 (code division multiple access (CDMA)), or with third-generation (3G) wireless communication protocols, such as Universal Mobile Telecommunications System (UMTS), CDMA2000, wideband CDMA (WCDMA) and time division-synchronous CDMA (TD-SCDMA), with a 3.9G wireless communication protocol such as evolved-universal terrestrial radio access network (E-UTRAN), with fourth-generation (4G) wireless communication protocols, or the like.
- computer networks such as the Internet, local area networks, wide area networks, and the like; short range wireless communication networks such as Bluetooth® networks, Zigbee® networks, Institute of Electrical and Electronics Engineers (IEEE) 802.11x networks, and the like; and wireline telecommunication networks such as the public switched telephone network (PSTN).
- the controller 108 may include circuitry implementing, among others, audio and logic functions of the device 100 .
- the controller 108 may include, but is not limited to, one or more digital signal processor devices, one or more microprocessor devices, one or more processor(s) with accompanying digital signal processor(s), one or more processor(s) without accompanying digital signal processor(s), one or more special-purpose computer chips, one or more field-programmable gate arrays (FPGAs), one or more controllers, one or more application-specific integrated circuits (ASICs), one or more computer(s), various analog to digital converters, digital to analog converters, and/or other support circuits. Control and signal processing functions of the device 100 are allocated between these devices according to their respective capabilities.
- the controller 108 thus may also include the functionality to convolutionally encode and interleave messages and data prior to modulation and transmission.
- the controller 108 may additionally include an internal voice coder, and may include an internal data modem.
- the controller 108 may include functionality to operate one or more software programs, which may be stored in a memory.
- the controller 108 may be capable of operating a connectivity program, such as a conventional Web browser.
- the connectivity program may then allow the device 100 to transmit and receive Web content, such as location-based content and/or other web page content, according to a Wireless Application Protocol (WAP), Hypertext Transfer Protocol (HTTP) and/or the like.
- the controller 108 may be embodied as a multi-core processor such as a dual or quad core processor. However, any number of processors may be included in the controller 108 .
- the device 100 may also comprise a user interface including an output device such as a ringer 110 , an earphone or speaker 112 , a microphone 114 , a display 116 , and a user input interface, which may be coupled to the controller 108 .
- the user input interface, which allows the device 100 to receive data, may include any of a number of devices, such as a keypad 118, a touch display, a microphone, or other input device.
- the keypad 118 may include numeric (0-9) and related keys (#, *), and other hard and soft keys used for operating the device 100 .
- the keypad 118 may include a conventional QWERTY keypad arrangement.
- the keypad 118 may also include various soft keys with associated functions.
- the device 100 may include an interface device such as a joystick or other user input interface.
- the device 100 further includes a battery 120 , such as a vibrating battery pack, for powering various circuits that are used to operate the device 100 , as well as optionally providing mechanical vibration as a detectable output.
- the device 100 includes a media-capturing element, such as a camera, video and/or audio module, in communication with the controller 108 .
- the media-capturing element may be any means for capturing an image, video and/or audio for storage, display or transmission.
- the camera module 122 may include a digital camera (or array of multiple cameras) capable of forming a digital image file from a captured image.
- the camera module 122 includes all hardware, such as a lens or other optical component(s), and software for creating a digital image file from a captured image.
- the camera module 122 may include the hardware needed to view an image, while a memory device of the device 100 stores instructions for execution by the controller 108 in the form of software to create a digital image file from a captured image.
- the camera module 122 may further include a processing element such as a co-processor, which assists the controller 108 in processing image data and an encoder and/or decoder for compressing and/or decompressing image data.
- the encoder and/or decoder may encode and/or decode according to a JPEG standard format or another like format.
- the encoder and/or decoder may employ any of a plurality of standard formats such as, for example, standards associated with H.261, H.262/MPEG-2, H.263, H.264, H.264/MPEG-4, MPEG-4, and the like.
- the camera module 122 may provide live image data to the display 116 .
- the display 116 may be located on one side of the device 100 and the camera module 122 may include a lens positioned on the opposite side of the device 100 with respect to the display 116 to enable the camera module 122 to capture images on one side of the device 100 and present a view of such images to the user positioned on the other side of the device 100 .
- the camera module(s) can be on any side of the device 100, but are normally on the opposite side from the display 116 or on the same side as the display 116 (for example, video call cameras).
- the device 100 may further include a user identity module (UIM) 124 .
- the UIM 124 may be a memory device having a processor built in.
- the UIM 124 may include, for example, a subscriber identity module (SIM), a universal integrated circuit card (UICC), a universal subscriber identity module (USIM), a removable user identity module (R-UIM), or any other smart card.
- the UIM 124 typically stores information elements related to a mobile subscriber.
- the device 100 may be equipped with memory.
- the device 100 may include volatile memory 126 , such as volatile random access memory (RAM) including a cache area for the temporary storage of data.
- the device 100 may also include other non-volatile memory 128 , which may be embedded and/or may be removable.
- the non-volatile memory 128 may additionally or alternatively comprise an electrically erasable programmable read only memory (EEPROM), flash memory, hard drive, or the like.
- the memories may store any number of pieces of information, and data, used by the device 100 to implement the functions of the device 100 .
- FIG. 2 illustrates an apparatus 200 for image-driven cost volume aggregation in multimedia content associated with a scene, in accordance with an example embodiment.
- the apparatus 200 may be employed for performing depth estimation using stereo-matching.
- the apparatus 200 may be employed in a stereo camera for 3D image capture.
- the apparatus 200 may be employed, for example, in the device 100 of FIG. 1 .
- the apparatus 200 may also be employed on a variety of other devices both mobile and fixed, and therefore, embodiments should not be limited to application on devices such as the device 100 of FIG. 1 .
- embodiments may be employed on a combination of devices including, for example, those listed above.
- various embodiments may be embodied wholly at a single device, (for example, the device 100 ) or in a combination of devices.
- the devices or elements described below may not be mandatory and thus some may be omitted in certain embodiments.
- the apparatus 200 includes or otherwise is in communication with at least one processor 202 and at least one memory 204 .
- Examples of the at least one memory 204 include, but are not limited to, volatile and/or non-volatile memories.
- Examples of the volatile memory include, but are not limited to, random access memory, dynamic random access memory, static random access memory, and the like.
- Examples of the non-volatile memory include, but are not limited to, hard disks, magnetic tapes, optical disks, programmable read only memory, erasable programmable read only memory, electrically erasable programmable read only memory, flash memory, and the like.
- the memory 204 may be configured to store information, data, applications, instructions or the like for enabling the apparatus 200 to carry out various functions in accordance with various example embodiments.
- the memory 204 may be configured to buffer input data comprising media content for processing by the processor 202 .
- the memory 204 may be configured to store instructions for execution by the processor 202 .
- the processor 202 may include the controller 108 .
- the processor 202 may be embodied in a number of different ways.
- the processor 202 may be embodied as a multi-core processor, a single core processor, or a combination of multi-core processors and single core processors.
- the processor 202 may be embodied as one or more of various processing means such as a coprocessor, a microprocessor, a controller, a digital signal processor (DSP), processing circuitry with or without an accompanying DSP, or various other processing devices including integrated circuits such as, for example, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like.
- the multi-core processor may be configured to execute instructions stored in the memory 204 or otherwise accessible to the processor 202 .
- the processor 202 may be configured to execute hard coded functionality.
- the processor 202 may represent an entity, for example, physically embodied in circuitry, capable of performing operations according to various embodiments while configured accordingly.
- the processor 202 may be specifically configured hardware for conducting the operations described herein.
- the instructions may specifically configure the processor 202 to perform the algorithms and/or operations described herein when the instructions are executed.
- the processor 202 may be a processor of a specific device, for example, a mobile terminal or network device adapted for employing embodiments by further configuration of the processor 202 by instructions for performing the algorithms and/or operations described herein.
- the processor 202 may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processor 202 .
- a user interface (UI) 206 may be in communication with the processor 202 .
- Examples of the user interface 206 include, but are not limited to, input interface and/or output user interface.
- the input interface is configured to receive an indication of a user input.
- the output user interface provides an audible, visual, mechanical or other output and/or feedback to the user.
- Examples of the input interface may include, but are not limited to, a keyboard, a mouse, a joystick, a keypad, a touch screen, soft keys, and the like.
- Examples of the output interface may include, but are not limited to, a display such as a light emitting diode display, a thin-film transistor (TFT) display, a liquid crystal display, an active-matrix organic light-emitting diode (AMOLED) display, a microphone, a speaker, ringers, vibrators, and the like.
- the user interface 206 may include, among other devices or elements, any or all of a speaker, a microphone, a display, and a keyboard, touch screen, or the like.
- the processor 202 may comprise user interface circuitry configured to control at least some functions of one or more elements of the user interface 206 , such as, for example, a speaker, ringer, microphone, display, and/or the like.
- the processor 202 and/or user interface circuitry comprising the processor 202 may be configured to control one or more functions of one or more elements of the user interface 206 through computer program instructions, for example, software and/or firmware, stored on a memory, for example, the at least one memory 204 , and/or the like, accessible to the processor 202 .
- the apparatus 200 may include an electronic device.
- Examples of the electronic device include communication devices, media capturing devices with communication capabilities, computing devices, and the like.
- Some examples of the electronic device may include a mobile phone, a personal digital assistant (PDA), and the like.
- Some examples of computing device may include a laptop, a personal computer, and the like.
- Some examples of electronic device may include a camera.
- the electronic device may include a user interface, for example, the UI 206 , having user interface circuitry and user interface software configured to facilitate a user to control at least one function of the electronic device through use of a display and further configured to respond to user inputs.
- the electronic device may include display circuitry configured to display at least a portion of the user interface of the electronic device. The display and display circuitry may be configured to facilitate the user to control at least one function of the electronic device.
- the electronic device may be embodied so as to include a transceiver.
- the transceiver may be any device operating or circuitry operating in accordance with software or otherwise embodied in hardware or a combination of hardware and software.
- the processor 202 operating under software control, or the processor 202 embodied as an ASIC or FPGA specifically configured to perform the operations described herein, or a combination thereof, thereby configures the apparatus 200 or circuitry to perform the functions of the transceiver.
- the transceiver may be configured to receive media content. Examples of media content may include images, audio content, video content, data, and a combination thereof.
- the electronic device may be embodied so as to include at least one image sensor, such as an image sensor 208 and an image sensor 210. Though only two image sensors 208 and 210 are shown in the example representation of FIG. 2, the electronic device may include more than two image sensors or only one image sensor.
- the image sensors 208 and 210 may be in communication with the processor 202 and/or other components of the apparatus 200 .
- the image sensors 208 and 210 may be in communication with other imaging circuitries and/or software, and are configured to capture digital images or to capture video or other graphic media.
- the image sensors 208 and 210 and other circuitries, in combination, may be an example of at least one camera module such as the camera module 122 of the device 100.
- the image sensors 208 and 210 may also be configured to capture a plurality of multimedia content, for example images, videos, and the like depicting a scene from different positions (or different angles).
- the image sensors 208 and 210 may be accompanied with corresponding lenses to capture two views of the scene, such as stereoscopic views.
- there may be a single camera module having an image sensor that captures an image of the scene from a position (x), is then moved through a distance to another position (y), and captures another image of the scene.
- the centralized circuit system 212 may be various devices configured to, among other things, provide or enable communication between the components ( 202 - 210 ) of the apparatus 200 .
- the centralized circuit system 212 may be a central printed circuit board (PCB) such as a motherboard, main board, system board, or logic board.
- the centralized circuit system 212 may also, or alternatively, include other printed circuit assemblies (PCAs) or communication channel media.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to facilitate access of a first image and a second image.
- the first image and the second image may include slightly different views of a scene having one or more objects.
- the first image may be a reference image for which a depth map may be determined.
- the second image may be a target image that may represent another viewpoint or another time moment with respect to the first image.
- the first image and the second image may be a left view image and a right view image, respectively, of the same scene.
- the first image and the second image may be frames associated with a video.
- the first image and the second image of the scene may be captured such that there exists a disparity in at least one object point of the scene between the first image and the second image.
- the first image and the second image may form a stereoscopic pair of images.
- a stereo camera may capture the first image and the second image, such that, the first image includes a slight parallax with the second image representing the same scene.
- the first image and the second image may also be received from a camera capable of capturing multiple views of the scene, for example, a multi-baseline camera, an array camera, a plenoptic camera and a light field camera.
- the first image and the second image may be prerecorded or stored in an apparatus, for example the apparatus 200 , or may be received from sources external to the apparatus 200 .
- the apparatus 200 is caused to receive the first image and the second image from an external storage medium such as a DVD, Compact Disk (CD), flash drive, or memory card, or from external storage locations through the Internet, Bluetooth®, and the like.
- a processing means may be configured to facilitate access of the first image and the second image of the scene comprising one or more objects, where there exists a disparity in at least one object of the scene between the first image and the second image.
- An example of the processing means may include the processor 202 , which may be an example of the controller 108 , and/or the image sensors 208 and 210 .
- the first image and the second image may include various portions being located at different depths with respect to a reference location.
- the ‘depth’ of a portion in an image may refer to a distance of the object points (for example, pixels) constituting the portion from a reference location, such as a camera location.
- the first image and the second image may include depth information for various object points associated with the respective images.
- the terms ‘depth’ and ‘disparity’ may be used interchangeably in various embodiments.
- the disparity is inversely proportional to the depth of the scene. The disparity may be related to the depth as per the following equation:
- $D \propto \dfrac{b \cdot f}{d}$, where
- D describes the depth
- b represents the baseline between the two cameras capturing the pair of stereoscopic images, for example, the first image and the second image
- f is the focal length for each camera
- d is the disparity value for two corresponding object points.
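The inverse relation between disparity and depth can be illustrated with a short sketch. It assumes the standard pinhole relation D = b·f/d for a rectified pair, with the focal length and the disparity expressed in pixels; the function name and the numeric values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(baseline, focal_length, disparity):
    """Depth D = b * f / d for a rectified stereo pair (assumed pinhole model)."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return baseline * focal_length / disparity


# Hypothetical values: 0.1 m baseline, 800 px focal length.
# Halving the disparity doubles the estimated depth.
near = depth_from_disparity(baseline=0.1, focal_length=800.0, disparity=40.0)
far = depth_from_disparity(baseline=0.1, focal_length=800.0, disparity=20.0)
```

Because of this one-to-one relation, a disparity map and a depth map carry the same information once b and f are known.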
- the first image and the second image accessed by the apparatus 200 may be a rectified stereoscopic pair of images with respect to each other.
- instead of accessing the rectified stereoscopic pair of images, the apparatus 200 may be caused to access at least one stereoscopic pair of images that may not be rectified.
- the apparatus 200 may be caused to rectify the at least one stereoscopic pair of images to generate rectified images such as the first image and the second image.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to rectify one of the stereoscopic pair of images with respect to the other image such that a row (for example, a horizontal line) in the image may correspond to a row (for example, a horizontal line) in the other image.
- an orientation of one of the at least one stereoscopic pair of images may be changed relative to the other image such that a horizontal line passing through a point in one of the images may correspond to an epipolar line associated with the point in the other image.
- every object point in one image has a corresponding epipolar line in the other image.
- a processing means may be configured to rectify the at least one stereoscopic pair of images such that a horizontal line in one of the images may correspond to a horizontal line in the other image of the at least one pair of stereoscopic images.
- An example of the processing means may include the processor 202 , which may be an example of the controller 108 .
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to facilitate stereo correspondence between the first image and the second image of the scene by estimating a three dimensional (3D) model of the scene through determination of matching cost associated with matching of pixels between the first image and the second image, and converting two-dimensional (2D) positions of the matching pixels into 3D depths.
- the 3D model may refer to cost volume between the first image and the second image.
- stereo-correspondence includes the steps of matching cost computation, cost aggregation, disparity computation and disparity optimization.
- the stereo-correspondence may be performed via optimization of the cost volume computed from a pair of images, such as the first image and the second image.
- the first image and the second image may be utilized for constructing a 3D cost volume which may include the matching costs for selecting a disparity at image co-ordinates (x, y).
- the matching cost may be determined by performing pixel-wise correlation between corresponding pixels of the stereo pair of images (e.g., the first image and the second image).
- the cost volume may include a plurality of disparity values and a pixel value difference corresponding to the disparity values.
- the cost volume for a rectified stereo pair of images may include a plurality of slices/layers that may be computed based on a measure of dis-similarity between the corresponding pixels of the first image and the second image for different shifts of the second image.
- the first image may be a reference image and the second image may be a target image
- the cost volume between the reference image and the target image may be computed by shifting the target image in a direction and in response determining a plurality of shifted versions of target image.
- the plurality of shifted versions of the target image may be referred to as ‘plurality of shifted images’ for the brevity of description.
- the cost aggregation associated with depth estimation is described to be performed by computing the cost volume for the left image as an example, however similar description is applicable to cost volume computation for the right image.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to compute the cost volume based on a cost computation method.
- cost computation methods may include but are not limited to sum of absolute differences (SAD), sum of squared differences (SSD), and the like.
- a processing means may be configured to perform cost computation between the reference image and the plurality of shifted images, and determine the cost volume.
- An example of the processing means may include the processor 202 , which may be an example of the controller 108 .
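The cost computation between the reference image and the shifted versions of the target image can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes grayscale images stored as lists of rows, a per-pixel absolute difference (a SAD window of one pixel), and a border policy that reuses the nearest valid column. The function name and the border policy are assumptions.

```python
def compute_cost_volume(reference, target, max_disparity):
    """Return volume[d][y][x] = |reference[y][x] - target[y][x - d]|.

    Pixels whose shifted position falls outside the target image reuse the
    nearest valid column (an assumed border policy).
    """
    height, width = len(reference), len(reference[0])
    volume = []
    for d in range(max_disparity + 1):
        cost_slice = []
        for y in range(height):
            row = []
            for x in range(width):
                xs = max(x - d, 0)  # target image shifted by d pixels
                row.append(abs(reference[y][x] - target[y][xs]))
            cost_slice.append(row)
        volume.append(cost_slice)
    return volume


# One-row example: the reference is the target shifted right by one pixel,
# so the slice for d = 1 has zero cost everywhere.
target = [[10, 20, 30, 40, 50]]
reference = [[10, 10, 20, 30, 40]]
volume = compute_cost_volume(reference, target, max_disparity=2)
```

Each entry of `volume` is one slice of the cost volume, with the same width and height as the reference image.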
- the cost volume may be associated with the reference image.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to perform a down-sampling of the cost volume into at least one level.
- the down-sampling of the cost volume into the at least one level includes the down-sampling of the cost volume into a single level.
- the down-sampling of the cost volume into the at least one level may include recursively down-sampling the cost volume into a plurality of levels.
- the down-sampling of the cost volume into the plurality of levels may refer to down-sampling of individual slices of the cost volume at each level of the plurality of levels.
- the recursive down-sampling of the cost volume into the plurality of levels may be performed to generate a hierarchical structure, also known as a Gaussian pyramid such that the plurality of levels of down-sampled cost volumes may correspond to layers of the Gaussian pyramid.
- the Gaussian pyramid of the cost volume may include a hierarchical structure having ‘N’ layers (corresponding to the plurality of levels) of the down-sampled cost volumes. It may be noted that on down-sampling the cost volume, a disparity range associated with the down-sampled cost volume is not decimated, and remains same as the disparity range of the original cost volume.
- a processing means may be configured to perform at least one down-sampling of the cost volume to generate the at least one down-sampled cost volume.
- An example of the processing means may include the processor 202 , which may be an example of the controller 108 .
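The recursive, slice-wise down-sampling can be sketched as below. The sketch swaps in a simple box average over non-overlapping blocks in place of Gaussian filtering, assumes a scaling factor of 2, and uses hypothetical function names. Note that only the spatial size of each slice shrinks; the number of slices (the disparity range) stays the same at every level.

```python
def downsample_slice(cost_slice, factor=2):
    """Average non-overlapping factor x factor blocks of one slice."""
    height, width = len(cost_slice), len(cost_slice[0])
    out = []
    for y in range(0, height - factor + 1, factor):
        row = []
        for x in range(0, width - factor + 1, factor):
            block = [cost_slice[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def downsample_cost_volume(volume, factor=2):
    """Down-sample every slice; the number of slices is unchanged."""
    return [downsample_slice(s, factor) for s in volume]


def build_pyramid(volume, levels, factor=2):
    """Return [original, level 1, ..., level `levels`] of down-sampled volumes."""
    pyramid = [volume]
    for _ in range(levels):
        pyramid.append(downsample_cost_volume(pyramid[-1], factor))
    return pyramid


# Three slices of size 4x4, each filled with its own disparity index.
volume = [[[float(d)] * 4 for _ in range(4)] for d in range(3)]
pyramid = build_pyramid(volume, levels=2)
```

In this example the slices shrink from 4x4 to 2x2 to 1x1, while every level still holds three slices.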
- the down-sampled cost volume computed at each level (or each layer) of the Gaussian pyramid may be optionally filtered by a spatial filter, such as cross-bilateral filter, a cross-non-local-means filter (where weights are calculated based on the corresponding down-sampled reference image) or similar type of spatial filter.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to perform backward up-sampling of the at least one down-sampled cost volume.
- the down-sampled cost volumes may be up-sampled one step back to a finer level.
- $\tilde{C}_i = \text{backward up-sample}(C_{i+1})$
- the backward up-sampling of the at least one down-sampled cost volume may include recursively up-sampling of the down-sampled cost volumes.
- the backward up-sampling of the sampled cost volumes into the plurality of levels may include backward up-sampling of individual slices of the down-sampled cost volumes.
- a processing means may be configured to perform backward up-sampling of the at least one down-sampled cost volume.
- An example of the processing means may include the processor 202 , which may be an example of the controller 108 .
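A plain (not color-weighted) version of the backward up-sampling step can be sketched as follows. Nearest-neighbour repetition and a scaling factor of 2 are assumptions made to keep the example short, and the function names are hypothetical.

```python
def upsample_slice(coarse_slice, factor=2):
    """Nearest-neighbour up-sampling: repeat each coarse value along both axes."""
    fine = []
    for coarse_row in coarse_slice:
        fine_row = [v for v in coarse_row for _ in range(factor)]
        for _ in range(factor):
            fine.append(list(fine_row))
    return fine


def backward_upsample(coarse_volume, factor=2):
    """Up-sample every slice of a down-sampled cost volume one step back."""
    return [upsample_slice(s, factor) for s in coarse_volume]


coarse = [[[5]], [[7]]]              # two slices of size 1x1
fine = backward_upsample(coarse)     # two slices of size 2x2
```

As with down-sampling, the operation is applied slice by slice, so the disparity range of the up-sampled volume matches that of the finer level.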
- the backward up-sampling of the cost volume may be performed in a color-weighted manner, using pixels associated with the at least one down-sampled cost volume.
- the backward up-sampling of the at least one down-sampled cost volume may be performed based on color parameter values of neighboring pixels associated with an individual pixel in the at least one down-sampled cost volume.
- performing the backward up-sampling in the color-weighted manner may facilitate in efficient up-sampling of pixels that may be close to strong boundaries and may have limited support because of penalized weighting on the finer levels, for example during down-sampling.
- the backward up-sampling of the down-sampled cost volume in a color weighted manner is described further in detail with reference to FIG. 5 .
- the cost volume may be associated with the reference image.
- the reference image may include the color information associated with the cost volume.
- the color information may include intensity (gray-scale) or color of the reference image.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to determine the color information from the reference image associated with the cost volume.
- the at least one down-sampled cost volume may be enhanced with a corresponding color information included in a corresponding reference image.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to determine the corresponding reference image for the at least one down-sampled cost volume.
- the corresponding reference image for the at least one down-sampled cost volume may be determined by recursively down-sampling the reference image associated with the cost volume.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to recursively down-sample the reference image associated with the cost volume, and generate a plurality of down-sampled reference images corresponding to the plurality of levels.
- the down-sampling of the reference image may be performed to generate a Gaussian pyramid.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to perform recursively, backward up-sampling of the at least one down-sampled reference image.
- a processing means may be configured to perform backward up-sampling of the at least one down-sampled reference image.
- An example of the processing means may include the processor 202 , which may be an example of the controller 108 .
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to compute at least one color weight map associated with at least one of the cost volume and the at least one of the down-sampled cost volume based on an associated reference image and the at least one backward up-sampled reference image at the at least one level.
- the color weight map associated with the cost volume may include computing a difference between the reference image and a backward up-sampled reference image associated with the reference image.
- the color weight map associated with the at least one down-sampled cost volume may be computed based on a difference between the down-sampled reference image and the backward up-sampled reference image at the at least one level.
- the difference between the down-sampled reference image and the backward up-sampled reference image at the at least one level may be determined by computing a residual image from a Laplacian pyramid.
- the Laplacian pyramid may be constructed corresponding to the Gaussian pyramid of the down-sampled cost volumes, and may be configured to provide the residual (or difference) image computed based on a difference of a down-sampled reference image and a backward up-sampled (or reconstructed) image at a level, for example, ith level.
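The residual image at one level can be sketched as the difference between an image and its reconstruction from the next coarser level. The sketch assumes a grayscale image with even dimensions, a box average in place of Gaussian filtering, and nearest-neighbour up-sampling; all function names are hypothetical.

```python
def downsample(image, factor=2):
    """Box-average non-overlapping blocks (stand-in for Gaussian filtering)."""
    return [[sum(image[y + dy][x + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(0, len(image[0]) - factor + 1, factor)]
            for y in range(0, len(image) - factor + 1, factor)]


def upsample(image, factor=2):
    """Nearest-neighbour up-sampling back to the finer grid."""
    return [[v for v in row for _ in range(factor)]
            for row in image for _ in range(factor)]


def residual(image, factor=2):
    """Residual = image - upsample(downsample(image)), one Laplacian level."""
    rebuilt = upsample(downsample(image, factor), factor)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(image, rebuilt)]


flat = residual([[7, 7], [7, 7]])      # textureless patch
detailed = residual([[0, 0], [0, 8]])  # patch containing a sharp detail
```

The flat patch yields a zero residual, while the detailed patch yields large residual values where the down-sampling smoothed the detail away.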
- the color weight of a pixel (x,y) at a level i may be given by the following expression:
- $W_i(x,y)$ is a color weight constructed as in a bilateral filter
- $\sigma$ represents an adjusting factor of the bilateral filter.
- in textureless regions, the residual image may include small (nearly zero) values, and accordingly, the color weight map may be nearly equal to 1.
- in highly detailed regions, the down-sampling may result in serious smoothing, so the image details lost to the residual image may be significant, and accordingly the color weight map may be nearly equal to 0.
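A minimal sketch of such a color weight map is given below. It assumes an exponential fall-off exp(-|residual| / sigma) in the style of a bilateral range kernel; this exact form, the grayscale input, the function name, and the value of `sigma` are all assumptions made for illustration.

```python
import math


def color_weight_map(reference, reconstructed, sigma=10.0):
    """W[y][x] = exp(-|I - I_reconstructed| / sigma) (assumed form)."""
    return [
        [math.exp(-abs(a - b) / sigma) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(reference, reconstructed)
    ]


# First pixel: zero residual (textureless). Second pixel: large residual (detail).
weights = color_weight_map([[100, 100]], [[100, 0]], sigma=10.0)
```

With this form, a zero residual gives a weight of exactly 1, and the weight decays toward 0 as the residual grows, matching the behavior described for textureless and detailed regions.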
- an aggregated cost volume for the corresponding level may be determined based on the computed color weight map.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to determine an aggregated cost volume at the at least one level based on a weighted averaging of the backward up-sampled cost volume at the at least one level and an associated cost volume.
- the associated cost volume may include one of the cost volume and the at least one down-sampled cost volume at the at least one level.
- the associated cost volume may be the original cost volume that is associated with the reference image.
- the associated cost volume may be down-sampled cost volumes at the respective levels.
- the weighted averaging is performed based on the color weight map.
- the aggregated cost volume computed at each level, $\hat{C}_i(x,y,d)$, may be again filtered with a spatial filter.
- the spatial filter may include a cross-bilateral filter, a cross-non-local-means filter, and the like.
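The weighted averaging at one level can be sketched as a per-pixel blend of the backward up-sampled cost volume and the cost volume of that level. The blend W * up-sampled + (1 - W) * own cost is an assumption that is consistent with the weight being near 1 in textureless regions and near 0 in detailed regions; the function name is hypothetical.

```python
def aggregate_level(cost, upsampled_cost, weights):
    """Blend per pixel: W * upsampled + (1 - W) * own cost, for every slice.

    `cost` and `upsampled_cost` are indexed [d][y][x]; `weights` is [y][x]
    and is shared by all slices (assumed blending rule).
    """
    aggregated = []
    for own_slice, up_slice in zip(cost, upsampled_cost):
        out_slice = []
        for own_row, up_row, w_row in zip(own_slice, up_slice, weights):
            out_slice.append([w * up + (1.0 - w) * own
                              for own, up, w in zip(own_row, up_row, w_row)])
        aggregated.append(out_slice)
    return aggregated


cost = [[[10.0, 10.0, 10.0]]]        # one slice, one row
upsampled = [[[2.0, 2.0, 2.0]]]
weights = [[1.0, 0.0, 0.5]]
blended = aggregate_level(cost, upsampled, weights)
```

A weight of 1 takes the coarser estimate, a weight of 0 keeps the finer-level cost, and intermediate weights mix the two.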
- the cost volume aggregation described herein facilitates in performing local cost volume aggregation on each level of the plurality of levels, and accordingly based on the color weight map associated with the respective level, it may be determined whether or not to perform cost volume aggregation for the respective level.
- the cost volume aggregation may be performed at the at least one level while performing a fine-to-coarse sampling.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to determine whether to perform cost volume aggregation at that level.
- the cost volume aggregation may be performed at the at least one level while performing a coarse-to-fine sampling, for example while performing backward up-sampling of the down-sampled cost volumes.
- the cost volume aggregation may be performed while performing coarse-to-fine sampling as well as while performing fine-to-coarse sampling.
- the processor 202 is configured to, with the content of the memory 204 , and optionally with other components described herein, to cause the apparatus 200 to recursively update weight between the levels/layers of the Gaussian pyramid and update a cost volume of a respective level based on the computed weight associated with the respective level, until a finest layer of the updated Gaussian pyramid is reached.
- the finest layer of the updated Gaussian pyramid may refer to a last level of the up-sampled cost volume.
- the up-sampling of a down-sampled cost volume associated with a subsequent level is performed from the previously updated cost volume.
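The recursive coarse-to-fine update can be sketched as a loop that starts at the coarsest level and, at each step, up-samples the previously updated cost volume and blends it with the cost volume of the next finer level. Nearest-neighbour up-sampling, the blending rule W * up-sampled + (1 - W) * own cost, and all names are assumptions for illustration.

```python
def upsample(volume, factor=2):
    """Nearest-neighbour up-sampling of every slice (assumed interpolation)."""
    return [[[v for v in row for _ in range(factor)]
             for row in cost_slice for _ in range(factor)]
            for cost_slice in volume]


def blend(own, up, weights):
    """Per-pixel blend W * up + (1 - W) * own, applied to every slice."""
    return [[[w * u + (1.0 - w) * o for o, u, w in zip(o_row, u_row, w_row)]
             for o_row, u_row, w_row in zip(o_slice, u_slice, weights)]
            for o_slice, u_slice in zip(own, up)]


def coarse_to_fine(cost_pyramid, weight_pyramid, factor=2):
    """cost_pyramid[0] is the finest level; weight_pyramid[i] matches level i."""
    updated = cost_pyramid[-1]            # start from the coarsest level
    for level in range(len(cost_pyramid) - 2, -1, -1):
        up = upsample(updated, factor)    # uses the previously updated volume
        updated = blend(cost_pyramid[level], up, weight_pyramid[level])
    return updated


fine = [[[8.0, 8.0], [8.0, 8.0]]]         # one 2x2 slice at the finest level
coarse = [[[4.0]]]                        # one 1x1 slice at the coarser level
weights = [[[1.0, 0.0], [0.5, 0.5]]]      # weight map for the finest level
result = coarse_to_fine([fine, coarse], weights)
```

The loop ends when the finest level has been updated, and its output is the aggregated cost volume at full resolution.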
- FIG. 3 illustrates an example representation of a cost volume 300 computed from a first image and a second image, in accordance with an example embodiment.
- the first image and the second image may be associated with a scene.
- the ‘scene’ refers to an arrangement (natural, manmade, sorted or assorted) of one or more objects of which the images or videos can be captured, or of which the preview can be generated.
- the first image and the second image are a stereoscopic pair of images of the scene captured by a device (for example, a camera module including image sensor 208 and/or 210).
- rectifying the first image and the second image includes aligning the first image and the second image such that the horizontal lines (for example, pixel rows) of the aligned first image correspond to horizontal lines (for example, pixel rows) of the aligned second image.
- the process of rectification for the pair of images transforms planes of the original pair of stereoscopic images to different planes in the pair of rectified images such that resulting epipolar lines are parallel and equal along new scan lines.
- the first image may be a reference image and the second image may be a target image.
- each of the reference image and the target image may include a plurality of pixels.
- one of the images, for example the second image may be shifted in a direction and a matching cost between the pixels of the first image and the shifted second image may be computed at a particular value of shift.
- a fixed image such as the first image may be referred to as a reference image while the shifting image such as the second image may be referred to as a target image.
- the shifting of the target image in a direction may lead to generation of a plurality of shifted images.
- the matching of the corresponding pixels between the reference image and the plurality of shifted images may facilitate in generation of a 3-D volume, known as a cost volume, for example the cost volume 300 .
- the matching costs for the reference image and the plurality of target images may be computed based on methods such as sum of squared differences (SSD), sum of absolute differences (SAD), and the like.
- methods such as SSD and SAD include computing squared differences or absolute differences, respectively, between the reference image and the target image pixel-wise.
- for each target image of the plurality of target images, there may be a SAD/SSD map equal in size to the first image and/or the second image; together these maps form the cost volume.
- the cost volume 300 is shown to include a plurality of SAD/SSD maps (hereinafter, referred to as slices of the cost volume).
- the slices of the cost volume are represented by numerals 302 , 304 , 306 , and 308 .
- the cost volume 300 may be utilized for determination of disparity of each pixel.
- the cost volume 300 may be sampled at least once to facilitate in determination of the depth of the pixels of the first image and/or the second image. The sampling of the cost volume 300 is illustrated and explained in detail with reference to FIG. 4 .
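One common way to read a disparity for each pixel out of a cost volume is a winner-takes-all selection, which picks the slice with the lowest matching cost at each pixel. This selection rule is not taken from the description above; it is shown only as an illustrative sketch, and the function name is hypothetical.

```python
def winner_takes_all(volume):
    """disparity[y][x] = index d of the slice with the lowest cost at (x, y)."""
    height, width = len(volume[0]), len(volume[0][0])
    return [[min(range(len(volume)), key=lambda d: volume[d][y][x])
             for x in range(width)] for y in range(height)]


# Three slices (d = 0, 1, 2) over a single row of two pixels.
volume = [[[5, 1]], [[2, 3]], [[9, 0]]]
disparity = winner_takes_all(volume)
```

For the first pixel the costs are 5, 2, 9, so d = 1 is chosen; for the second pixel the costs are 1, 3, 0, so d = 2 is chosen.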
- FIG. 4 illustrates an example representation of down-sampling and backward up-sampling of a cost volume, for example, a cost volume 410 , in accordance with an example embodiment.
- the cost volume 410 is an example of the cost volume 300 ( FIG. 3 ).
- the down-sampling of the cost volume facilitates in performing cost aggregation in a memory efficient manner.
- the cost aggregation may include performing down-sampling of the cost volume 410 followed by backward up-sampling of the down-sampled cost volume.
- the sampling of the cost volume 410 may include decimation (or down-sampling) applied to the plurality of slices of the cost volume 410 to generate the at least one level of the sampled cost volume.
- the down-sampling of the plurality of slices of the cost volume 410 to generate the at least one level of the down-sampled cost volume may include down-sampling of the cost volume from a finer level to a coarser level, and may be referred to as a fine-to-coarse sampling.
- the backward up-sampling of the plurality of slices of the down-sampled cost volume to the plurality of up-sampled cost volumes may include up-sampling of the cost volume (or the down-sampled cost volume) from the coarser level back to the finer level, and may be referred to as a coarse-to-fine sampling.
- the plurality of slices of the down-sampled cost volume 430 may be recursively up-sampled to generate a plurality of up-sampled cost volumes, for example an up-sampled cost volume 450.
- the down-sampling and the up-sampling may be performed by applying a pre-defined scaling factor f.
- the cost volume 410 ($C_i$) may be down-sampled by a factor f to generate the down-sampled cost volume 430 ($C_{i+1}$).
- the down-sampled cost volume 430 ($C_{i+1}$) may be up-sampled by the scaling factor f to generate the up-sampled cost volume 450 ($\tilde{C}_i$).
- the up-sampling of the down-sampled cost volume at a level facilitates in reconstructing the cost volume at that level (for example, at the level i).
- the sampling of the cost volume is shown for only one level.
- the cost volume 410 is down-sampled to the cost volume 430
- the cost volume 430 is up-sampled to the cost volume 450 to thereby generate a reconstructed cost volume.
- the process of the down-sampling of the cost volume and subsequent up-sampling of the down-sampled cost volume may be performed recursively for a plurality of levels.
- the decimation (or down-sampling) of the cost volume slices does not affect disparity range, and thus coarser cost volume is fully compatible with up-sampled (or finer) cost volume.
- the cost volume 410 may be down-sampled into a plurality of levels to generate a Gaussian pyramid.
- the Gaussian pyramid may include a plurality of levels that may be generated based on recursive down-sampling of the levels of the Gaussian pyramid.
- a cost volume C1 at a level-1 may be down-sampled to generate a down-sampled cost volume at a level-2
- the down-sampled cost volume at the level-2 may further be down-sampled to generate a down-sampled cost volume at a level-3, and so on.
- the cost volume 410 may be down-sampled into a single level. For example as illustrated in FIG. 4 , the cost volume 410 is down-sampled into a single level to generate a down-sampled cost volume 430 .
- the terminology ‘down-sampling of the cost volume’ and ‘backward up-sampling of the sampled cost volume’ may refer to down-sampling and up-sampling respectively of a plurality of slices of the cost volume.
- the cost volume 410 may include a plurality of slices, for example slices 412 , 414 , 416 , 418 , and 420 .
- the slices of the cost volume 410 may be down-sampled to form the respective slices of the down-sampled cost volume 430 .
- the slices 412 , 414 , 416 , 418 and 420 of the cost volume 410 may be down-sampled to form slices 432 , 434 , 436 , 438 , and 440 respectively of the sampled cost volume 430 .
- the down-sampled cost volume, for example the cost volume 430, may be up-sampled such that its slices are up-sampled to form the respective slices of the up-sampled cost volume 450.
- the slices of the cost volume may be down-sampled such that an over-smoothening of the strong depth discontinuities in the estimated disparity map may be avoided.
- at each level of the cost volume pyramid (for example, the Gaussian pyramid), the corresponding down-sampled reference image may be utilized to determine a color weight map for that level, which may be utilized to penalize over-smoothing across boundaries, thereby facilitating color-weighted hierarchical aggregation.
- the cost aggregation may be applied either before subsequent decimation (when creating next coarser level) or after backward coarse-to-fine propagation (when current cost volume has been already fused with the coarser estimate), or even twice.
- a color-weighted up-sampling of the cost volume may be applied while interpolating from the coarser level to the finer level. The color weighted up-sampling of the cost volume is described in detail with reference to FIG. 5 .
- FIG. 5 illustrates an example representation of backward up-sampling of a slice 500 of a cost volume, in accordance with an example embodiment.
- the pixels that are closer to strong boundaries may have limited support because of penalized weighting on finer levels of pyramid.
- a color-weighted backward up-sampling may be performed for the pixels associated with the down-sampled cost volume.
- the slice 500 may be an example of a slice of the down-sampled cost volume 430 .
- a plurality of slices included in the down-sampled cost volume 430 may be backward up-sampled.
- a corresponding pixel in the up-sampled cost volume may be computed based on respective color parameter value of neighboring pixels associated with an individual pixel in the down-sampled cost volume 430 .
- the backward up-sampling may be represented as illustrated in FIG. 5 .
- the grid-lines 502 , 504 , 506 , 508 , 510 and 512 correspond to a grid of coarse pixels
- hollow dots 514 , 516 , 518 , and 520 correspond to the centers of coarse grid.
- a high-resolution pixels grid is shown with lines 522 , 524 , 526 , and 528 .
- the four pixels 514 , 516 , 518 , and 520 from coarser grid correspond to one finer pixel 530 that may be estimated from the four pixels 514 , 516 , 518 , and 520 .
- $\tilde{W}_k(x,y) = \dfrac{W_k(x,y)}{\sum_j W_j(x,y)}$
- $W_k(x,y) = c_k \cdot e^{-\lVert I_i(x,y) - I_{i+1}(u_k,v_k) \rVert}$, and
- $c_k$ is a constant factor, and $\lVert \cdot \rVert$ denotes some color-distance metric, calculated as, for instance, L1 or L2 norm.
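The color-weighted estimate of one finer pixel from its coarse neighbours can be sketched as follows. The sketch assumes grayscale intensities with an L1 distance, constant factors c_k equal to 1 by default, and a final estimate formed as the normalized weighted sum of the coarse cost values; the function name and these choices are assumptions.

```python
import math


def color_weighted_pixel(fine_color, coarse_colors, coarse_costs, c=None):
    """Estimate one fine-level cost from its coarse neighbours.

    W_k = c_k * exp(-|I_i(x, y) - I_{i+1}(u_k, v_k)|), normalised to sum to 1.
    """
    if c is None:
        c = [1.0] * len(coarse_colors)  # assumed constant factors c_k
    raw = [ck * math.exp(-abs(fine_color - col))
           for ck, col in zip(c, coarse_colors)]
    total = sum(raw)
    normalised = [w / total for w in raw]
    return sum(w * cost for w, cost in zip(normalised, coarse_costs))


# All four coarse neighbours share the fine pixel's color: plain average.
uniform = color_weighted_pixel(50, [50, 50, 50, 50], [1.0, 2.0, 3.0, 4.0])

# Only the first neighbour lies on the same side of a strong boundary.
edge = color_weighted_pixel(50, [50, 200, 200, 200], [1.0, 9.0, 9.0, 9.0])
```

Near a strong boundary, the neighbours with a very different color receive almost no weight, so the estimate follows the neighbour on the same side of the boundary. In practice the intensities would likely be scaled so that the exponential does not vanish too quickly.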
- FIG. 6 is a flowchart depicting an example method 600 for image driven cost volume aggregation in images, in accordance with an example embodiment.
- the method 600 includes cost aggregation in images of a scene, where the images of the scene are captured such that there exists a disparity in at least one object of the scene between the images.
- the method 600 depicted in the flow chart may be executed by, for example, the apparatus 200 of FIG. 2 .
- the images such as a first image and a second image of the scene may be accessed.
- the first image and the second image may be accessed from a media capturing device including two sensors and related components, or from external sources such as DVD, Compact Disk (CD), flash drive, memory card, or received from external storage locations through Internet, Bluetooth®, and the like.
- the first image and the second image may include two different views of the scene.
- the first image and the second image are rectified images.
- the first image may be a reference image while the second image may be a target image.
- the target image may be shifted in one direction to thereby facilitate in computation of a cost volume.
- the method 600 includes computing a cost volume based on a matching of the reference image with a plurality of shifted versions of a target image.
- the ‘plurality of shifted versions of the target image’ may hereinafter be referred to as ‘a plurality of shifted images’.
- the plurality of shifted images associated with the target image may be generated by shifting the target image in one direction. The generation of the cost volume is explained in detail with reference to FIG. 3 .
- the cost volume is associated with the reference image.
- the method 600 includes performing a down-sampling of the cost volume into at least one level to generate at least one down-sampled cost volume.
- the down-sampling of the cost volume into the at least one level may include down-sampling the cost volume to generate a single sampled cost volume.
- down-sampling of the cost volume into the at least one level may include down-sampling the cost volume recursively to generate a plurality of down-sampled cost volumes.
- a down-sampling of the reference image into the at least one level may be performed to generate at least one down-sampled reference image associated with at least one corresponding down-sampled cost volume.
- a backward up-sampling of the at least one down-sampled cost volume into the at least one level may be performed to generate at least one backward up-sampled cost volume.
- the backward up-sampling of the at least one down-sampled cost volume may include the backward up-sampling of a down-sampled cost volume at a level, for example ith level.
- the backward up-sampling of the at least one down-sampled cost volume may be performed based on respective color parameters of neighboring pixels associated with an individual pixel in the at least one down-sampled cost volume. An example illustrating and describing the backward up-sampling based on the respective color parameters of neighboring pixels is explained with reference to FIG. 5 .
- a backward up-sampling of the at least one down-sampled reference image associated with the at least one down-sampled cost volume into the at least one level may be performed.
- the backward up-sampling may generate at least one backward up-sampled reference image associated with corresponding backward up-sampled cost volume.
- a color weight map associated with the cost volume and the at least one down-sampled cost volume at the at least one level is computed.
- the color weight map for the cost volume may be computed based on the reference image and a backward up-sampled reference image at a first level.
- the color weight map for the at least one down-sampled cost volume is computed based on a corresponding down-sampled reference image and a backward up-sampled reference image at the at least one level.
- a color weight map at a level 3 of a plurality of levels of the Gaussian pyramid may be determined based on a difference of down-sampled reference image at level 3 and up-sampled reference image at level 3.
- the down-sampled reference image at level 3 is generated by down-sampling a finer level reference image (at level 2), and the up-sampled reference image at level 3 may be generated by up-sampling the down-sampled reference image at level 4 of the Gaussian pyramid.
- the value of color weight map in textureless portions of the image may be nearly 1, while in a high detailed zone, the value of color weight may be nearly equal to 0.
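A color weight map with the behaviour described above (near 1 in textureless regions, near 0 in detailed regions) can be sketched as follows. The exact expression is not reproduced in this text, so the exponential fall-off and the `sigma` parameter are assumptions chosen only to match that qualitative behaviour.

```python
import numpy as np

def color_weight_map(image_i, reconstructed_i, sigma=0.05):
    """Weight in (0, 1]: close to 1 where the residual is small, close to 0 where it is large.

    image_i         : reference image at level i (original or down-sampled)
    reconstructed_i : reference image up-sampled back from level i+1
    sigma           : hypothetical sensitivity parameter (assumption)
    """
    residual = np.abs(np.asarray(image_i, dtype=np.float64)
                      - np.asarray(reconstructed_i, dtype=np.float64))
    return np.exp(-residual / sigma)
```

With images scaled to [0, 1], a smaller `sigma` makes the map switch more sharply between the two regimes.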
- an aggregated cost volume at the at least one level is determined based on a weighted averaging of the at least one backward up-sampled cost volume and an associated cost volume at the at least one level.
- the associated cost volume may be one of the cost volume and the at least one down-sampled cost volume at the at least one level.
- the associated cost volume may be the (original) cost volume that is associated with the reference image.
- the associated cost volume may be the at least one down-sampled cost volume at that level.
- the weighted averaging may be performed based on the color weight map associated with the at least one level.
- Ĉi(x,y,d) represents the aggregated cost volume for the at least one level.
- low-textured pixels may have their values replaced by aggregated values, while highly textured pixels may retain their values during backward up-sampling, thereby avoiding cost aggregation for those pixels.
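The weighted averaging described above can be sketched as a per-pixel blend. The function name `aggregate_level` and the (D, H, W) array layout are assumptions for illustration.

```python
import numpy as np

def aggregate_level(cost_i, upsampled_cost_i, weight_i):
    """Blend the backward up-sampled cost volume with the cost volume of the same level.

    cost_i, upsampled_cost_i : arrays of shape (D, H, W)
    weight_i                 : color weight map of shape (H, W), values in [0, 1]
    """
    w = np.asarray(weight_i, dtype=np.float64)[np.newaxis, :, :]  # broadcast over disparity
    return w * upsampled_cost_i + (1.0 - w) * cost_i
```

Where the weight is 1 the pixel takes the aggregated (up-sampled) cost, and where it is 0 the pixel keeps its own cost, which is the behaviour described for low-textured and highly textured pixels.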
- FIG. 7 is a flowchart depicting an example method 700 for performing image-driven cost volume aggregation, in accordance with another example embodiment.
- the method 700 depicted in the flow chart may be executed by, for example, the apparatus 200 of FIG. 2 .
- the method 700 facilitates in performing cost volume aggregation based on cost volumes associated with the images.
- the method 700 includes performing cost volume aggregation in images associated with a scene in a memory-efficient manner.
- the method 700 facilitates aligning the cost volume associated with a pair of images (including a target image and a reference image) with the reference image, and utilizing intensity information and color information from the reference image for aggregation of the cost volume.
- the method 700 is explained with the help of a stereoscopic pair of images, but it should be noted that the various operations described in the method 700 may be performed on any two or more images of a scene captured by a multi-baseline camera, an array camera, a plenoptic camera or a light-field camera.
- the method 700 may be utilized for applications involving cost volume aggregation such as image segmentation, image colorization and de-colorization, alpha-matting and the like.
- the method 700 includes facilitating receipt of at least one pair of images.
- the at least one pair of images may include stereoscopic images.
- the at least one pair of images may be captured by a stereo camera.
- the at least one pair of images may also be captured by a multi-baseline camera, an array camera, a plenoptic camera or a light-field camera.
- the at least one pair of images may be received at the apparatus 200 or otherwise captured by the sensors.
- the at least one pair of images may not be rectified images with respect to each other.
- the method 700 may include rectifying the at least one pair of images such that rows in the at least one pair of images may correspond to each other.
- the operation of rectification is not required.
- the at least one pair of images may be rectified to generate a rectified pair of images.
- the rectified pair of images may include a first image and a second image.
- the first image and the second image may be shifted and matched to generate a disparity of each pixel.
- the first image and the second image may be a target image and a reference image, respectively, which may be utilized for estimation of disparity between the first image and the second image.
- the target image may be shifted in one direction and a map between the reference image and a plurality of shifted versions of target images may be generated. As illustrated and discussed in FIG.
- a map (or image) equal to the size of the target image may be generated based on a matching of the reference image with one of the plurality of shifted images associated with the target image.
- a 3D space may be created, also known as a cost volume (at block 708 ), wherein the maps (or images) corresponding to the multiple shifts may form a plurality of slices of the cost volume.
- the plurality of slices of the cost volume corresponds to a plurality of matching costs between pixels of the first image and the second image.
- the cost volume may be associated with the reference image.
- the reference image associated with the cost volume may include a color information and intensity information associated with the cost volume.
- the cost volume may be recursively down-sampled to generate a plurality of down-sampled cost volumes.
- the cost volume may be recursively down-sampled to generate a plurality of levels (N) of the cost volume being defined by a plurality of levels of a Gaussian pyramid.
- the cost volume (C) may be down-sampled to generate a first level of the Gaussian pyramid, the first level may be down-sampled to generate a second level, and so on.
- the cost volume computed at each level of the Gaussian pyramid may be filtered by a filter.
- An example illustrating down-sampling of the cost volume between the layers of the Gaussian pyramid is illustrated and explained further with reference to FIG. 4 .
- the reference image associated with the cost volume may be recursively down-sampled into the plurality of levels to generate a plurality of down-sampled reference images corresponding to the plurality of down-sampled cost volumes at block 712 .
- the down-sampling of the reference image may be performed to generate a pyramid.
- the plurality of down-sampled reference images may be up-sampled backwards recursively to generate a corresponding plurality of backward up-sampled (or reconstructed) reference images.
- a coarse-to-fine up-sampling of the plurality of down-sampled cost volumes may be performed for reconstructing the cost volume.
- the coarse-to-fine up-sampling may be performed based on a color weight map associated with each level of the down-sampled cost volume.
- a color weight map associated with the plurality of levels of the cost-volumes may be determined based on the color information associated with the respective level of the plurality of cost-volumes.
- N represents the number of levels associated with the Gaussian pyramid. For example, if the cost volume is recursively down-sampled so as to generate 5 levels of the Gaussian pyramid, then the value of N would be 5, and the level to be considered at block 716 is 4.
- a down-sampled cost volume associated with the (i+1)th level may be backward up-sampled to generate a backward up-sampled cost volume at the ith level. For example, in a first iteration, if the level under consideration is 4 (for example, where a total number of levels in the Gaussian pyramid is 5), then the down-sampled cost volume associated with the 5th level may be backward up-sampled. In an embodiment, backward up-sampling of the down-sampled cost volume at the (i+1)th level may include up-sampling of the down-sampled cost volume at the (i+1)th level.
- the backward up-sampling of the down-sampled cost volume may be performed based on respective color parameter value of neighboring pixels associated with an individual pixel in the down-sampled cost volume.
- the backward up-sampling of the down-sampled cost volume based on the color parameter value of neighboring pixels of the individual pixels is already explained with reference to FIG. 5 .
- the residual image may be computed based on a difference of the backward up-sampled reference image at the ith level and a down-sampled reference image at the ith level.
- a color weight map associated with the ith level of the down-sampled cost volume may be computed based on the residual image.
- the color weight map for ith level may be computed based on the residual image associated with the ith level being generated at block 720 .
- the color weight map associated with an ith level of the cost volume may be given by the following expression:
- the residual image may include small (nearly zero) values, and accordingly, the color weight map may be nearly equal to 1.
- the down-sampling may result in serious smoothing and hence details may be significant, and accordingly the color weight map may be nearly equal to 0.
- an aggregated cost volume for the corresponding level may be determined based on the computed color weight map at block 724 .
- the aggregated cost volume for the ith level may be determined based on a weighted averaging of the at least one backward up-sampled cost volume at the ith level and an associated cost volume at the ith level.
- a level subsequent to the ith level may be considered, and an aggregated cost volume for the subsequent level may be determined by following blocks 718 to 724 until a last level of the plurality of levels is reached.
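The coarse-to-fine loop over the levels can be sketched end to end as below. Several simplifications are assumptions of this sketch and not statements about the patented method: a 2x2 block average replaces the Gaussian kernel, nearest-neighbour repetition replaces the color-guided backward up-sampling, the weight map uses an exponential of the residual, the aggregated result of the coarser level is what gets propagated upward, and the image sides must be divisible by 2^(levels-1).

```python
import numpy as np

def down2(a):
    """Halve the two spatial axes by 2x2 averaging (stand-in for a Gaussian kernel)."""
    return 0.25 * (a[..., 0::2, 0::2] + a[..., 1::2, 0::2]
                   + a[..., 0::2, 1::2] + a[..., 1::2, 1::2])

def up2(a):
    """Double the two spatial axes by pixel repetition (stand-in for guided up-sampling)."""
    return np.repeat(np.repeat(a, 2, axis=-2), 2, axis=-1)

def aggregate_cost_volume(cost, image, levels, sigma=0.05):
    """cost: (D, H, W) cost volume; image: (H, W) grayscale reference image."""
    costs = [np.asarray(cost, dtype=np.float64)]
    images = [np.asarray(image, dtype=np.float64)]
    for _ in range(levels - 1):                 # fine-to-coarse pass
        costs.append(down2(costs[-1]))
        images.append(down2(images[-1]))
    aggregated = costs[-1]                      # coarsest level is taken as is
    for i in range(levels - 2, -1, -1):         # levels N-1, ..., 1 in 1-based numbering
        cost_up = up2(aggregated)               # backward up-sampled cost volume
        residual = images[i] - up2(images[i + 1])
        weight = np.exp(-np.abs(residual) / sigma)   # color weight map (assumed form)
        aggregated = weight * cost_up + (1.0 - weight) * costs[i]
    return aggregated
```

On a perfectly flat reference image every weight equals 1, so the output reduces to the coarsest cost volume repeated up to full resolution; on strongly textured regions the original costs pass through almost unchanged.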
- the method 700 for cost aggregation facilitates in performing local aggregation on each level of the plurality of levels, and accordingly based on the color weight map associated with the respective levels it may be determined whether or not to perform cost aggregation for the respective level.
- the cost volume aggregation may be performed at the plurality of levels while performing a fine-to-coarse sampling. For example, on down-sampling of a level of the cost volume, the method 700 may be applied to determine whether to perform cost volume aggregation at that level.
- the cost volume aggregation may be performed at the plurality of levels while performing a coarse-to-fine sampling, for example while performing backward up-sampling. For example, while performing backward up-sampling of the down-sampled cost volume, at each level of the plurality of levels it may be determined whether or not to perform cost volume aggregation at that level.
- the method 700 for cost volume aggregation may be performed while performing coarse-to-fine sampling as well as while performing fine-to-coarse sampling.
- the method 700 for depth estimation may be applied once or more than once for processing of images.
- two depth maps, associated with left and right cameras correspondingly may be estimated by applying method 700 twice.
- the left image may be assumed to be the reference image and right image may be the target image.
- the depth map associated with the left image may be obtained.
- the method 700 may be applied again such that the right image may be assumed as the reference image and the left image may be the target image.
- the depth map associated with the right image may be determined.
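Applying the procedure twice with the roles of the images swapped can be sketched as follows. To stay short, the sketch omits the aggregation step, uses wrap-around shifting (`np.roll`) and picks the lowest-cost disparity per pixel; the lowest-cost selection and all function names are assumptions of this example.

```python
import numpy as np

def cost_volume(reference, target, max_disparity, direction):
    """Absolute-difference cost slices; direction=+1 shifts the target right, -1 left."""
    slices = [np.abs(reference - np.roll(target, direction * d, axis=1))
              for d in range(max_disparity + 1)]
    return np.stack(slices)

def disparity_maps(left, right, max_disparity):
    """Estimate one disparity map per view by swapping reference and target."""
    # First pass: left image is the reference, right image is the target
    disp_left = cost_volume(left, right, max_disparity, +1).argmin(axis=0)
    # Second pass: right image is the reference, left image is the target
    disp_right = cost_volume(right, left, max_disparity, -1).argmin(axis=0)
    return disp_left, disp_right
```

In a fuller pipeline, each cost volume would be aggregated before the per-pixel minimum is taken.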
- the methods for image-driven cost volume aggregation such as methods 600 and 700 may be applied to multiple applications running in parallel in a device, for example, the device 100 ( FIG. 1 ).
- the method 600 may be applied simultaneously for performing stereo matching and image segmentation in a device, for example, the device 100.
- the apparatus such as the apparatus 200 ( FIG. 2 ) embodied in the device 100 may be configured to perform one or more steps of the method(s) 600 / 700 in parallel to thereby facilitate the processes of image segmentation and stereo-matching simultaneously.
- the methods depicted in these flow charts may be executed by, for example, the apparatus 200 of FIG. 2 .
- Operations of the flowchart, and combinations of operation in the flowcharts may be implemented by various means, such as hardware, firmware, processor, circuitry and/or other device associated with execution of software including one or more computer program instructions.
- one or more of the procedures described in various embodiments may be embodied by computer program instructions.
- the computer program instructions, which embody the procedures, described in various embodiments may be stored by at least one memory device of an apparatus and executed by at least one processor in the apparatus.
- Any such computer program instructions may be loaded onto a computer or other programmable apparatus (for example, hardware) to produce a machine, such that the resulting computer or other programmable apparatus embody means for implementing the operations specified in the flowchart.
- These computer program instructions may also be stored in a computer-readable storage memory (as opposed to a transmission medium such as a carrier wave or electromagnetic signal) that may direct a computer or other programmable apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture the execution of which implements the operations specified in the flowchart.
- the computer program instructions may also be loaded onto a computer or other programmable apparatus to cause a series of operations to be performed on the computer or other programmable apparatus to produce a computer-implemented process such that the instructions, which execute on the computer or other programmable apparatus provide operations for implementing the operations in the flowchart.
- the operations of the methods are described with the help of the apparatus 200. However, the operations of the methods can be described and/or practiced by using any other apparatus.
- a technical effect of one or more of the example embodiments disclosed herein is to perform image driven cost volume aggregation in images (for example, in stereoscopic images) of a scene, where there is a disparity between the objects in the images.
- Various embodiments provide techniques for reducing the computational complexity as well as providing memory-efficient solutions for cost volume aggregation by facilitating the cost volume aggregation based on the color image associated with the cost volume.
- Various embodiments facilitate sampling of a cost volume constructed from a reference (or color) image and a target image for performing cost aggregation and subsequent depth estimation in the images.
- Since the down-sampling and backward up-sampling of the cost volumes occur in the cost volume domain and not in the disparity map domain, a backward up-sampling of a disparity range associated with the disparity map is avoided, and thus the coarser cost volumes are compatible with the finer cost volumes. Additionally, since the coarse-to-fine fusion is not affected by different frequency content in the coarse (down-sampled cost volume) and fine (backward up-sampled cost volume) levels, the coarse-to-fine fusion and up-scale propagation are easy and straightforward.
- Various embodiments described above may be implemented in software, hardware, application logic or a combination of software, hardware and application logic.
- the software, application logic and/or hardware may reside on at least one memory, at least one processor, an apparatus, or a computer program product.
- the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media.
- a “computer-readable medium” may be any media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer, with one example of an apparatus described and depicted in FIGS. 1 and/or 2 .
- a computer-readable medium may comprise a computer-readable storage medium that may be any media or means that can contain or store the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer.
- the computer readable medium may be non-transitory.
- the different functions discussed herein may be performed in a different order and/or concurrently with each other. Furthermore, if desired, one or more of the above-described functions may be optional or may be combined.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Processing (AREA)
Abstract
Description
$$C_L(x,y,d) = \lVert L(x,y) - R(x-d,y) \rVert,$$
- Herein, x and y are spatial co-ordinates of the pixel, and the third dimension d corresponds to disparity between left (L) and right (R) images, and
- $C_L(x,y,d)$ refers to the cost volume associated with a pixel at coordinates (x,y).
$$C_1 = C,\quad C_2 = \text{down-sample}(C_1),\ \ldots,\ C_N = \text{down-sample}(C_{N-1})$$
$$\tilde{C}_i = \text{backward up-sample}(C_{i+1})$$
- where, $\tilde{C}_i$ represents the backward up-sampled cost volume generated from the coarser down-sampled cost volume at level (i+1).
$$I_1 = I,\quad I_2 = \text{down-sample}(I_1),\ \ldots,\ I_N = \text{down-sample}(I_{N-1})$$
$$\tilde{I}_i = \text{up-sample}(I_{i+1}),$$
- where, $\tilde{I}_i$ is an up-sampled reference image generated from the down-sampled reference image at level (i+1).
$$\Delta_i(x,y) = I_i(x,y) - \tilde{I}_i(x,y),$$
- where, $I_i(x,y)$ represents the reference image associated with at least one down-sampled cost volume, and
- $\tilde{I}_i(x,y)$ represents the reference image associated with the at least one backward up-sampled cost volume.
$$\hat{C}_i(x,y,d) = W_i(x,y) \cdot \tilde{C}_i(x,y,d) + \bigl(1 - W_i(x,y)\bigr) \cdot C_i(x,y,d)$$
- where, $\hat{C}_i(x,y,d)$ represents the aggregated cost volume for the at least one level of the plurality of levels.
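As a quick numeric illustration of the weighted average above, consider a single cost entry with up-sampled cost 2 and original cost 10. The numbers are invented for the example and do not come from the source.

```latex
\begin{aligned}
W_i = 0.9:&\quad \hat{C}_i = 0.9 \cdot 2 + (1 - 0.9) \cdot 10 = 1.8 + 1.0 = 2.8 \\
W_i = 0.1:&\quad \hat{C}_i = 0.1 \cdot 2 + (1 - 0.1) \cdot 10 = 0.2 + 9.0 = 9.2
\end{aligned}
```

A weight near 1 (textureless region) pulls the result toward the aggregated coarse cost, while a weight near 0 (detailed region) leaves the original cost almost untouched.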
$$C_i(x,y) = \tilde{W}_0 \cdot C_{i+1}(u_0,v_0) + \tilde{W}_1 \cdot C_{i+1}(u_1,v_1) + \tilde{W}_2 \cdot C_{i+1}(u_2,v_2) + \tilde{W}_3 \cdot C_{i+1}(u_3,v_3)$$
- where, $(u_0,v_0)$, $(u_1,v_1)$, $(u_2,v_2)$ and $(u_3,v_3)$ are low resolution grid coordinates, and
- $\tilde{W}_k(x,y)$ is a corresponding color parameter value between a current fine grid pixel and the neighboring coarse grid pixel $(u_k,v_k)$.
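The four-neighbour sum above can be sketched as a color-guided 2x up-sampling. The specifics are assumptions of this sketch: grayscale reference images scaled to [0, 1], an exponential color similarity controlled by a hypothetical `sigma`, weights normalized to sum to 1, and the four nearest coarse pixels chosen from pixel-centre geometry.

```python
import numpy as np

def backward_upsample(cost_coarse, image_fine, image_coarse, sigma=0.1):
    """Up-sample a (D, h, w) cost volume to (D, H, W) using reference-image colors.

    image_fine   : (H, W) reference image at the finer level
    image_coarse : (h, w) reference image at the coarser level
    """
    H, W = image_fine.shape
    h, w = image_coarse.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Upper-left coarse neighbour of each fine pixel, clipped at the borders
    v0 = np.clip(np.floor((ys - 0.5) / 2).astype(int), 0, h - 1)
    u0 = np.clip(np.floor((xs - 0.5) / 2).astype(int), 0, w - 1)
    v1 = np.minimum(v0 + 1, h - 1)
    u1 = np.minimum(u0 + 1, w - 1)
    num = np.zeros((cost_coarse.shape[0], H, W))
    den = np.zeros((H, W))
    for v, u in ((v0, u0), (v0, u1), (v1, u0), (v1, u1)):
        # Color similarity between the fine pixel and this coarse neighbour
        wk = np.exp(-np.abs(image_fine - image_coarse[v, u]) / sigma)
        num += wk * cost_coarse[:, v, u]
        den += wk
    return num / den  # normalized weighted sum over the four neighbours
```

Because the weights depend on color similarity, a fine pixel next to an edge draws its cost mostly from coarse pixels on its own side of the edge.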
$$\tilde{I}_i = \text{up-sample}(I_{i+1})$$
- where, $\tilde{I}_i$ is a backward up-sampled image at a level 'i', reconstructed by backward up-sampling of the down-sampled reference image at the (i+1)th level.
$$\Delta_i(x,y) = I_i(x,y) - \tilde{I}_i(x,y),$$ where
- $I_i(x,y)$ represents an original reference image (at i=1) in some embodiments, and a down-sampled reference image (at levels other than level i=1) associated with an ith level of the plurality of levels in some other embodiments,
- $\tilde{I}_i(x,y)$ represents a reconstructed (or backward up-sampled) reference image at the ith level, constructed by up-sampling the down-sampled reference image at the (i+1)th level, and
- $\Delta_i(x,y)$ represents the residual image at the ith level.
Claims (21)
$$\hat{C}_i(x,y,d) = W_i(x,y) \cdot \tilde{C}_i(x,y,d) + \bigl(1 - W_i(x,y)\bigr) \cdot C_i(x,y,d)$$
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1402608.2A GB2523149A (en) | 2014-02-14 | 2014-02-14 | Method, apparatus and computer program product for image-driven cost volume aggregation |
GB1402608.2 | 2014-02-14 | ||
PCT/FI2015/050079 WO2015121535A1 (en) | 2014-02-14 | 2015-02-09 | Method, apparatus and computer program product for image-driven cost volume aggregation |
Publications (2)
Publication Number | Publication Date |
---|---|
US20170178353A1 (en) | 2017-06-22 |
US9892522B2 (en) | 2018-02-13 |
Family
ID=50440153
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/116,819 Active 2035-04-18 US9892522B2 (en) | 2014-02-14 | 2015-02-09 | Method, apparatus and computer program product for image-driven cost volume aggregation |
Country Status (4)
Country | Link |
---|---|
US (1) | US9892522B2 (en) |
EP (1) | EP3105738B1 (en) |
GB (1) | GB2523149A (en) |
WO (1) | WO2015121535A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9626590B2 (en) * | 2015-09-18 | 2017-04-18 | Qualcomm Incorporated | Fast cost aggregation for dense stereo matching |
KR20170040571A (en) * | 2015-10-05 | 2017-04-13 | 한국전자통신연구원 | Apparatus and method for generating depth information |
US20170358101A1 (en) * | 2016-06-10 | 2017-12-14 | Apple Inc. | Optical Image Stabilization for Depth Sensing |
US10321112B2 (en) * | 2016-07-18 | 2019-06-11 | Samsung Electronics Co., Ltd. | Stereo matching system and method of operating thereof |
CN106931910B (en) * | 2017-03-24 | 2019-03-05 | 南京理工大学 | A kind of efficient acquiring three-dimensional images method based on multi-modal composite coding and epipolar-line constraint |
US11430134B2 (en) * | 2019-09-03 | 2022-08-30 | Nvidia Corporation | Hardware-based optical flow acceleration |
CN115797185B (en) * | 2023-02-08 | 2023-05-02 | 四川精伍轨道交通科技有限公司 | Coordinate conversion method based on image processing and complex sphere |
CN117354645B (en) * | 2023-12-04 | 2024-03-19 | 河北建投水务投资有限公司 | On-line inspection method for inlet and outlet water quality of water purification plant |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120008857A1 (en) | 2010-07-07 | 2012-01-12 | Electronics And Telecommunications Research Institute | Method of time-efficient stereo matching |
2014
- 2014-02-14 GB GB1402608.2A (GB2523149A): Withdrawn
2015
- 2015-02-09 EP EP15749667.0A (EP3105738B1): Active
- 2015-02-09 WO PCT/FI2015/050079 (WO2015121535A1): Application Filing
- 2015-02-09 US US15/116,819 (US9892522B2): Active
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10586345B2 (en) * | 2015-05-17 | 2020-03-10 | Inuitive Ltd. | Method for estimating aggregation results for generating three dimensional images |
US11562498B2 (en) | 2017-08-21 | 2023-01-24 | Adeia Imaging LLC | Systems and methods for hybrid depth regularization |
US10482618B2 (en) * | 2017-08-21 | 2019-11-19 | Fotonation Limited | Systems and methods for hybrid depth regularization |
US10818026B2 (en) * | 2017-08-21 | 2020-10-27 | Fotonation Limited | Systems and methods for hybrid depth regularization |
US20190057513A1 (en) * | 2017-08-21 | 2019-02-21 | Fotonation Cayman Limited | Systems and Methods for Hybrid Depth Regularization |
US11983893B2 (en) | 2017-08-21 | 2024-05-14 | Adeia Imaging Llc | Systems and methods for hybrid depth regularization |
US10839543B2 (en) * | 2019-02-26 | 2020-11-17 | Baidu Usa Llc | Systems and methods for depth estimation using convolutional spatial propagation networks |
US11270110B2 (en) | 2019-09-17 | 2022-03-08 | Boston Polarimetrics, Inc. | Systems and methods for surface modeling using polarization cues |
US11699273B2 (en) | 2019-09-17 | 2023-07-11 | Intrinsic Innovation Llc | Systems and methods for surface modeling using polarization cues |
US11525906B2 (en) | 2019-10-07 | 2022-12-13 | Intrinsic Innovation Llc | Systems and methods for augmentation of sensor systems and imaging systems with polarization |
US12099148B2 (en) | 2019-10-07 | 2024-09-24 | Intrinsic Innovation Llc | Systems and methods for surface normals sensing with polarization |
US11982775B2 (en) | 2019-10-07 | 2024-05-14 | Intrinsic Innovation Llc | Systems and methods for augmentation of sensor systems and imaging systems with polarization |
US11842495B2 (en) | 2019-11-30 | 2023-12-12 | Intrinsic Innovation Llc | Systems and methods for transparent object segmentation using polarization cues |
US11302012B2 (en) | 2019-11-30 | 2022-04-12 | Boston Polarimetrics, Inc. | Systems and methods for transparent object segmentation using polarization cues |
US11580667B2 (en) | 2020-01-29 | 2023-02-14 | Intrinsic Innovation Llc | Systems and methods for characterizing object pose detection and measurement systems |
US11797863B2 (en) | 2020-01-30 | 2023-10-24 | Intrinsic Innovation Llc | Systems and methods for synthesizing data for training statistical models on different imaging modalities including polarized images |
US11953700B2 (en) | 2020-05-27 | 2024-04-09 | Intrinsic Innovation Llc | Multi-aperture polarization optical systems using beam splitters |
US12069227B2 (en) | 2021-03-10 | 2024-08-20 | Intrinsic Innovation Llc | Multi-modal and multi-spectral stereo camera arrays |
US12020455B2 (en) | 2021-03-10 | 2024-06-25 | Intrinsic Innovation Llc | Systems and methods for high dynamic range image reconstruction |
US11290658B1 (en) | 2021-04-15 | 2022-03-29 | Boston Polarimetrics, Inc. | Systems and methods for camera exposure control |
US11954886B2 (en) | 2021-04-15 | 2024-04-09 | Intrinsic Innovation Llc | Systems and methods for six-degree of freedom pose estimation of deformable objects |
US11683594B2 (en) | 2021-04-15 | 2023-06-20 | Intrinsic Innovation Llc | Systems and methods for camera exposure control |
US12067746B2 (en) | 2021-05-07 | 2024-08-20 | Intrinsic Innovation Llc | Systems and methods for using computer vision to pick up small objects |
US11689813B2 (en) | 2021-07-01 | 2023-06-27 | Intrinsic Innovation Llc | Systems and methods for high dynamic range imaging using crossed polarizers |
Also Published As
Publication number | Publication date |
---|---|
GB201402608D0 (en) | 2014-04-02 |
WO2015121535A1 (en) | 2015-08-20 |
US20170178353A1 (en) | 2017-06-22 |
EP3105738A1 (en) | 2016-12-21 |
EP3105738A4 (en) | 2017-08-30 |
EP3105738B1 (en) | 2018-08-29 |
GB2523149A (en) | 2015-08-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9892522B2 (en) | | Method, apparatus and computer program product for image-driven cost volume aggregation |
US9542750B2 (en) | | Method, apparatus and computer program product for depth estimation of stereo images |
EP2947627B1 (en) | | Light field image depth estimation |
EP2916291B1 (en) | | Method, apparatus and computer program product for disparity map estimation of stereo images |
EP2874395A2 (en) | | Method, apparatus and computer program product for disparity estimation |
US9443130B2 (en) | | Method, apparatus and computer program product for object detection and segmentation |
US10015464B2 (en) | | Method, apparatus and computer program product for modifying illumination in an image |
EP2736011B1 (en) | | Method, apparatus and computer program product for generating super-resolved images |
US9478036B2 (en) | | Method, apparatus and computer program product for disparity estimation of plenoptic images |
US20170323433A1 (en) | | Method, apparatus and computer program product for generating super-resolved images |
US20170351932A1 (en) | | Method, apparatus and computer program product for blur estimation |
US9679220B2 (en) | | Method, apparatus and computer program product for disparity estimation in images |
EP2750391B1 (en) | | Method, apparatus and computer program product for processing of images |
EP2991036B1 (en) | | Method, apparatus and computer program product for disparity estimation of foreground objects in images |
US20130107008A1 (en) | | Method, apparatus and computer program product for capturing images |
US10097807B2 (en) | | Method, apparatus and computer program product for blending multimedia content |
US20160253790A1 (en) | | Method, apparatus and computer program product for reducing chromatic aberrations in deconvolved images |
WO2015055892A1 (en) | | Method, apparatus and computer program product for detection and correction of image defect |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: NOKIA CORPORATION, FINLAND. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: SMIRNOV, SERGEY; GOTCHEV, ATANAS; SIGNING DATES FROM 20140221 TO 20140225; REEL/FRAME: 039348/0578. Owner name: NOKIA TECHNOLOGIES OY, FINLAND. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: NOKIA CORPORATION; REEL/FRAME: 039348/0573. Effective date: 20150116 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |